Python | Perform Sentence Segmentation Using Spacy
Sentence segmentation is the process of deciding where sentences start and end in a text. In other words, we divide a paragraph into its individual sentences. In Python, we can implement this part of NLP using the spaCy library.
spaCy is a library for Natural Language Processing in Python.
To use this library in a Python program, we first need to install it.
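Commands to install the library and its English model:
pip install spacy
python -m spacy download en_core_web_sm
Here en_core_web_sm is spaCy's small English pipeline: "core" means it is a general-purpose pipeline, "web" means it was trained on web text, and "sm" means it is the small-sized version.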
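Before moving on, it can help to confirm that both installs succeeded. The snippet below is a minimal sketch that uses only the standard library; the helper name is_installed is a hypothetical name chosen for illustration, and it relies on the fact that the downloaded model is installed as a regular Python package named en_core_web_sm.

```python
from importlib.util import find_spec


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be imported.

    Hypothetical helper for illustration; it only looks the package up
    and does not actually import it.
    """
    return find_spec(package_name) is not None


# Check the library and the English model without loading either of them.
for name in ("spacy", "en_core_web_sm"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either name is reported as missing, rerun the corresponding install command in the same Python environment that runs your program.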
Example:
We have the following paragraph:
"I Love Coding. Beginner for Beginner helped me in this regard very much. I Love Beginner for Beginner."
It contains 3 sentences:
1. I Love Coding.
2. Beginner for Beginner helped me in this regard very much.
3. I Love Beginner for Beginner.
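In spaCy, sentence segmentation is exposed through the .sents property of a processed document (doc.sents). The value returned by .sents is a generator, so we can loop over it directly, but we need to convert it to a list if we want to access the sentences in random order, that is, by index.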
Code:
# import the spaCy library
import spacy

# load the small English pipeline
nlp = spacy.load("en_core_web_sm")

# process the text; in Python 3 every str is already Unicode, so no u prefix is needed
doc = nlp("I Love Coding. Beginner for Beginner helped me in this regard very much. I Love Beginner for Beginner.")

# print each sentence
for sent in doc.sents:
    print(sent)
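Output:
I Love Coding.
Beginner for Beginner helped me in this regard very much.
I Love Beginner for Beginner.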
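Now, what happens if we try to access doc.sents randomly, for example with doc.sents[1]? Because doc.sents is a generator, it does not support indexing, and Python raises a TypeError.
To overcome this error, we first need to convert the generator into a list using the list() function.
Code: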
# convert the generator returned by doc.sents into a list
doc1 = list(doc.sents)

# now any sentence can be accessed by its index
print(doc1[1])
Output:
Beginner for Beginner helped me in this regard very much.
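The difference between a generator and a list can also be seen without spaCy. The following is a minimal sketch in plain Python: fake_sents is a hypothetical stand-in for doc.sents (it is not part of spaCy) and simply yields the three example sentences one at a time.

```python
def fake_sents():
    """Hypothetical stand-in for doc.sents: yields one sentence at a time."""
    yield "I Love Coding."
    yield "Beginner for Beginner helped me in this regard very much."
    yield "I Love Beginner for Beginner."


def index_generator_directly():
    """Try to index the generator and report the error instead of crashing."""
    gen = fake_sents()
    try:
        return gen[1]  # generators do not support indexing
    except TypeError as exc:
        return f"TypeError: {exc}"


# Indexing the generator fails with a TypeError.
print(index_generator_directly())

# Converting the generator to a list makes indexing possible.
sentences = list(fake_sents())
print(sentences[1])
```

A generator produces its items lazily, which is why it can be looped over but not indexed. Calling list() consumes the generator once and stores every item, so any sentence can then be reached by its position.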