The NLTK library makes it easy to split text into sentences and words. A common approach is shown below:
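import nltk
from nltk.tokenize import sent_tokenize, word_tokenize

# One-time download of the Punkt sentence tokenizer models
# (newer NLTK releases may ask for "punkt_tab" instead).
nltk.download("punkt")

text = "Hello, my name is Alice. How are you doing today?"

# Split the text into sentences.
sentences = sent_tokenize(text)
for sentence in sentences:
    print(sentence)

# Split each sentence into word and punctuation tokens.
for sentence in sentences:
    words = word_tokenize(sentence)
    for word in words:
        print(word)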
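With this approach, you can split text into sentences and tokens and then process them further. NLTK also provides other ways to split text; see the official NLTK documentation for details.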
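As an illustration of those other options, here is a minimal sketch using three tokenizers from `nltk.tokenize` that need no downloaded model data: `RegexpTokenizer`, `WhitespaceTokenizer`, and `wordpunct_tokenize`. The sample text is reused from above, and the variable names are only illustrative.

```python
from nltk.tokenize import RegexpTokenizer, WhitespaceTokenizer, wordpunct_tokenize

text = "Hello, my name is Alice. How are you doing today?"

# Keep only runs of word characters, so punctuation is dropped entirely.
word_only = RegexpTokenizer(r"\w+").tokenize(text)

# Split on whitespace only, so punctuation stays attached to the words.
by_space = WhitespaceTokenizer().tokenize(text)

# Split into runs of word characters and runs of punctuation.
word_punct = wordpunct_tokenize(text)

print(word_only)
print(by_space)
print(word_punct)
```

Which one fits depends on the task: the regular-expression version suits simple word counts, whitespace splitting preserves the original spelling of each chunk, and `wordpunct_tokenize` keeps punctuation as separate tokens without requiring a trained model.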