
python - word_tokenize TypeError: expected string or buffer

Reposted · Author: 太空宇宙 · Updated: 2023-11-04 07:36:33


I get the following error when calling word_tokenize:

File "C:\Python34\lib\site-packages\nltk\tokenize\punkt.py", line 1322,
in _slices_from_text for match in
self._lang_vars.period_context_re().finditer(text):
TypeError: expected string or buffer
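The traceback shows the failure happening inside the regex engine (`finditer(text)`), which only accepts strings. The same class of error can be reproduced without NLTK; in this sketch, `io.StringIO` stands in for the file object that `open()` returns:

```python
import io
import re

# A file-like object, similar to what open() returns -- it is not a str.
f = io.StringIO("some text")

try:
    # punkt.py ultimately calls finditer(text) on its input;
    # passing a file object instead of a string raises TypeError.
    list(re.compile(r"\w+").finditer(f))
except TypeError as exc:
    print("TypeError:", exc)
```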

I have a large text file (1500.txt) from which I want to remove stop words. My code is as follows:

from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

with open('E:\\Book\\1500.txt', "r", encoding='ISO-8859-1') as File_1500:
    stop_words = set(stopwords.words("english"))
    words = word_tokenize(File_1500)
    filtered_sentence = [w for w in words if not w in stop_words]
    print(filtered_sentence)
