
python - AttributeError: 'tuple' object has no attribute 'endswith' (Python NLTK Lemmatizer)

Reposted. Author: 行者123. Updated: 2023-12-01 04:02:12

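I am building a preprocessor for an NLP project, but the lemmatizer is not working as expected. I expected the code to lemmatize every word, but I get the error AttributeError: 'tuple' object has no attribute 'endswith'. Sorry if this is a silly mistake, but what am I doing wrong? I am using Python. Here is my code: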

from pymongo import MongoClient
from nltk import *
import nltk
lemma = WordNetLemmatizer()
client = MongoClient()
db = client.qa
main = db.main

while True:
    question = input('Ask a question: ').upper()
    question = re.sub('[^0-9A-Z\s]', '', question)
    question = word_tokenize(question)
    question = nltk.pos_tag(question)
    for each in question:
        lemma.lemmatize(each)
    print(question)

Update:

I have updated the code so that it runs, but now it does not actually lemmatize the words. Here is the updated code:

from pymongo import MongoClient
from nltk import *
lemma = WordNetLemmatizer()
client = MongoClient()
db = client.qa
main = db.main

while True:
    question = input('Ask a question: ').upper()
    question = re.sub('[^0-9A-Z\s]', '', question)
    question = word_tokenize(question)
    for each in question:
        lemma.lemmatize(each[0])
    print(question)

Best answer

TL;DR:

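import re

from pymongo import MongoClient
from nltk import word_tokenize, pos_tag, WordNetLemmatizer

wnl = WordNetLemmatizer()
client = MongoClient()
db = client.qa
main = db.main

while True:
    # WordNet entries are lowercase, so lowercase the input rather than uppercasing it.
    question = input('Ask a question: ').lower()
    question = re.sub(r'[^0-9a-z\s]', '', question)
    question = word_tokenize(question)  # list of strings
    question = pos_tag(question)        # list of (word, tag) tuples
    # each[0] is the word; keep the returned lemma instead of discarding it.
    lemmas = [wnl.lemmatize(each[0]) for each in question]
    print(lemmas)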
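
Both failures in the question seem to come down to what `each` is inside the loop. The sketch below uses plain Python values in place of NLTK output (the two sample lists are made up for illustration) to show why a `(word, tag)` tuple produces the error, and why `each[0]` only gives a whole word once the tokens have been tagged:

```python
# Hypothetical sample data standing in for NLTK output; NLTK itself is not needed.
tokens = ["two", "parts"]                    # shape of word_tokenize() output
tagged = [("two", "CD"), ("parts", "NNS")]   # shape of pos_tag() output

# A tuple has no string methods, so code that calls .endswith() on it fails.
try:
    tagged[1].endswith("s")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Indexing a tagged item gives the word; indexing a plain token gives one character.
word_from_tagged = tagged[1][0]
char_from_token = tokens[1][0]

print(error_message)
print(word_from_tagged, char_from_token)
```

This would explain the updated code in the question as well: without `pos_tag`, `each[0]` is only the first letter of each token, and the value returned by `lemmatize` is never stored.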

Explanation in the comments:

>>> from nltk import word_tokenize, pos_tag, WordNetLemmatizer
>>> wnl = WordNetLemmatizer()
>>> sent = "this is a two parts sentence, with some weird lemmas"
>>> word_tokenize(sent) # Returns a list of strings
['this', 'is', 'a', 'two', 'parts', 'sentence', ',', 'with', 'some', 'weird', 'lemmas']
>>> pos_tag(word_tokenize(sent)) # Returns a list of (word, pos) tuples
[('this', 'DT'), ('is', 'VBZ'), ('a', 'DT'), ('two', 'CD'), ('parts', 'NNS'), ('sentence', 'NN'), (',', ','), ('with', 'IN'), ('some', 'DT'), ('weird', 'JJ'), ('lemmas', 'NN')]
>>> pos_tag(word_tokenize(sent))[0]
('this', 'DT')
>>> pos_tag(word_tokenize(sent))[0][0]
'this'
>>> each = pos_tag(word_tokenize(sent))[0][0]
>>> each
'this'
>>> wnl.lemmatize(each)
'this'
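
The tags from `pos_tag` do not have to be thrown away: `WordNetLemmatizer.lemmatize` accepts an optional `pos` argument, which defaults to noun. A minimal sketch, assuming a simple prefix-based mapping from Penn Treebank tags to WordNet categories; `penn_to_wordnet`, `lemmatize_tagged`, and `demo` are hypothetical helper names, not part of NLTK:

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS letter (assumed mapping)."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("RB"):
        return "r"  # adverb
    return "n"      # fall back to noun, the lemmatizer's own default


def lemmatize_tagged(tagged, lemmatize):
    """Apply lemmatize(word, pos) to each (word, tag) pair and collect the results."""
    return [lemmatize(word.lower(), penn_to_wordnet(tag)) for word, tag in tagged]


def demo():
    # Requires NLTK plus its tokenizer, tagger, and WordNet data to be installed.
    from nltk import word_tokenize, pos_tag, WordNetLemmatizer

    wnl = WordNetLemmatizer()
    sent = "this is a two parts sentence, with some weird lemmas"
    return lemmatize_tagged(pos_tag(word_tokenize(sent)), wnl.lemmatize)
```

Calling `demo()` returns the lemmas for the example sentence. With verb tags passed through, a form such as `is` can be reduced to its base form, which the noun default would leave unchanged.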

For more on python - AttributeError: 'tuple' object has no attribute 'endswith' (Python NLTK Lemmatizer), see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/36285578/
