
python - Lemmatizing words inside lists in a pandas DataFrame

Reposted. Author: 行者123. Updated: 2023-12-01 01:42:59

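After tokenization, I have a pandas DataFrame that looks like the one below. I want to apply the nltk lemmatizer to this DataFrame. What I tried is shown below. I get the error "if form in excepts: TypeError: unhashable type: 'list'". How do I implement the lemmatizer correctly here?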

Also note that the fifth DataFrame cell contains an empty list. How can I remove such lists from this DataFrame?

 [[ive, searching, right, words, thank, breather], [i, promise, wont, take, help, granted, fulfil, promise], [you, wonderful, blessing, times]]                     

[[free, entry, 2, wkly, comp, win, fa, cup, final, tkts, 21st, may, 2005], [text, fa, 87121, receive, entry, questionstd, txt, ratetcs, apply, 08452810075over18s]]

[[nah, dont, think, goes, usf, lives, around, though]]

[[even, brother, like, speak, me], [they, treat, like, aids, patent]]

[[i, date, sunday, will], []]

The lemmatizer function I tried

def lemmatize(fullCorpus):
    lemmatizer = nltk.stem.WordNetLemmatizer()
    lemmatized = fullCorpus['tokenized'].apply(lambda row: list(map([lemmatizer.lemmatize(y) for y in row])))
    return lemmatized

Best answer

You can try the following:

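import nltk

def lemmatize(fullCorpus):
    lemmatizer = nltk.stem.WordNetLemmatizer()
    # Each cell holds a list of token lists, so lemmatize token by token
    # instead of passing a whole list to lemmatizer.lemmatize.
    lemmatized = fullCorpus['tokenized'].apply(
        lambda row: [[lemmatizer.lemmatize(token) for token in y] for y in row])
    return lemmatized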
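
The question also asks how to remove empty inner lists, which the answer does not cover. A minimal sketch of one way to do it, assuming each cell has the shape [[token, ...], [token, ...]] as in the sample rows: filter out empty sentences while lemmatizing. The names lemmatize_cell and toy_lemmatize are hypothetical helpers introduced for illustration. toy_lemmatize is only a stand-in so the example runs without NLTK data, and it is not how WordNetLemmatizer works.

```python
from typing import Callable, List

Sentence = List[str]


def lemmatize_cell(cell: List[Sentence],
                   lemmatize_word: Callable[[str], str]) -> List[Sentence]:
    """Lemmatize every token in a cell shaped like [[tok, ...], [tok, ...]].

    Empty inner lists are skipped, so [['i', 'date'], []] keeps one sentence.
    """
    return [[lemmatize_word(tok) for tok in sentence]
            for sentence in cell if sentence]


def toy_lemmatize(word: str) -> str:
    """Hypothetical stand-in for a real lemmatizer: strips a trailing 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


# Two cells copied from the sample data in the question.
cells = [
    [["even", "brother", "like", "speak", "me"],
     ["they", "treat", "like", "aids", "patent"]],
    [["i", "date", "sunday", "will"], []],
]
cleaned = [lemmatize_cell(cell, toy_lemmatize) for cell in cells]
```

With pandas and NLTK available, the same helper could presumably be used as fullCorpus['tokenized'].apply(lambda cell: lemmatize_cell(cell, lemmatizer.lemmatize)), where lemmatizer is a WordNetLemmatizer created once outside the lambda. Rows whose cell ends up completely empty could then be dropped with fullCorpus[fullCorpus['tokenized'].map(len) > 0].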

Regarding "python - Lemmatizing words inside lists in a pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51689103/
