python - Multithreading in NLTK WordNetLemmatizer?

Reposted by: 行者123 · Updated: 2023-11-28 18:11:44


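I am trying to use multithreading to speed up processing. I use WordNetLemmatizer to lemmatize words, and SentiWordNet then uses those lemmas to compute the sentiment of the text. My sentiment analysis function, which uses WordNetLemmatizer, looks like this: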

import nltk
from nltk.corpus import sentiwordnet as swn

def SentimentA(doc, file_path):
    # Split the document into sentences, then tokenize and POS-tag each one.
    sentences = nltk.sent_tokenize(doc)
    stokens = [nltk.word_tokenize(sent) for sent in sentences]
    taggedlist = [nltk.pos_tag(stoken) for stoken in stokens]
    wnl = nltk.WordNetLemmatizer()
    score_list = []
    for idx, taggedsent in enumerate(taggedlist):
        score_list.append([])
        for word, tag in taggedsent:
            # Lemmatize with the default part of speech (noun).
            lemmatized = wnl.lemmatize(word)
            # Map the Penn Treebank tag to a WordNet part-of-speech tag.
            if tag.startswith('NN'):
                newtag = 'n'
            elif tag.startswith('JJ'):
                newtag = 'a'
            elif tag.startswith('V'):
                newtag = 'v'
            elif tag.startswith('R'):
                newtag = 'r'
            else:
                newtag = ''
            if newtag != '':
                synsets = list(swn.senti_synsets(lemmatized, newtag))
                if len(synsets) > 0:
                    # Average (positive - negative) over all senses of the word.
                    score = 0
                    for syn in synsets:
                        score += syn.pos_score() - syn.neg_score()
                    score_list[idx].append(score / len(synsets))
    # SentiCal is defined elsewhere in my code (not shown here).
    return SentiCal(score_list)

When I run 4 threads, the first 3 threads fail with the following error, while the last one runs fine.

AttributeError: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'

I have already tried importing the NLTK package locally, as described in this NLTK issue, and I have also tried the solution given on this page.

Best answer

Quick hack:

import nltk
from nltk.corpus import sentiwordnet as swn
# Do this first: touching the corpus once forces the
# LazyCorpusLoader to "materialize" before any threads start.
next(swn.all_senti_synsets())

# Your other code here.
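
To show where the hack fits in a threaded program, here is a minimal sketch that runs the warm-up in the main thread and only then starts the workers. The helper name `run_with_warmup` and its parameters are made up for illustration and are not part of NLTK; the commented usage also assumes a hypothetical list `docs` and a `file_path` value from your own code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_warmup(warmups, worker, items, max_workers=4):
    """Call each warm-up once in the calling thread, then map `worker`
    over `items` in a thread pool. Results keep the input order."""
    for warmup in warmups:
        warmup()  # e.g. touch a lazily loaded corpus before any thread exists
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))


# Hypothetical usage with NLTK (needs the WordNet and SentiWordNet data):
#
#   from nltk.corpus import sentiwordnet as swn
#   warmups = [lambda: next(swn.all_senti_synsets())]
#   scores = run_with_warmup(warmups,
#                            lambda doc: SentimentA(doc, file_path),
#                            docs)
```

The point of the design is only the ordering: every lazily loaded object is touched once, single-threaded, so the worker threads never see it in a half-loaded state.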

More details later... still typing.
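
In the meantime, a guess at why the hack helps. The attribute in the error, `_LazyCorpusLoader__args`, is what Python's name mangling produces for a private `__args` attribute of a class named `LazyCorpusLoader`. One plausible reading, which is an assumption here and not confirmed above, is that the lazy proxy turns itself into the real reader on first use, so a second thread that is still inside the loading code looks for `__args` on an object that no longer has it. The toy classes below (`LazyLoader` and `RealReader` are invented names, not NLTK's code) reproduce that kind of error without any threads:

```python
class RealReader:
    """Stand-in for a fully loaded corpus reader (hypothetical)."""

    def __init__(self, root):
        self.root = root


class LazyLoader:
    """Toy lazy proxy; NOT NLTK's actual implementation."""

    def __init__(self, *args):
        # Name mangling stores this as `_LazyLoader__args`.
        self.__args = args

    def load(self):
        args = self.__args  # raises if the swap below already happened
        real = RealReader(*args)
        # Turn this proxy into the real object in place.
        self.__dict__ = real.__dict__
        self.__class__ = real.__class__


proxy = LazyLoader("/corpora")
proxy.load()  # the "first thread" finishes loading

try:
    # A "second thread" that had already entered load() on the same object.
    LazyLoader.load(proxy)
except AttributeError as exc:
    message = str(exc)  # mentions 'RealReader' and '_LazyLoader__args'
```

If this reading is right, loading the corpus once before the threads start removes the window in which two threads can both be inside the loading step.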

Regarding "python - Multithreading in NLTK WordNetLemmatizer?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50611148/
