python - POS-Tagger is very slow

Reposted. Author: 太空狗. Last updated: 2023-10-29 21:23:53

I am using nltk to generate n-grams from sentences by first removing given stop words. However, nltk.pos_tag() is extremely slow on my CPU (Intel i7), taking up to 0.6 seconds per call.

Output:

['The first time I went, and was completely taken by the live jazz band and atmosphere, I ordered the Lobster Cobb Salad.']
0.620481014252
["It's simply the best meal in NYC."]
0.640982151031
['You cannot go wrong at the Red Eye Grill.']
0.644664049149

Code:

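import time

import nltk
from nltk import word_tokenize
from nltk.util import ngrams

# source, stop_words, and n are defined elsewhere in the program
for sentence in source:

    nltk_ngrams = None

    if stop_words is not None:
        # time only the tokenize + tag step
        start = time.time()
        sentence_pos = nltk.pos_tag(word_tokenize(sentence))
        print(time.time() - start)

        # keep the words whose POS tag is not listed in stop_words
        filtered_words = [word for (word, pos) in sentence_pos if pos not in stop_words]
    else:
        filtered_words = ngrams(sentence.split(), n)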

Is it really this slow, or am I doing something wrong?
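The stated goal, building n-grams from the words that survive filtering, can be sketched in plain Python. This is only an illustration: make_ngrams and the sample word list are hypothetical names and data, not part of the question's code or of nltk.

```python
def make_ngrams(words, n):
    """Return the n-grams (as tuples) over a list of words."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # zip over n staggered slices of the list; zip stops at the shortest one,
    # so a list shorter than n yields no n-grams.
    return list(zip(*(words[i:] for i in range(n))))


# Hypothetical result of the filtering step for one sentence.
filtered_words = ["best", "meal", "NYC"]
print(make_ngrams(filtered_words, 2))  # [('best', 'meal'), ('meal', 'NYC')]
```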

Best answer

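Use pos_tag_sents to tag multiple sentences in one call. In the session below, ten separate pos_tag calls take about ten times as long as one, while a single pos_tag_sents call over the same ten sentences takes about as long as one pos_tag call, which points to a fixed per-call cost rather than slow tagging: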

>>> import time
>>> from nltk.corpus import brown
>>> from nltk import pos_tag
>>> from nltk import pos_tag_sents
>>> sents = brown.sents()[:10]
>>> start = time.time(); pos_tag(sents[0]); print(time.time() - start)
0.934092998505
>>> start = time.time(); [pos_tag(s) for s in sents]; print(time.time() - start)
9.5061340332
>>> start = time.time(); pos_tag_sents(sents); print(time.time() - start)
0.939551115036
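Applied to the loop in the question, the same idea might look like the sketch below: tokenize every sentence first, tag them all with one pos_tag_sents call, then filter. The function name filter_by_pos and its tokenize / tag_batch parameters are assumptions made for illustration, and the small demo uses a stand-in tagger so that it runs without NLTK's model data.

```python
def filter_by_pos(sentences, excluded_tags, tokenize=None, tag_batch=None):
    """Tokenize all sentences, tag them in ONE batch call, and drop
    every word whose POS tag is in excluded_tags."""
    if tokenize is None or tag_batch is None:
        # Imported lazily so the function can be tried with stand-ins
        # when NLTK or its data files are not installed.
        from nltk import pos_tag_sents, word_tokenize
        tokenize = tokenize or word_tokenize
        tag_batch = tag_batch or pos_tag_sents

    tokenized = [tokenize(s) for s in sentences]
    tagged = tag_batch(tokenized)  # one call, so any per-call setup is paid once
    return [
        [word for word, pos in sent if pos not in excluded_tags]
        for sent in tagged
    ]


# Demo with a stand-in tagger (NOT a real POS tagger): it marks "the" as DT
# and everything else as NN, and records how many times it was called.
calls = []

def fake_tag_batch(token_lists):
    calls.append(len(token_lists))
    return [
        [(tok, "DT" if tok.lower() == "the" else "NN") for tok in toks]
        for toks in token_lists
    ]

result = filter_by_pos(
    ["The best meal", "the Red Eye Grill"],
    {"DT"},
    tokenize=str.split,
    tag_batch=fake_tag_batch,
)
print(result)  # [['best', 'meal'], ['Red', 'Eye', 'Grill']]
print(calls)   # [2] -> the tagger was invoked once, with both sentences
```

With NLTK and its tokenizer and tagger data installed, calling filter_by_pos(source, stop_words) without the last two arguments falls back to word_tokenize and pos_tag_sents.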

Regarding python - POS-Tagger is very slow, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33676526/
