python - Problem with the word vector example in spaCy

Reposted. Author: 太空宇宙. Last updated: 2023-11-03 13:35:33

from spacy.en import English
from numpy import dot
from numpy.linalg import norm

parser = English()

# you can access known words from the parser's vocabulary
nasa = parser.vocab['NASA']

# cosine similarity
cosine = lambda v1, v2: dot(v1, v2) / (norm(v1) * norm(v2))

# gather all known words, take only the lowercased versions
allWords = list({w for w in parser.vocab if w.has_repvec and w.orth_.islower() and w.lower_ != "nasa"})

# sort by similarity to NASA
allWords.sort(key=lambda w: cosine(w.repvec, nasa.repvec))
allWords.reverse()
print("Top 10 most similar words to NASA:")
for word in allWords[:10]:
    print(word.orth_)

I am trying to run the example above, but I get the following error:

Traceback (most recent call last):
  File "C:\Users\bulusu.kiran\Documents\WORK\nlp\wordVectors1.py", line 8, in <module>
    nasa = parser.vocab['NASA']
  File "spacy/vocab.pyx", line 330, in spacy.vocab.Vocab.__getitem__ (spacy/vocab.cpp:7708)
    orth = id_or_string
TypeError: an integer is required

The example is taken from: Intro to NLP with spaCy

What is causing this error?

Best answer

Which version of Python are you using? This may be the result of a Unicode error; I got it working in Python 2.7 by replacing

nasa = parser.vocab['NASA']

with

nasa = parser.vocab[u'NASA']
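
The traceback suggests that a non-Unicode key fell through to a branch that expects an integer ID, which would explain why the u prefix helps on Python 2.7. As a minimal sketch of the same idea, the helper below (ensure_text is a hypothetical name, not part of spaCy) turns a byte string into a text string before it is used as a lookup key. The UTF-8 default is an assumption.

```python
def ensure_text(value, encoding="utf-8"):
    """Return `value` as a text (Unicode) string, decoding bytes if needed.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    # On Python 2, a plain 'NASA' literal is a byte string (bytes is an alias of str there).
    # On Python 3, 'NASA' is already text and is returned unchanged.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


key = ensure_text(b"NASA")
print(type(key).__name__, key)
```

With a helper like this, the lookup could be written as parser.vocab[ensure_text('NASA')] and would not depend on remembering the u prefix.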

Then you will get this error:

AttributeError: 'spacy.lexeme.Lexeme' object has no attribute 'has_repvec'

There is a similar issue on the spaCy repo, but both problems can be fixed by replacing has_repvec with has_vector and repvec with vector. I will also comment on that GitHub thread.

The complete, updated code I used:
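
If the same script has to run against both the older and the newer attribute names, one option is to probe for whichever pair exists. This is only a sketch: get_vector, OldLexeme and NewLexeme are hypothetical names, and the two classes are stand-ins so the example runs without spaCy installed.

```python
def get_vector(lexeme):
    """Return the word vector of `lexeme`, or None if it has none.

    Tries the newer attribute names first, then the older ones
    used in the original example.
    """
    for has_name, vec_name in (("has_vector", "vector"), ("has_repvec", "repvec")):
        # getattr with a default avoids the AttributeError shown above.
        if getattr(lexeme, has_name, False):
            return getattr(lexeme, vec_name)
    return None


class OldLexeme(object):
    """Stand-in for a lexeme exposing the older names (illustration only)."""
    has_repvec = True
    repvec = [0.1, 0.2]


class NewLexeme(object):
    """Stand-in for a lexeme exposing the newer names (illustration only)."""
    has_vector = True
    vector = [0.3, 0.4]


print(get_vector(OldLexeme()), get_vector(NewLexeme()))
```

An object that has neither pair of attributes, or whose flag is false, yields None, so callers can filter on that instead of on has_vector directly.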

import spacy

from numpy import dot
from numpy.linalg import norm

parser = spacy.load('en')
nasa = parser.vocab[u'NASA']

# cosine similarity
cosine = lambda v1, v2: dot(v1, v2) / (norm(v1) * norm(v2))

# gather all known words, take only the lowercased versions
allWords = list({w for w in parser.vocab if w.has_vector and w.orth_.islower() and w.lower_ != "nasa"})

# sort by similarity to NASA
allWords.sort(key=lambda w: cosine(w.vector, nasa.vector))
allWords.reverse()
print("Top 10 most similar words to NASA:")
for word in allWords[:10]:
    print(word.orth_)
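
The code above sorts the entire vocabulary and then keeps only ten entries. As a sketch of an alternative, heapq.nlargest from the standard library can select the top k without a full sort. The example uses pure Python and made-up two-dimensional vectors so that it runs without spaCy or NumPy; top_k_similar and the toy data are hypothetical, and every vector is assumed to be non-zero.

```python
import heapq
import math


def cosine(v1, v2):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return dot / (norm1 * norm2)


def top_k_similar(query, candidates, k=10):
    """Return the k (word, vector) pairs most similar to `query`, best first."""
    return heapq.nlargest(k, candidates, key=lambda item: cosine(item[1], query))


# Toy vectors for illustration only; real word vectors have many more dimensions.
candidates = [("space", [1.0, 0.1]), ("banana", [0.0, 1.0]), ("rocket", [0.9, 0.3])]
query = [1.0, 0.2]
top_words = [word for word, _ in top_k_similar(query, candidates, k=2)]
print(top_words)
```

With spaCy, the candidates could be built as (w.orth_, w.vector) pairs from the same filter used above, with nasa.vector as the query.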

Hope this helps!

Regarding "python - Problem with the word vector example in spaCy", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40466285/