
python - Word sense disambiguation with WordNet. How to select words related to the same meaning?

Reposted. Author: 行者123. Updated: 2023-11-30 22:38:14

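I'm using WordNet and NLTK for word sense disambiguation. I'm interested in all words related to sound. I have a list of such words, and "roll" is one of them. I then check whether any of my sentences contains the word (I also check the part of speech). If it does, I want to keep only the sentences where the word is used in its sound-related sense. In the example below, that would be the second sentence. My current idea is to keep only the senses whose definition contains the word "sound", such as "the sound of a drum (especially a snare drum) beaten rapidly and continuously". But I suspect there is a more elegant way. Any ideas would be much appreciated!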

from nltk.corpus import wordnet as wn
from nltk.tokenize import word_tokenize
from nltk.wsd import lesk

# Each sample is (sentence, WordNet POS tag that restricts the Lesk lookup)
samples = [('The van rolled along the highway.', 'n'),
           ('The thunder rolled and the lightning striked.', 'n')]

word = 'roll'
for sentence, pos_tag in samples:
    # lesk() picks the synset of `word` that best matches the sentence context
    word_syn = lesk(word_tokenize(sentence.lower()), word, pos_tag)
    print('Sentence:', sentence)
    print('Word synset:', word_syn)
    print('Corresponding definition:', word_syn.definition())

Output:

Sentence: The van rolled along the highway.
Word synset: Synset('scroll.n.02')
Corresponding definition: a document that can be rolled up (as for storage)
Sentence: The thunder rolled and the lightning striked.
Word synset: Synset('paradiddle.n.01')
Corresponding definition: the sound of a drum (especially a snare drum) beaten rapidly and continuously
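
The definition-based filter described in the question can be sketched roughly as follows. This is only an illustration, not code from the original post: `definition_mentions` is a hypothetical helper, and it assumes nothing more than that the object passed in has a `definition()` method, as NLTK synsets do.

```python
import re


def definition_mentions(synset, keyword):
    """Return True if `keyword` appears as a whole word in the synset's gloss."""
    # Split the gloss into lowercase alphabetic tokens so that "sound"
    # does not match inside longer words such as "soundly".
    tokens = re.findall(r"[a-z]+", synset.definition().lower())
    return keyword.lower() in tokens


# Hypothetical usage with NLTK (requires the WordNet corpus to be installed):
#   from nltk.corpus import wordnet as wn
#   definition_mentions(wn.synset('paradiddle.n.01'), 'sound')
```

A filter like this depends entirely on how each gloss happens to be worded, which is probably why it feels inelegant: a sound-related sense whose definition never uses the word "sound" would be missed.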

Best answer

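You can use WordNet hypernyms (synsets with a more general meaning). My first idea is to walk upward from the current synset (using synset.hypernyms()) and keep checking whether a "sound" synset turns up. I stop when I hit the root, which has no hypernyms (i.e. synset.hypernyms() returns an empty list).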
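
A minimal sketch of that upward walk might look like the following. The function name `has_hypernym` is made up for illustration, and the code assumes only that each object exposes `name()` and `hypernyms()`, as NLTK synsets do. It follows every hypernym rather than just the first one, since a synset can have more than one parent.

```python
from collections import deque


def has_hypernym(synset, target_name):
    """Return True if `synset` is, or has as an ancestor, the synset `target_name`."""
    queue = deque([synset])
    seen = set()
    while queue:
        current = queue.popleft()
        name = current.name()
        if name in seen:
            continue  # already visited via another hypernym path
        seen.add(name)
        if name == target_name:
            return True
        # At a root, hypernyms() is empty, so nothing more is enqueued.
        queue.extend(current.hypernyms())
    return False


# Hypothetical usage with NLTK (requires the WordNet corpus to be installed):
#   from nltk.corpus import wordnet as wn
#   has_hypernym(wn.synset('paradiddle.n.01'), 'sound.n.04')
```

NLTK synsets also provide `closure()` and `hypernym_paths()`, which could replace the explicit queue; the loop is spelled out here only to mirror the description above.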

For your two examples, walking up the hypernyms this way produces the following synset sequences:

Sentence:The van rolled along the highway .
Word synset:Synset('scroll.n.02')
[Synset('manuscript.n.02')]
[Synset('autograph.n.01')]
[Synset('writing.n.02')]
[Synset('written_communication.n.01')]
[Synset('communication.n.02')]
[Synset('abstraction.n.06')]
[Synset('entity.n.01')]

Sentence:The thunder rolled and the lightning striked .
Word synset:Synset('paradiddle.n.01')
[Synset('sound.n.04')]
[Synset('happening.n.01')]
[Synset('event.n.01')]
[Synset('psychological_feature.n.01')]
[Synset('abstraction.n.06')]
[Synset('entity.n.01')]

So one of the synsets you may want to look for is sound.n.04. There may be others, though; I think you could try more examples and build up a list.
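
One way to work with such a list is sketched below. `SOUND_SYNSETS`, `ancestor_names`, and `matched_targets` are hypothetical names, and the set contains only the one synset named in this answer; anything else you add to it is your own choice. As before, the code assumes objects with `name()` and `hypernyms()` methods, as NLTK synsets have.

```python
# Assumed starting list: only sound.n.04 is named in the answer.
SOUND_SYNSETS = {'sound.n.04'}


def ancestor_names(synset):
    """Collect the names of the synset itself and all of its hypernym ancestors."""
    names = set()
    stack = [synset]
    while stack:
        current = stack.pop()
        if current.name() not in names:
            names.add(current.name())
            stack.extend(current.hypernyms())
    return names


def matched_targets(synset, targets=SOUND_SYNSETS):
    """Return the subset of `targets` found on the synset's hypernym paths."""
    return ancestor_names(synset) & set(targets)


# Hypothetical usage with NLTK (requires the WordNet corpus to be installed):
#   from nltk.corpus import wordnet as wn
#   if matched_targets(wn.synset('paradiddle.n.01')):
#       print('sound-related sense')
```

Returning the matched names instead of a plain boolean makes it easier to see which entry in the list was responsible while you are still tuning it.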

Regarding "python - Word sense disambiguation with WordNet. How to select words related to the same meaning?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43656078/
