
wordnet - Maximum score of WordNet-based similarity measures

Reposted. Author: 行者123. Updated: 2023-12-02 17:02:58

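Some similarity scores lie between 0 and 1, such as shortest path and WuP. The similarity between car and automobile is therefore 1 under those measures, but other measures such as LCh give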

lch( car, automobile ) = 3.6889

I would like to know the maximum score of these measures. Is 3.6889 the maximum? Does that mean LCH scores always lie between 0 and 3.6889?

I am adding the following measures as well:

jcn( car, automobile ) = 12876699.5
res( car, automobile ) = 9.3679
lesk( car, automobile ) = 9519

Best answer

It seems that 3.6375861597263857 is the maximum value of lch_similarity (I could not reproduce 3.6889...). According to the documentation, lch_similarity has the following properties:

Leacock Chodorow Similarity:
Return a score denoting how similar two word senses are, based on the
shortest path that connects the senses (as above) and the maximum depth
of the taxonomy in which the senses occur. The relationship is given as
-log(p/2d) where p is the shortest path length and d is the taxonomy
depth.
...
:return: A score denoting the similarity of the two ``Synset`` objects,
normally greater than 0. None is returned if no connecting path
could be found. If a ``Synset`` is compared with itself, the
maximum score is returned, which varies depending on the taxonomy
depth.

Given that rock_hind.n.01 sits at the deepest level (19) of the WordNet taxonomy and change.n.06 at the shallowest (2), we can try synsets at different depths:

>>> from nltk.corpus import wordnet as wn
>>> rock = wn.synset('rock_hind.n.01')
>>> change = wn.synset('change.n.06')
>>> rock.lch_similarity(rock)
3.6375861597263857
>>> change.lch_similarity(change)
3.6375861597263857
>>> change.lch_similarity(rock)
0.7472144018302211
>>> rock.lch_similarity(change)
0.7472144018302211
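
The two values above can be reproduced from the documented formula alone. The sketch below assumes that p counts the nodes on the path (edge distance plus one, so a synset compared with itself has p = 1) and that the noun taxonomy depth d is 19, as stated in this answer. `lch_score` is a hypothetical helper, not an NLTK function, and the 17-edge distance is inferred from the score rather than read from WordNet.

```python
import math

def lch_score(edge_distance, taxonomy_depth):
    """Return -log(p / 2d), assuming p is the edge distance plus one."""
    p = edge_distance + 1
    return -math.log(p / (2.0 * taxonomy_depth))

print(lch_score(0, 19))   # identical synsets, d = 19
print(lch_score(17, 19))  # synsets assumed to be 17 edges apart, d = 19
print(lch_score(0, 20))   # identical synsets, if the depth were counted as 20
```

Under these assumptions the first call gives log(38) ≈ 3.6376 and the second gives log(38/18) ≈ 0.7472, matching the session above. With d = 20 the maximum becomes log(40) ≈ 3.6889, the value quoted in the question, which suggests (but does not prove) that the tool used there counts the taxonomy depth differently. Either way, the LCH score is bounded above by log(2d).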

Similar experiments can be run for the other measures, where the range appears to be considerably larger:

>>> from nltk.corpus import wordnet_ic, genesis
>>> brown_ic = wordnet_ic.ic('ic-brown.dat')
>>> semcor_ic = wordnet_ic.ic('ic-semcor.dat')
>>> genesis_ic = wn.ic(genesis, False, 0.0)
>>> rock.res_similarity(rock, brown_ic) # res_similarity, brown
1e+300
>>> rock.res_similarity(change, brown_ic)
-0.0
>>> rock.res_similarity(rock, semcor_ic) # res_similarity, semcor
1e+300
>>> rock.res_similarity(change, semcor_ic)
-0.0
>>> rock.res_similarity(rock, genesis_ic) # res_similarity, genesis
1e+300
>>> rock.res_similarity(change, genesis_ic)
-0.08306855877006339
>>> change.res_similarity(rock, genesis_ic)
-0.08306855877006339
>>> rock.jcn_similarity(rock, brown_ic) # jcn, brown - results are identical with semcor and genesis
1e+300
>>> rock.jcn_similarity(change, brown_ic)
1e-300
>>> change.jcn_similarity(rock, brown_ic)
1e-300
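
The 1e+300 and 1e-300 values look like sentinels rather than computed scores. The sketch below shows one way such numbers can arise. It assumes that information content is -log(count / total), that a zero corpus count is mapped to a large constant, that the Resnik score is the information content of the least common subsumer, and that the Jiang-Conrath score is 1 / (IC(a) + IC(b) - 2 * IC(lcs)). All names and counts here are hypothetical, and this is not NLTK's actual code.

```python
import math

INF = 1e300  # assumed stand-in for "no finite value", matching the 1e+300 printed above

def information_content(count, total):
    """IC = -log(count / total); a zero count has no finite IC, so use the sentinel."""
    if count == 0:
        return INF
    return -math.log(count / total)

def res_score(ic_lcs):
    """Resnik: the information content of the least common subsumer."""
    return ic_lcs

def jcn_score(ic_a, ic_b, ic_lcs):
    """Jiang-Conrath: inverse of the IC distance; a zero distance maps to the sentinel."""
    diff = ic_a + ic_b - 2 * ic_lcs
    if diff == 0:
        return INF
    return 1 / diff

# Hypothetical counts: a synset never seen in the corpus, and a root seen in every count.
ic_rare = information_content(0, 1000)     # sentinel, since the count is zero
ic_root = information_content(1000, 1000)  # -log(1), i.e. zero
print(res_score(ic_rare))                  # synset compared with itself
print(res_score(ic_root))                  # only the root is shared
print(jcn_score(ic_rare, 5.0, ic_root))    # sentinel in the denominator
```

Read this way, the results above would indicate that res and jcn have no fixed upper bound: the score for identical or unseen synsets is capped by a constant, and the largest finite values depend on the corpus counts in the information-content file.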

A similar question about the maximum score of WordNet-based similarity measures can be found on Stack Overflow: https://stackoverflow.com/questions/20112828/
