
python - How to find the shortest dependency path between two words in Python?

Reposted. Author: 太空狗. Updated: 2023-10-29 17:15:39

I am trying to find the dependency path between two words in Python, given a dependency tree.

For the sentence

Robots in popular culture are there to remind us of the awesomeness of unbound human agency.

I used practnlptools ( https://github.com/biplab-iitb/practNLPTools ) to get the following dependency parse:

nsubj(are-5, Robots-1)
xsubj(remind-8, Robots-1)
amod(culture-4, popular-3)
prep_in(Robots-1, culture-4)
root(ROOT-0, are-5)
advmod(are-5, there-6)
aux(remind-8, to-7)
xcomp(are-5, remind-8)
dobj(remind-8, us-9)
det(awesomeness-12, the-11)
prep_of(remind-8, awesomeness-12)
amod(agency-16, unbound-14)
amod(agency-16, human-15)
prep_of(awesomeness-12, agency-16)

It can also be visualized as a dependency tree (image taken from https://demos.explosion.ai/displacy/; not reproduced here).

The path length between "robots" and "are" is 1, and the path length between "robots" and "awesomeness" is 4.

My question: given the dependency parse above, how can I get the dependency path, or the dependency path length, between two words?

From what I have found so far, would nltk's ParentedTree help?

Thanks!

Best answer

HugoMailhot's answer is great. I'll write something similar for spacy users who want to find the shortest dependency path between two words (HugoMailhot's answer relies on practNLPTools).

Take the sentence:

Robots in popular culture are there to remind us of the awesomeness of unbound human agency.

It has the following dependency tree (image not reproduced here).

Here is the code for finding the shortest dependency path between two words:

import networkx as nx
import spacy

# Written against spaCy 1.x. In spaCy 3 the 'en' shortcut and the parse=True
# argument are no longer accepted; load a named model such as 'en_core_web_sm'
# and call nlp(text) instead.
nlp = spacy.load('en')

# https://spacy.io/docs/usage/processing-text
document = nlp(u'Robots in popular culture are there to remind us of the awesomeness of unbound human agency.', parse=True)

print('document: {0}'.format(document))

# Load spacy's dependency tree into a networkx graph.
# Each node is named '<lowercased word>-<token index>' so repeated words stay distinct.
edges = []
for token in document:
    # FYI https://spacy.io/docs/api/token
    for child in token.children:
        edges.append(('{0}-{1}'.format(token.lower_, token.i),
                      '{0}-{1}'.format(child.lower_, child.i)))

# Undirected graph, so paths can go both up and down the tree
graph = nx.Graph(edges)

# https://networkx.github.io/documentation/networkx-1.10/reference/algorithms.shortest_paths.html
print(nx.shortest_path_length(graph, source='robots-0', target='awesomeness-11'))
print(nx.shortest_path(graph, source='robots-0', target='awesomeness-11'))
print(nx.shortest_path(graph, source='robots-0', target='agency-15'))

Output:

4
['robots-0', 'are-4', 'remind-7', 'of-9', 'awesomeness-11']
['robots-0', 'are-4', 'remind-7', 'of-9', 'awesomeness-11', 'of-12', 'agency-15']
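The same graph idea can also be applied directly to the relation strings listed in the question, without spacy. The following is only a minimal sketch under some assumptions: the regular expression, the helper names `build_graph` and `shortest_path`, and the choice to lowercase node names are illustrative, and a plain breadth-first search stands in for `nx.shortest_path`. Note that these relations are collapsed (prepositions are folded into labels such as `prep_of`, and there is an extra `xsubj` edge), so the path from "robots" to "awesomeness" has length 2 here rather than the 4 seen in the tree-shaped spacy parse.

```python
import re
from collections import defaultdict, deque

# Relation strings copied from the practNLPTools output in the question.
RELATIONS = """
nsubj(are-5, Robots-1)
xsubj(remind-8, Robots-1)
amod(culture-4, popular-3)
prep_in(Robots-1, culture-4)
root(ROOT-0, are-5)
advmod(are-5, there-6)
aux(remind-8, to-7)
xcomp(are-5, remind-8)
dobj(remind-8, us-9)
det(awesomeness-12, the-11)
prep_of(remind-8, awesomeness-12)
amod(agency-16, unbound-14)
amod(agency-16, human-15)
prep_of(awesomeness-12, agency-16)
"""

# Assumed line shape: label(head-index, dependent-index)
REL_PATTERN = re.compile(r"^(\w+)\((\S+), (\S+)\)$")


def build_graph(lines):
    """Build an undirected adjacency map from relation strings."""
    graph = defaultdict(set)
    for line in lines:
        match = REL_PATTERN.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognised lines
        label, head, dependent = match.groups()
        if label == "root":
            continue  # ROOT-0 is an artificial node, not a word
        head, dependent = head.lower(), dependent.lower()
        graph[head].add(dependent)
        graph[dependent].add(head)
    return graph


def shortest_path(graph, source, target):
    """Breadth-first search; returns the node list or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


graph = build_graph(RELATIONS.splitlines())
path = shortest_path(graph, "robots-1", "awesomeness-12")
print(path)           # ['robots-1', 'remind-8', 'awesomeness-12']
print(len(path) - 1)  # 2
```

The path length is the number of edges, hence `len(path) - 1`. Dropping the `xsubj` line from `RELATIONS` would force the path through "are-5" instead.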

To install spacy and networkx:

sudo pip install networkx 
sudo pip install spacy
sudo python -m spacy.en.download parser # will take 0.5 GB
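networkx is only needed for the graph search. Because a spacy parse is a tree in which every token has exactly one head, the path between two tokens can also be found by walking head pointers up to their lowest common ancestor. The sketch below illustrates this with a hand-written `HEADS` map; the map contains only the edges visible in the output shown earlier, with "are-4" assumed to be the root, and the function names are hypothetical. With spacy itself, such a map could presumably be built from `token.head`, since the root token is its own head.

```python
def path_to_root(heads, node):
    """Return [node, head(node), ..., root]; the root maps to itself."""
    path = [node]
    while heads[node] != node:
        node = heads[node]
        path.append(node)
    return path


def tree_path(heads, source, target):
    """Path from source to target through their lowest common ancestor."""
    up_from_source = path_to_root(heads, source)
    up_from_target = path_to_root(heads, target)
    position = {node: i for i, node in enumerate(up_from_source)}
    for j, node in enumerate(up_from_target):
        if node in position:
            # Climb from source to the common ancestor, then descend to target.
            climb = up_from_source[:position[node] + 1]
            descend = up_from_target[:j][::-1]
            return climb + descend
    return None  # no common ancestor: the nodes are in different trees


# Assumption: partial head map inferred from the paths printed in the output;
# the head directions and the root "are-4" are not confirmed by the source.
HEADS = {
    "are-4": "are-4",
    "robots-0": "are-4",
    "remind-7": "are-4",
    "of-9": "remind-7",
    "awesomeness-11": "of-9",
    "of-12": "awesomeness-11",
    "agency-15": "of-12",
}

path = tree_path(HEADS, "robots-0", "awesomeness-11")
print(path)
print(len(path) - 1)  # number of edges on the path
```

This approach only applies to tree-shaped parses. For collapsed relations with extra edges, a general graph search is still needed.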

Some benchmarks for spacy's dependency parsing: https://spacy.io/docs/api/


Regarding "python - How to find the shortest dependency path between two words in Python?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32835291/
