
python - How to get directed entities in relation extraction?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:40:08

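I have been working on relation extraction for a week. What I need, however, is the direction of the relation between two entities, for example: Company_x was bought by Company_y. The model should therefore predict something like Company_y -> bought -> Company_X. Do you know of any models that would help with this?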

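Best answer

Passive voice is usually a good indicator of the direction of a relation.

You can extract verb-initial patterns from the context between the two entities and then check whether passive voice is present.

Some simple proof-of-concept code (this could actually be simpler with NLTK's RegexpParser):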

from nltk import pos_tag
from nltk import word_tokenize
from nltk.stem.wordnet import WordNetLemmatizer

# Requires the NLTK tokenizer, POS tagger and WordNet data (see nltk.download).
lmtzr = WordNetLemmatizer()
aux_verbs = ['be']
past_tags = ('VBN', 'VBD')


def detect_passive_voice(pattern):
    """Return True if a list of (token, POS tag) pairs looks like passive voice."""
    # Every pattern handled here must end with the agent marker "by".
    if len(pattern) < 2 or pattern[-1][0] != 'by':
        return False

    # past verb + by
    if pattern[-2][1] in past_tags:
        return True

    # auxiliary verb + past verb + ... + by
    if len(pattern) >= 3 and pattern[0][1].startswith('V'):
        verb = lmtzr.lemmatize(pattern[0][0], 'v')
        return verb in aux_verbs and pattern[1][1] in past_tags

    return False

Running some examples:

In [4]: tokens = word_tokenize("was bought by")
   ...: tags = pos_tag(tokens)
   ...: detect_passive_voice(tags)
Out[4]: True

In [5]: tokens = word_tokenize("mailed the letter")
   ...: tags = pos_tag(tokens)
   ...: detect_passive_voice(tags)
Out[5]: False

In [7]: tokens = word_tokenize("was mailed by")
   ...: tags = pos_tag(tokens)
   ...: detect_passive_voice(tags)
Out[7]: True
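
To connect this back to the question, here is a minimal sketch of how a passive-voice flag could be used to orient the triple. The helpers `context_between` and `orient_relation` are hypothetical names introduced for illustration, not part of NLTK or of the original answer, and the sketch assumes both entity mentions are already known and appear verbatim in the sentence.

```python
def context_between(sentence, first_entity, second_entity):
    """Return the text between two entity mentions, or None if they do not occur in that order."""
    start = sentence.find(first_entity)
    if start == -1:
        return None
    start += len(first_entity)
    end = sentence.find(second_entity, start)
    if end == -1:
        return None
    return sentence[start:end].strip()


def orient_relation(first_entity, second_entity, predicate, is_passive):
    """Build a (subject, predicate, object) triple; passive voice swaps the two arguments."""
    if is_passive:
        return (second_entity, predicate, first_entity)
    return (first_entity, predicate, second_entity)


sentence = "Company_x was bought by Company_y"
# The context is what would be tokenized, tagged and checked for passive voice.
print(context_between(sentence, "Company_x", "Company_y"))
# The passive flag is passed in by hand here; in practice it would come from a detector.
print(orient_relation("Company_x", "Company_y", "bought", True))
```

Passing the flag in explicitly keeps the orientation step independent of whichever passive-voice detector is used.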

You can add more auxiliary verbs, and you can also allow adverbs or adjectives to appear in between.
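
One possible way to do this is with NLTK's RegexpParser, sketched below. The chunk grammar, the `AUX_FORMS` list and the function name `detect_passive_chunk` are assumptions made for illustration, and the input is tagged by hand so the sketch does not depend on a tagger model.

```python
from nltk import RegexpParser

# Assumed chunk grammar: a verb, optional adverbs, a past verb, then a preposition.
GRAMMAR = r"PASSIVE: {<VB.*><RB.*>*<VBN|VBD><IN>}"
# Surface forms treated as auxiliaries; an illustrative, incomplete list.
AUX_FORMS = {"is", "are", "was", "were", "be", "been", "being", "get", "gets", "got"}

parser = RegexpParser(GRAMMAR)


def detect_passive_chunk(tagged_tokens):
    """Return True if a PASSIVE chunk starts with an auxiliary and ends with 'by'."""
    tree = parser.parse(tagged_tokens)
    for chunk in tree.subtrees(filter=lambda t: t.label() == "PASSIVE"):
        words = [word.lower() for word, _tag in chunk.leaves()]
        if words[0] in AUX_FORMS and words[-1] == "by":
            return True
    return False


# Hand-tagged patterns (the tags are assumed, not produced by pos_tag).
print(detect_passive_chunk([("was", "VBD"), ("recently", "RB"), ("bought", "VBN"), ("by", "IN")]))
print(detect_passive_chunk([("mailed", "VBD"), ("the", "DT"), ("letter", "NN")]))
```

The grammar only constrains POS tags, so the check on the first and last words is what restricts matches to an auxiliary followed eventually by "by".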

Regarding "python - How to get directed entities in relation extraction?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/52923348/
