
python - How do I use a regular expression tagger in nltk?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:48:32

If I try this code:

import nltk
pattern = [(r'(March)$','MAR')]
tagger=nltk.RegexpTagger(pattern)
print(tagger.tag('He was born in March 1991'))

I get output like this:

[('H', None), ('e', None), (' ', None), ('w', None), ('a', None), ('s', None), (' ', None), ('b', None), ('o', None), ('r', None), ('n', None), (' ', None), ('i', None), ('n', None), (' ', None), ('M', None), ('a', None), ('r', None), ('c', None), ('h', None), (' ', None), ('1', None), ('9', None), ('9', None), ('1', None)]

What I actually want is for the tagger to recognize the word "March" and give it the tag "MAR".

Best answer

Try this instead:

import nltk
pattern = [(r'(March)$','MAR')]
tagger = nltk.RegexpTagger(pattern)
print(tagger.tag(nltk.word_tokenize('He was born in March 1991')))

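You have to tokenize the sentence into words first. tag() expects a list of tokens; if you pass it a plain string, it iterates over the string one character at a time, which is why every character came back as its own untagged item.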

This is the output I get:

[('He', None), ('was', None), ('born', None), ('in', None), ('March', 'MAR'), ('1991', None)]
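
To see why tokenizing matters, here is a minimal standard-library sketch of what a regular expression tagger does with its input. It is a simplified illustration, not NLTK's actual implementation: the helper name regexp_tag is hypothetical, and str.split() stands in for nltk.word_tokenize so the example runs without NLTK installed.

```python
import re

def regexp_tag(tokens, patterns):
    """Tag each item of `tokens` with the tag of the first pattern that matches it.

    Hypothetical helper for illustration only; it mimics the idea behind
    a regexp tagger, not NLTK's real code.
    """
    compiled = [(re.compile(regex), tag) for regex, tag in patterns]
    tagged = []
    for token in tokens:
        # re.match anchors at the start of the token; the first matching pattern wins.
        tag = next((t for rx, t in compiled if rx.match(token)), None)
        tagged.append((token, tag))
    return tagged

patterns = [(r'(March)$', 'MAR')]
sentence = 'He was born in March 1991'

# Iterating over a str yields single characters, so no item can ever match 'March'.
by_char = regexp_tag(sentence, patterns)

# Whitespace splitting is a stand-in for nltk.word_tokenize (an assumption of this sketch).
by_word = regexp_tag(sentence.split(), patterns)

print(by_char[:3])
print(by_word)
```

With the raw string, every item is a single character and gets the tag None. With a list of words, the item 'March' matches the pattern and gets 'MAR', just as in the output above.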

Regarding "python - How do I use a regular expression tagger in nltk?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14529782/
