
nlp - Is the NLTK WordNet lemmatizer language-independent?

Reposted. Author: 行者123. Updated: 2023-12-04 13:22:04

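Is NLTK's WordNet lemmatizer language-independent, that is, does it work regardless of the language of the input text? Would I use the same sequence of commands: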

>>> from nltk.stem import WordNetLemmatizer
>>> wnl = WordNetLemmatizer()
>>> print(wnl.lemmatize('dogs'))
dog
>>> print(wnl.lemmatize('churches'))
church
>>> print(wnl.lemmatize('aardwolves'))
aardwolf
>>> print(wnl.lemmatize('abaci'))
abacus
>>> print(wnl.lemmatize('hardrock'))
hardrock

for both English and French, for example?

Best answer

In short

No, the WordNet lemmatizer in NLTK works only for English.

In long

If we look at https://github.com/nltk/nltk/blob/develop/nltk/stem/wordnet.py#L15

class WordNetLemmatizer(object):

    def __init__(self):
        pass

    def lemmatize(self, word, pos=NOUN):
        # Ask WordNet for candidate lemmas; fall back to the input word if none are found
        lemmas = wordnet._morphy(word, pos)
        return min(lemmas, key=len) if lemmas else word

    def __repr__(self):
        return '<WordNetLemmatizer>'

It relies on the _morphy() function at https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L1764, which applies a set of English-specific suffix substitutions:

MORPHOLOGICAL_SUBSTITUTIONS = {
    NOUN: [('s', ''), ('ses', 's'), ('ves', 'f'), ('xes', 'x'),
           ('zes', 'z'), ('ches', 'ch'), ('shes', 'sh'),
           ('men', 'man'), ('ies', 'y')],
    VERB: [('s', ''), ('ies', 'y'), ('es', 'e'), ('es', ''),
           ('ed', 'e'), ('ed', ''), ('ing', 'e'), ('ing', '')],
    ADJ: [('er', ''), ('est', ''), ('er', 'e'), ('est', 'e')],
    ADV: []}

MORPHOLOGICAL_SUBSTITUTIONS[ADJ_SAT] = MORPHOLOGICAL_SUBSTITUTIONS[ADJ]
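
To make the English-only nature of these rules concrete, here is a minimal pure-Python sketch of how such suffix substitutions could be applied. It is a simplification, not NLTK's actual code: the names `TOY_LEMMAS`, `apply_rules` and `toy_lemmatize` are hypothetical, the tiny vocabulary stands in for the WordNet database, only one round of rules is applied, and the exception lists that NLTK also consults for irregular forms are left out.

```python
# Noun suffix rules copied from the MORPHOLOGICAL_SUBSTITUTIONS table quoted above.
NOUN_SUBSTITUTIONS = [
    ("s", ""), ("ses", "s"), ("ves", "f"), ("xes", "x"),
    ("zes", "z"), ("ches", "ch"), ("shes", "sh"),
    ("men", "man"), ("ies", "y"),
]

# Hypothetical stand-in for the WordNet lemma database (assumption, English only).
TOY_LEMMAS = {"dog", "church", "aardwolf", "fox"}


def apply_rules(form, substitutions):
    """Return every candidate produced by replacing a matching suffix."""
    return [
        form[: -len(old)] + new
        for old, new in substitutions
        if form.endswith(old)
    ]


def toy_lemmatize(word, lemmas=TOY_LEMMAS, substitutions=NOUN_SUBSTITUTIONS):
    """Keep only candidates found in the vocabulary; otherwise return the word unchanged."""
    candidates = [word] + apply_rules(word, substitutions)
    known = [c for c in candidates if c in lemmas]
    # Same fallback as in the lemmatize() method quoted above.
    return min(known, key=len) if known else word


if __name__ == "__main__":
    for w in ["dogs", "churches", "aardwolves", "chevaux"]:
        print(w, "->", toy_lemmatize(w))
```

In this sketch the French plural "chevaux" comes back unchanged: none of the suffix rules match it, and even if one did, the candidate would still have to exist in an English vocabulary to be accepted.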

Regarding "nlp - Is the NLTK WordNet lemmatizer language-independent?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50039310/
