
Python function to find the ratio of unique words to total words isn't working... why?

Reposted. Author: 行者123. Updated: 2023-11-30 23:31:30

Why doesn't this code work?

def hapax_legomana_ratio(text):
    ''' Return the hapax_legomana ratio for this text.
    This ratio is the number of words that occur exactly once divided
    by the total number of words.
    text is a list of strings each ending in \n.
    At least one line in text contains a word.'''

    uniquewords=dict()
    words=0
    for line in text:
        line=line.strip().split()
        for word in line:
            words+=1
            if word in words:
                uniquewords[word]-=1
            else:
                uniquewords[word]=1
    HLR=len(uniquewords)/words

    print (HLR)

When I test it, it gives me the wrong answer. For example, when there are 3 unique words among 9 strings, it gives 0.20454545454545456 instead of .33333.

Best answer

To find the ratio of unique words (words that occur exactly once) to the total number of words in a text:

from collections import Counter

def hapax_legomana_ratio(text):
    words = text.split() # a word is anything separated by a whitespace
    return sum(count == 1 for count in Counter(words).values()) / len(words)

It assumes that text is a string. If you have a list of lines instead, you can build the words list like this:

words = [word for line in all_lines for word in line.split()]
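
As for why the loop in the question goes wrong, three things stand out in the code as posted: `if word in words` tests membership against the integer counter `words` instead of the `uniquewords` dict (on an int this raises a TypeError), a repeated word has its count decremented rather than incremented, and `len(uniquewords)` counts every distinct word rather than only the words seen exactly once. Below is a minimal sketch of a corrected loop for a list of lines. It assumes a word is any whitespace-separated token, with no case folding or punctuation stripping; the name `hapax_ratio_from_lines` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def hapax_ratio_from_lines(lines):
    """Return (number of words occurring exactly once) / (total number of words).

    `lines` is a list of strings; words are whitespace-separated tokens.
    (Hypothetical helper name, not from the original post.)
    """
    counts = Counter()
    total = 0
    for line in lines:
        for word in line.split():
            counts[word] += 1  # increment on every occurrence
            total += 1
    if total == 0:
        raise ValueError("text contains no words")
    # Keep only the words whose final count is exactly 1.
    once = sum(1 for c in counts.values() if c == 1)
    return once / total


# Illustrative data: 9 words in total; c, d and e each occur exactly once.
sample = ["a b c\n", "a b d\n", "a b e\n"]
print(hapax_ratio_from_lines(sample))  # 0.3333333333333333
```

The check against `total == 0` is an added safeguard; the question's docstring already guarantees that at least one line contains a word, so it can be dropped if that guarantee holds.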

Regarding "Python function to find the ratio of unique words to total words isn't working... why?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/19885673/
