
Python: Word frequency using itertools.chain

Reposted. Author: 行者123. Updated: 2023-12-04 00:57:32

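I am trying to find word frequencies in a document using the code below. However, instead of word frequencies it returns character frequencies. Can someone explain why? I am following the article I got this code from, but since it does not show the output, I cannot verify it.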

sentence1 = [token for token in "hello how are you".split()]
sentence2 = [token for token in "i am fine thank you".split()]
print(sentence1)
from collections import Counter
import itertools

def map_word_frequency(document):
    print(document)
    return Counter(itertools.chain(*document))

word_counts = map_word_frequency((sentence1 + sentence2))

Best answer

Remove the call to itertools.chain:

from collections import Counter

sentence1 = [token for token in "hello how are you".split()]
sentence2 = [token for token in "i am fine thank you".split()]


def map_word_frequency(document):
    # document is already a flat list of words, so count it directly
    return Counter(document)


word_counts = map_word_frequency(sentence1 + sentence2)

print(word_counts)

Output

Counter({'you': 2, 'hello': 1, 'how': 1, 'are': 1, 'i': 1, 'am': 1, 'fine': 1, 'thank': 1})

The documentation gives the following example:

chain('ABC', 'DEF') --> A B C D E F

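So when

chain(*document)

is executed, it unpacks the list and passes each element of the list as a separate argument. With sentence1 + sentence2, every element is a single word (a string), and chain iterates over each string character by character, which is why you get character frequencies. A more concrete example: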

document = ['bad', 'bat', 'baby']
chain(*document)

is equivalent to:

chain('bad', 'bat', 'baby')
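
To make the effect visible, here is a minimal sketch that materializes what chain yields for that flat list of strings. The names `flattened` and `char_counts` are illustrative only and do not come from the original post.

```python
from collections import Counter
from itertools import chain

document = ['bad', 'bat', 'baby']

# Each string is itself an iterable of characters,
# so chain walks through them one character at a time.
flattened = list(chain(*document))
print(flattened)
# ['b', 'a', 'd', 'b', 'a', 't', 'b', 'a', 'b', 'y']

# Counting the flattened stream therefore counts characters, not words.
char_counts = Counter(flattened)
print(char_counts)
# Counter({'b': 4, 'a': 3, 'd': 1, 't': 1, 'y': 1})
```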

If you want to use chain, remove the concatenation sentence1 + sentence2 and pass a list of lists, [sentence1, sentence2], for example:

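from collections import Counter
from itertools import chain


def map_word_frequency(document):
    # document is a list of token lists; chain flattens it into a stream of words
    return Counter(chain(*document))


word_counts = map_word_frequency([sentence1, sentence2])

print(word_counts)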

Also note that using chain.from_iterable is preferred; for the example above:

Counter(chain.from_iterable(document))
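
As a rough sketch of how the two forms compare, the snippet below counts the same document both ways and then feeds chain.from_iterable a generator. The helper `sentences()` and the variable names are hypothetical, introduced only for illustration.

```python
from collections import Counter
from itertools import chain

sentence1 = "hello how are you".split()
sentence2 = "i am fine thank you".split()
document = [sentence1, sentence2]

# Both forms flatten one level and give the same word counts.
unpacked_counts = Counter(chain(*document))
from_iterable_counts = Counter(chain.from_iterable(document))
print(unpacked_counts == from_iterable_counts)  # True


def sentences():
    # Hypothetical generator standing in for a document read sentence by sentence.
    yield sentence1
    yield sentence2


# from_iterable consumes the outer iterable lazily, whereas the * form
# has to unpack every sentence into arguments before chain is called.
streamed_counts = Counter(chain.from_iterable(sentences()))
print(streamed_counts["you"])  # 2
```

The difference matters mostly when the outer iterable is large or produced on the fly; for a short in-memory list either form works.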

Regarding "Python: Word frequency using itertools.chain", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/61265456/
