
python - Counting unique words using collections and a DataFrame

Reposted. Author: 行者123. Updated: 2023-12-05 02:28:18

I have a problem: I want to count the unique words in a DataFrame, but unfortunately it only counts the first sentence.

                          text
0 hello is a unique sentences
1 hello this is a test
2 does this works
import pandas as pd

d = {
    "text": ["hello is a unique sentences",
             "hello this is a test",
             "does this works"],
}
df = pd.DataFrame(data=d)


from collections import Counter

# Count unique words
def counter_word(text_col):
    print(len(text_col.values))  # number of rows
    count = Counter()
    for i, text in enumerate(text_col.values):
        print(i)  # index of the row being processed
        for word in text.split():
            count[word] += 1
    # Returning inside the loop would stop after the first row.
    return count

counter = counter_word(df['text'])
len(counter)

Best answer

I think a simpler approach is to join the values with spaces, then split the result into words and count them:

counter = Counter((' '.join(df['text'])).split())

print(counter)
Counter({'hello': 2, 'is': 2, 'a': 2, 'this': 2, 'unique': 1, 'sentences': 1, 'test': 1, 'does': 1, 'works': 1})
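
For comparison, here is a minimal sketch of two other ways to get the same tallies. It assumes the symptom in the question comes from a `return` that ended up inside the loop; the original indentation is not preserved here, so that is a guess rather than a confirmed diagnosis. The names `unique_words` and `word_counts` are illustrative and do not appear in the original post.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({"text": ["hello is a unique sentences",
                            "hello this is a test",
                            "does this works"]})


def counter_word(text_col):
    """Count every whitespace-separated word across all rows."""
    count = Counter()
    for text in text_col:
        # update() adds one to the tally of each word in the list
        count.update(text.split())
    # Return only after every row has been processed (assumed fix)
    return count


counter = counter_word(df["text"])
unique_words = len(counter)  # number of distinct words

# pandas-only alternative: split each row into a list of words,
# explode to one word per row, then tally the occurrences
word_counts = df["text"].str.split().explode().value_counts()
```

`len(counter)` and `len(word_counts)` both give the number of distinct words. The `Counter` version keeps plain Python integers, while `value_counts()` returns a pandas Series, which is handier if the counts feed further DataFrame work.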

Regarding "python - Counting unique words using collections and a DataFrame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/72683540/
