python - Counting occurrences of a list of words across multiple text files

Reposted. Author: 行者123. Updated: 2023-12-04 07:33:24

I have a list of words and a list of files:

words = ["hello","my","name"]
files = ["file1.txt","file2.txt"]
What I want is to count how many times each word in the list occurs across all of the text files.
What I have so far:
import re
occ = []
for file in files:
    try:
        fichier = open(file, encoding="utf-8")
    except:
        pass
    data = fichier.read()
for wrd in words:
    count = sum(1 for _ in re.finditer(r'\b%s\b' % re.escape(wrd), data))
    occ.append(wrd + " : " + str(count))
texto = open("occurence.txt", "w+b")
for ww in occ:
    texto.write(ww.encode("utf-8")+"\n".encode("utf-8"))
This code works for a single file, but when I try it with a list of files, it only gives me the result for the last file.

Best answer

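Use a dictionary instead of a list, and do the counting inside the loop over the files so that every file contributes to the totals: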

import re

occ = {}  # Create an empty dictionary
words = ["hello", "my", "name"]
files = ["f1.txt", "f2.txt", "f3.txt"]
for file in files:
    try:
        fichier = open(file, encoding="utf-8")
    except OSError:
        pass  # Skip files that cannot be opened
    else:
        with fichier:  # Close the file once it has been read
            data = fichier.read()
        for wrd in words:
            # Count whole-word matches of wrd in this file
            count = sum(1 for _ in re.finditer(r'\b%s\b' % re.escape(wrd), data))
            if wrd in occ:
                occ[wrd] += count  # If wrd is already in dictionary, increment occurrence count
            else:
                occ[wrd] = count  # Else add wrd to dictionary with occurrence count

print(occ)
If you want it as a list of strings, as in your question:
occ_list = [ f"{key} : {value}" for key, value in occ.items() ]
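
For what it's worth, the same idea can be packaged as a small function using collections.Counter, with the result written in text mode rather than by encoding each line by hand. This is only a sketch under a few assumptions: the names count_words and write_counts are made up for illustration, matching stays case-sensitive as in the code above, and files that cannot be opened are silently skipped.

```python
import re
from collections import Counter
from pathlib import Path


def count_words(words, files):
    """Return a Counter with the total occurrences of each word across files."""
    # One compiled whole-word pattern per word, same regex as in the answer.
    patterns = {w: re.compile(r"\b%s\b" % re.escape(w)) for w in words}
    totals = Counter({w: 0 for w in words})  # words never seen still report 0
    for name in files:
        try:
            data = Path(name).read_text(encoding="utf-8")
        except OSError:
            continue  # skip files that cannot be opened
        for w, pattern in patterns.items():
            totals[w] += len(pattern.findall(data))
    return totals


def write_counts(totals, out_path):
    """Write one "word : count" line per word, using text mode."""
    with open(out_path, "w", encoding="utf-8") as out:
        for word, count in totals.items():
            out.write(f"{word} : {count}\n")
```

Calling write_counts(count_words(words, files), "occurence.txt") would then produce the file from the question, with one total per word instead of one entry per file.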

Regarding python - counting occurrences of a list of words across multiple text files, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67851820/
