
python - How to count a character

Reposted. Author: 行者123. Last updated: 2023-11-28 22:18:35

I have a file like the one below, and I want to count how many times each person mentions other people:

peter @amy 
tom @amy
tom @amy
peter @tom
edwin @amy
amy @peter
tom @john @peter
amy @edwin
tom @peter
peter @john
peter @john
john @tom?
edwin @john
edwin @amy
amy @tom

I tried using:

for line in fhand:
    if "@" in line:
        indexStart = line.find("@")

But I don't know what to do next. The expected output is:

tom 5
amy 3
edwin 3
peter 5
john 1

Is there a way to do this?

Best answer

Option 1
re.findall + collections.Counter

import re
from collections import Counter

with open('test.txt') as f:
    data = re.findall(r'(?m)^(\w+).*@.*$', f.read())

print(Counter(data))

# Counter({'tom': 5, 'peter': 4, 'edwin': 3, 'amy': 3, 'john': 1})

Regex explanation:

(?m)    # enables multiline mode, so ^ and $ match at every line
^       # asserts position at the start of the line
(\w+)   # captures one or more word characters in group 1 (this is the name you want)
.*      # greedily matches any characters except line breaks
@       # matches an @ symbol
.*      # greedily matches any characters except line breaks
$       # asserts position at the end of the line
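
To make the behaviour of this pattern concrete, here is a minimal sketch that runs it against a few of the lines from the question. The text is held in a string rather than read from test.txt, and the names SAMPLE and count_mentioning_lines are hypothetical, chosen only for this illustration. It shows that a line with two mentions still contributes only one to the count.

```python
import re
from collections import Counter

# A few lines taken from the question, kept in memory instead of test.txt.
SAMPLE = """\
peter @amy
tom @amy
tom @john @peter
john @tom?
"""


def count_mentioning_lines(text):
    """Count, per leading name, the lines that contain at least one '@'."""
    # findall returns only group 1 (the leading name) for each matching line.
    names = re.findall(r'(?m)^(\w+).*@.*$', text)
    return Counter(names)


counts = count_mentioning_lines(SAMPLE)
print(counts)
# tom is counted twice (two lines), even though those lines hold three mentions.
```

Because the pattern matches at most once per line, this approach counts lines, not individual @ symbols.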

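If you actually need the number of times each person mentioned others, rather than just the number of lines in which they mentioned someone:

Option 2
Using collections.defaultdict: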

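from collections import defaultdict

with open('test.txt') as f:
    dct = defaultdict(int)
    for line in f:
        # first token is the author; every '@' on the line is one mention
        dct[line.split()[0]] += line.count('@')

print(dct)

# defaultdict(<class 'int'>, {'peter': 5, 'amy': 3, 'tom': 5, 'edwin': 3, 'john': 2})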
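
As a variation on the same idea, the sketch below counts mentions per author on the fifteen lines exactly as they appear in the question, held in a list instead of a file. It swaps in two assumptions that are not part of the answer: a Counter instead of a defaultdict, and the pattern @\w+ instead of line.count('@'), so that a bare @ with no name after it is not counted. The names SAMPLE_LINES and count_mentions are hypothetical. The totals apply to these fifteen lines only and can differ from output produced from another file.

```python
import re
from collections import Counter

# The fifteen lines as shown in the question.
SAMPLE_LINES = [
    "peter @amy", "tom @amy", "tom @amy", "peter @tom", "edwin @amy",
    "amy @peter", "tom @john @peter", "amy @edwin", "tom @peter",
    "peter @john", "peter @john", "john @tom?", "edwin @john",
    "edwin @amy", "amy @tom",
]


def count_mentions(lines):
    """Return a Counter mapping each author to the number of @name mentions."""
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=1)
        if len(parts) < 2:
            continue  # skip blank lines and lines with no text after the author
        author, rest = parts
        # Each '@' followed by at least one word character is one mention.
        counts[author] += len(re.findall(r'@\w+', rest))
    return counts


totals = count_mentions(SAMPLE_LINES)
print(totals.most_common())
# [('tom', 5), ('peter', 4), ('edwin', 3), ('amy', 3), ('john', 1)]
```

Skipping lines with fewer than two tokens also avoids the IndexError that line.split()[0] would raise on an empty line.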

Option 3
Living on the edge with pandas:

import pandas as pd

with open('test.txt') as f:
    data = [i.split(' ', 1) for i in f.read().splitlines()]

df = pd.DataFrame(data)
print(df.groupby(0).sum()[1].str.count('@'))

# Result

0
amy 3
edwin 3
john 2
peter 5
tom 5

Regarding "python - How to count a character", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50423965/
