
python - How do I get the number of times a string occurs in Python?

Reposted. Author: 行者123. Updated: 2023-12-04 03:34:24

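I wrote a function that detects the words I specify and shows which line each one appears on. However, I would also like to know how many times each of these words is repeated in the data, i.e. their counts.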

def search_multiple_strings_in_file(file_name, list_of_strings):
    """Get line from the file along with line numbers, which contains any string from the list"""
    line_number = 0
    list_of_results = []
    # Open the file passed in as file_name in read only mode
    with open(file_name, 'r') as read_obj:
        # Read all lines in the file one by one
        for line in read_obj:
            line_number += 1
            # For each line, check if line contains any string from the list of strings
            for string_to_search in list_of_strings:
                if string_to_search in line:
                    # If any string is found in line, then append that line along with line number in list
                    list_of_results.append((string_to_search, line_number, line.rstrip()))

    # Return list of tuples containing matched string, line numbers and lines where string is found
    return list_of_results

# search for given strings in the file 'hello.csv'

matched_lines = search_multiple_strings_in_file('hello.csv', ['pre existing ', 'exclusions', 'limitations', 'fourteen', 'authorize', 'frequency', 'automatic', 'renewal', 'provision', 'annual limit', 'fraud notice'])

print('Total Matched lines : ', len(matched_lines))
for elem in matched_lines:
    print('Word = ', elem[0], ' :: Line Number = ', elem[1], ' :: Line = ', elem[2])

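Is there a way to upload a sample CSV on SO? I am new here and not sure whether I have seen how to add attachments. In any case, this code should work with any dummy CSV.
I just want my final output to also show each word and its count, for example:
Words       Count
exclusions  10
renewal     22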

Best answer

A simple way to add counts to your current code is to use collections.defaultdict and increment (+=) the count for each matched string.
We can then pass the dict to pd.DataFrame.from_dict() to build the output DataFrame.
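
The counting step in isolation can be sketched as follows. This is only a minimal illustration: `sample_lines` and `terms` are hypothetical in-memory stand-ins for the CSV file and the search list, not data from the question.

```python
from collections import defaultdict

# Hypothetical in-memory lines standing in for the CSV file.
sample_lines = [
    "renewal terms and renewal dates",
    "exclusions apply",
    "no keywords here",
]
# Hypothetical search terms (a subset of the kind used in the question).
terms = ["renewal", "exclusions", "fraud notice"]

counts = defaultdict(int)  # missing keys start at 0
for line in sample_lines:
    for term in terms:
        if term in line:
            # str.count returns the number of non-overlapping occurrences
            counts[term] += line.count(term)

print(dict(counts))  # {'renewal': 2, 'exclusions': 1}
```

Note that a term that never matches ("fraud notice" here) gets no key at all, so it would not appear in a table built from this dict.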

import pandas as pd
from collections import defaultdict

def search_multiple_strings_in_file(file_name, list_of_strings):
    """Get line from the file along with line numbers, which contains any string from the list"""
    line_number = 0
    list_of_results = []
    # Running total of occurrences per search string; missing keys start at 0
    count = defaultdict(int)
    # Open the file passed in as file_name in read only mode
    with open(file_name, 'r') as read_obj:
        # Read all lines in the file one by one
        for line in read_obj:
            line_number += 1
            # For each line, check if line contains any string from the list of strings
            for string_to_search in list_of_strings:
                if string_to_search in line:
                    # Add every occurrence of the string in this line to its total
                    count[string_to_search] += line.count(string_to_search)
                    # If any string is found in line, then append that line along with line number in list
                    list_of_results.append((string_to_search, line_number, line.rstrip()))

    # Return list of tuples containing matched string, line numbers and lines where string is found,
    # plus a plain dict mapping each matched string to its total count
    return list_of_results, dict(count)


matched_lines, count = search_multiple_strings_in_file('hello.csv', ['pre existing ', 'exclusions', 'limitations', 'fourteen', 'authorize', 'frequency', 'automatic', 'renewal', 'provision', 'annual limit', 'fraud notice'])


# Build a two-column DataFrame: one row per matched string
df = pd.DataFrame.from_dict(count, orient='index').reset_index()
df.columns = ['Word', 'Count']

print(df)

Output
             Word  Count
0   pre existing       6
1        fourteen      5
2       authorize      5
3       frequency      5
4       automatic      5
5         renewal      5
6       provision      5
7    annual limit      6
8    fraud notice      6
9      exclusions      5
10    limitations      4
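
One thing to keep in mind is that `in` and `str.count` match substrings and are case-sensitive, so "renewal" is also counted inside "renewals" while "Renewal" is missed. If whole-word, case-insensitive counts are wanted instead, a variant along these lines might work. It is a sketch under stated assumptions, not part of the original answer: `count_terms` and `sample_lines` are hypothetical names, and it swaps `str.count` for `re` with word boundaries and a `Counter` pre-filled with zeros so that unmatched terms still show up.

```python
import re
from collections import Counter

def count_terms(lines, terms):
    """Count whole-word, case-insensitive occurrences of each term (hypothetical helper)."""
    # Pre-fill with zeros so terms that never match still appear in the result.
    counts = Counter({term: 0 for term in terms})
    # One compiled pattern per term; re.escape guards against regex metacharacters.
    patterns = {
        term: re.compile(r"\b" + re.escape(term.strip()) + r"\b", re.IGNORECASE)
        for term in terms
    }
    for line in lines:
        for term, pattern in patterns.items():
            counts[term] += len(pattern.findall(line))
    return dict(counts)

# Hypothetical lines standing in for the rows of the CSV file.
sample_lines = [
    "Renewal is automatic; renewals are annual.",
    "The annual limit applies. Annual limits vary.",
]
result = count_terms(sample_lines, ["renewal", "annual limit", "fraud notice"])
print(result)  # {'renewal': 1, 'annual limit': 1, 'fraud notice': 0}
```

To use it with a real file, the open file object can be passed as `lines`, since iterating over it yields one line at a time.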

Regarding "python - How do I get the number of times a string occurs in Python?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/67205666/
