python - Count occurrences of a multi-word substring in some text

Reposted. Author: 行者123. Last updated: 2023-12-04 13:29:29

To count a single-word substring in some text, I can use some_text.split().count(single_word_substring). How can I do the same for a multi-word substring?
Examples:

text = 'he is going to school. abc is going to school. xyz is going to school.'
to_be_found = 'going to school'
The count should be 3.

text = 'he is going to school. abc is going to school. xyz is going to school.'
to_be_found = 'going to'
The count should be 3.

text = 'he is going to school. abc is going to school. xyz is going to school.'
to_be_found = 'go'
The count should be 0.

text = 'he is going to school. abc-xyz is going to school. xyz is going to school.'
to_be_found = 'school'
The count should be 3.

text = 'he is going to school. abc-xyz is going to school. xyz is going to school.'
to_be_found = 'abc-xyz'
The count should be 1.

Assumption 1: everything is lowercase.
Assumption 2: the text can contain anything.
Assumption 3: to_be_found can also contain anything, e.g. car with 4 passengers, xyz & abc, etc.
Note: a regex-based solution is acceptable. I am just curious whether this can be done without regex (a nice-to-have, and only for others who may be interested in this in the future).

Best answer

Here is a working solution using a regular expression:

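import re

def occurrences(text, to_be_found):
    # Escape the phrase so characters such as '&' or '+' are matched literally.
    phrase = re.escape(to_be_found)
    # Count matches that have no word character directly before or after them.
    return len(re.findall(rf'(?<!\w){phrase}(?!\w)', text))
In the regular expression, \w matches a word character (a letter, digit, or underscore). The negative lookarounds (?<!\w) and (?!\w) require that the phrase is not directly preceded or followed by one, so whitespace, punctuation, and the start or end of the text all count as boundaries.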
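The question also asks whether this can be done without regular expressions. Below is a minimal sketch of one way, assuming that a "word character" means a letter, digit, or underscore (roughly what \w matches). The names is_word_char and count_phrase are hypothetical helpers introduced for illustration; they are not part of the original answer.

```python
def is_word_char(ch: str) -> bool:
    """Assumption: a word character is a letter, digit, or underscore."""
    return ch.isalnum() or ch == "_"


def count_phrase(text: str, phrase: str) -> int:
    """Count occurrences of phrase that are not attached to a word character."""
    if not phrase:
        return 0
    count = 0
    start = text.find(phrase)
    while start != -1:
        end = start + len(phrase)
        # A match is valid if it touches the text boundary or a non-word character.
        before_ok = start == 0 or not is_word_char(text[start - 1])
        after_ok = end == len(text) or not is_word_char(text[end])
        if before_ok and after_ok:
            count += 1
            # Resume after the match, so counted matches never overlap.
            start = text.find(phrase, end)
        else:
            # Rejected candidate: try again one character further on.
            start = text.find(phrase, start + 1)
    return count


text = 'he is going to school. abc-xyz is going to school. xyz is going to school.'
print(count_phrase(text, 'going to school'))  # 3
print(count_phrase(text, 'go'))               # 0
print(count_phrase(text, 'abc-xyz'))          # 1
```

The loop relies only on str.find and character checks, so no escaping of the phrase is needed. It is a sketch under the stated assumption about word characters; a different definition of a boundary (for example, treating '-' as part of a word) would need a different is_word_char.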

Regarding python - Count occurrences of a multi-word substring in some text, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/65928241/
