
Python: check whether any word in a list of words matches any pattern in a list of regular expression patterns

Reposted. Author: 太空狗. Updated: 2023-10-29 22:09:40

I have a long list of words and regular expression patterns in a .txt file, which I read in like this:

with open(fileName, "r") as f1:
    pattern_list = f1.read().split('\n')

To illustrate, the first seven look like this:

print(pattern_list[:7])
# ['abandon*', 'abuse*', 'abusi*', 'aching', 'advers*', 'afraid', 'aggress*']

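I want to know whenever a word in an input string matches any of the words/patterns in pattern_list. The code below sort of works, but I see two problems:

  1. First, it seems inefficient to re.compile() every item in my pattern_list each time I check a new string_input... but when I tried to store the re.compile(raw_str) objects in a list (so that I could reuse the already-compiled regex list for something more like if w in regex_compile_list:), it did not work properly.
  2. Second, it sometimes does not work the way I expect. Notice how
    • abuse* matches abusive
    • abusi* matches abused and abuse
    • ache* matches aching

What am I doing wrong, and how can I be more efficient? Thanks in advance for your patience with a noob, and thanks for any insight!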

import re

string_input = "People who have been abandoned or abused will often be afraid of adversarial, abusive, or aggressive behavior. They are aching to abandon the abuse and aggression."
for raw_str in pattern_list:
    pat = re.compile(raw_str)  # recompiled for every input string
    for w in string_input.split():
        if pat.match(w):
            print("matched:", raw_str, "with:", w)
#matched: abandon* with: abandoned
#matched: abandon* with: abandon
#matched: abuse* with: abused
#matched: abuse* with: abusive,
#matched: abuse* with: abuse
#matched: abusi* with: abused
#matched: abusi* with: abusive,
#matched: abusi* with: abuse
#matched: ache* with: aching
#matched: aching with: aching
#matched: advers* with: adversarial,
#matched: afraid with: afraid
#matched: aggress* with: aggressive
#matched: aggress* with: aggression.

Best answer

To match shell-style wildcards, you could (ab)use the module fnmatch.
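The surprising matches in the question follow from reading the patterns as regular expressions rather than as wildcards. The short comparison below is only an illustrative sketch; the variable names are made up for the example.

```python
import fnmatch
import re

# As a regular expression, "abuse*" means "abus" followed by zero or more "e",
# and re.match anchors only at the start of the string, not at the end.
regex_hit = re.match("abuse*", "abusive") is not None

# As a shell-style wildcard, "abuse*" means "abuse" followed by anything.
glob_hit = fnmatch.fnmatchcase("abusive", "abuse*")
glob_hit_abused = fnmatch.fnmatchcase("abused", "abuse*")

print(regex_hit, glob_hit, glob_hit_abused)  # True False True
```

So the regex reading accepts "abusive" because the trailing "e" is optional, while the wildcard reading requires the literal prefix "abuse".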

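Because fnmatch is primarily designed for filename comparison, the test will be case sensitive or case insensitive depending on your operating system. So you will have to normalize both the text and the patterns (I use lower() for this purpose in the session below).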
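If the OS-dependent behavior is a concern, one possible variation (not part of the answer itself) is fnmatch.fnmatchcase, which always compares case-sensitively, combined with explicit lowering of both sides. The helper name matches_ignoring_case is hypothetical.

```python
import fnmatch

def matches_ignoring_case(word, pattern):
    # Hypothetical helper. fnmatchcase never applies the operating system's
    # case rules, so lowering both sides gives the same result everywhere.
    return fnmatch.fnmatchcase(word.lower(), pattern.lower())

print(matches_ignoring_case("Abandoned", "abandon*"))  # True
print(matches_ignoring_case("AFRAID", "afraid"))       # True
print(matches_ignoring_case("abusive", "abuse*"))      # False
```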

>>> import fnmatch

>>> pattern_list = ['abandon*', 'abuse*', 'abusi*', 'aching', 'advers*', 'afraid', 'aggress*']
>>> string_input = "People who have been abandoned or abused will often be afraid of adversarial, abusive, or aggressive behavior. They are aching to abandon the abuse and aggression."
>>> string_input = string_input.lower()  # normalize case before matching

>>> for pattern in pattern_list:
...     l = fnmatch.filter(string_input.split(), pattern)
...     if l:
...         print(pattern, "match", l)

This produces:

abandon* match ['abandoned', 'abandon']
abuse* match ['abused', 'abuse']
abusi* match ['abusive,']
aching match ['aching']
advers* match ['adversarial,']
afraid match ['afraid']
aggress* match ['aggressive', 'aggression.']
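On the efficiency point raised in the question, one possible approach is to compile every wildcard once and reuse the compiled objects for each new input. The sketch below assumes fnmatch.translate is acceptable for converting the wildcards to regular expressions; find_matches is a hypothetical helper, and stripping punctuation is an added assumption so that words like "aggression." compare cleanly.

```python
import fnmatch
import re
import string

pattern_list = ['abandon*', 'abuse*', 'abusi*', 'aching', 'advers*', 'afraid', 'aggress*']

# Compile each wildcard once; fnmatch.translate turns a shell-style
# pattern into an equivalent regular expression string.
compiled = [(p, re.compile(fnmatch.translate(p.lower()))) for p in pattern_list]

def find_matches(text):
    # Hypothetical helper: returns (pattern, word) pairs for every match.
    # Surrounding punctuation is stripped and case is normalized first.
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    return [(p, w) for p, rx in compiled for w in words if rx.match(w)]

hits = find_matches("They are aching to abandon the abuse and aggression.")
print(hits)
# [('abandon*', 'abandon'), ('abuse*', 'abuse'), ('aching', 'aching'), ('aggress*', 'aggression')]
```

Note that a test such as if w in regex_compile_list cannot work, because in compares the word with each compiled pattern object for equality instead of calling match on it.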

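Regarding "Python: check whether any word in a list of words matches any pattern in a list of regular expression patterns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/17068486/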
