
python - My regex is not capturing the desired pattern in the text?

Reposted. Author: 行者123. Updated: 2023-12-04 03:45:52

I am trying to extract durations (date ranges) using a regular expression.

Sample text:

text = "Google, Inc 09/19 - 09/20 CA, USA"

Here is my regex:

pattern = fr"""
(?:
(
\d\d(?:\.|\/)\d\d\d\d|
(?:{months_abr})?
(?:{months_exp})?
(?:
(?:[\s\.\/\-]?\d{{2,4}})
)
)\s*(?:\-|to|\s)\s*
(
\d\d(?:\.|\/)\d\d\d\d|
(?:{months_abr})?
(?:{months_exp})?
(?:
(?:[\s\.\/\-]?\d{{2,4}})
)|
current|present|till\s?\-?date|till\s?\-?now|till\s?\-?date|to\s\-?present|until\s?\-?now|till\s?\-?now
)
)"""

find_all = re.findall(
pattern, text, flags=re.MULTILINE | re.VERBOSE | re.IGNORECASE
)

The output I get:

[('/19', '09')]
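
The reported output follows from the two date alternatives in the pattern. The first requires a four-digit year, and the second (with both optional month groups skipped) allows at most one separator followed by two to four digits, so "09/19" as a whole fits neither. The engine therefore finds its first match starting at "/19", and "09" then satisfies the end-date group. A minimal sketch that checks the two alternatives in isolation (the variable names are illustrative, not from the original post):

```python
import re

# First alternative of each group: two digits, "." or "/", then a FOUR-digit year.
alt_full_year = r"\d\d(?:\.|\/)\d\d\d\d"

# Second alternative with both optional month groups skipped:
# at most one separator character, then two to four digits.
alt_short = r"[\s\.\/\-]?\d{2,4}"

print(re.fullmatch(alt_full_year, "09/19"))            # None: the year has only two digits
print(re.fullmatch(alt_short, "09/19"))                # None: no way to consume "09/" before "19"
print(re.fullmatch(alt_short, "/19") is not None)      # True: this is what Group 1 captured
print(re.fullmatch(alt_short, "09") is not None)       # True: this is what Group 2 captured
```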

Best answer

You can use

pattern = fr"""
(?<!\d) # A position not immediately preceded by a digit
( # Group 1
(?:\d?\d[./])?\d\d(?:\d\d)? # one or two digits and . or / (optionally), two or four digits
| # or
(?:{months_abr}|{months_exp}) [\s./-]? \d\d(?:\d\d)? # month, space/dot/slash/hyphen and then two/four digits
) # end of Group 1
\s*(?:-|to)\s* # - or "to" enclosed with 0+ whitespaces
( # Group 2
(?:\d?\d[./])?\d\d(?:\d\d)?
|
(?:{months_abr}|{months_exp}) [\s./-]?\d\d(?:\d\d)?
|
current|present|(?:un)?till?\s?-?(?:date|now)|to\s-?present # some alternatives denoting time ("till", "until", ...)
)
"""

See the Python demo. Output: [('09/19', '09/20')]

See the regex demo.
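
For readers who want to run the answer end to end, here is a self-contained sketch. The post never shows how `months_abr` and `months_exp` are defined, so the values below are assumptions, and `find_durations` is a hypothetical helper name. The date sub-pattern is factored into one variable to avoid repeating it, and the "till" alternative is written as `(?:un)?till?` so that "until" is also accepted.

```python
import re

# ASSUMPTION: the original post does not show these definitions.
months_abr = "jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec"
months_exp = (
    "january|february|march|april|may|june|july|"
    "august|september|october|november|december"
)

# A date is either numeric (optional "MM/" or "MM." prefix, then a 2- or
# 4-digit year) or a month name followed by a 2- or 4-digit year.
date = (
    fr"(?:(?:\d?\d[./])?\d\d(?:\d\d)?"
    fr"|(?:{months_abr}|{months_exp})[\s./-]?\d\d(?:\d\d)?)"
)

pattern = fr"""
(?<!\d)              # not immediately preceded by a digit
({date})             # Group 1: start date
\s*(?:-|to)\s*       # "-" or "to", with optional surrounding whitespace
(                    # Group 2: end date or a word meaning "ongoing"
    {date}
  | current
  | present
  | (?:un)?till?\s?-?(?:date|now)
  | to\s-?present
)
"""

DURATION_RE = re.compile(pattern, flags=re.VERBOSE | re.IGNORECASE)


def find_durations(text):
    """Return a list of (start, end) tuples found in text."""
    return DURATION_RE.findall(text)


print(find_durations("Google, Inc 09/19 - 09/20 CA, USA"))
# [('09/19', '09/20')]
print(find_durations("Acme Jan 2019 to present"))
# [('Jan 2019', 'present')]
```

Because the pattern is compiled with `re.VERBOSE`, the literal spaces and `#` comments inside it are ignored; whitespace that should be matched is written as `\s`.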

Note: I decided to use \d\d rather than \d{2} to keep the code shorter, because in f-strings you need {{ and }} to write literal braces, and they make the string look ugly here.
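
The brace escaping mentioned in the note can be seen in a short sketch. The variable names, including `width`, are made up for illustration:

```python
import re

width = 2  # hypothetical value interpolated into the pattern

# In an f-string, "{{" and "}}" produce a literal "{" and "}".
with_quantifier = fr"\d{{2}}/\d{{2}}"

# Interpolating a value inside the quantifier needs three braces on each side:
# "{{" + "{width}" + "}}".
interpolated = fr"\d{{{width}}}/\d{{{width}}}"

# Repeating \d avoids the doubled braces altogether.
without_quantifier = r"\d\d/\d\d"

print(with_quantifier)   # \d{2}/\d{2}
print(interpolated)      # \d{2}/\d{2}
print(bool(re.fullmatch(without_quantifier, "09/19")))  # True
```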

Regarding "python - My regex is not capturing the desired pattern in the text?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/65168797/
