
python - Finding a second possible search group with a regular expression

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:36:39

I'm writing a regular expression in Python ( https://regex101.com/r/eYzXfZ/2/ ). My current regex:

^[\d\W]{8}((?=.*Start execution of )|(?=.*Finish execution of))

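Right now it matches the time at the start of a line when the line contains one of the required substrings. What I can't work out is how to add a second group to the search that also captures the status (shown in square brackets) when it is present on that line. For example, after applying the regex to the following lines: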

01:01:01 - Start executing steps 1-3
01:01:03 - Start execution of steps group
01:01:04 - Start execution of step [1]
01:02:12 - Finish execution of step [1] with status [ok]
01:02:13 - Start execution of step [2]
01:02:48 - Finish execution of step [2] with status [ok]
01:02:48 - Start execution of step [3]
01:13:21 - Finish execution of step [3] with status [ok]
01:13:21 - Finish execution of steps group with status [success]
01:13:22 - Finish executing steps 1-3

I would like it to return:

['01:01:03', 
'01:01:04',
('01:02:12', 'ok'),
'01:02:13',
('01:02:48', 'ok'),
'01:02:48',
('01:13:21', 'ok'),
('01:13:21', 'success')]

Best answer

Regex

(^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*\[([a-zA-Z]+))?)

Link to Regex

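Output

Given your desired output, the regex above gives you the results you want. As you can see, it returns every required time (group 1) along with the optional status (group 2) when the line has one, and nothing extra!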

01:01:03
01:01:04
01:02:12, ok
01:02:13
01:02:48, ok
01:02:48
01:13:21, ok
01:13:21, success
02:01:02
02:01:02
02:03:10, ok
02:03:12
02:03:16, fail
02:03:16, failed
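
To get the mixed list of strings and tuples asked for in the question, the two groups still need a little post-processing in Python. The following is a minimal sketch, assuming the log is available as one multi-line string; the names `LOG`, `PATTERN`, and `extract` are illustrative and not part of the original answer.

```python
import re

# Sample log lines copied from the question.
LOG = """\
01:01:01 - Start executing steps 1-3
01:01:03 - Start execution of steps group
01:01:04 - Start execution of step [1]
01:02:12 - Finish execution of step [1] with status [ok]
01:02:13 - Start execution of step [2]
01:02:48 - Finish execution of step [2] with status [ok]
01:02:48 - Start execution of step [3]
01:13:21 - Finish execution of step [3] with status [ok]
01:13:21 - Finish execution of steps group with status [success]
01:13:22 - Finish executing steps 1-3
"""

# The regex from the answer; re.MULTILINE lets ^ match at every line start.
PATTERN = re.compile(
    r"(^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*\[([a-zA-Z]+))?)",
    re.MULTILINE,
)


def extract(log):
    """Return a bare time string, or (time, status) when a status was captured."""
    results = []
    # findall yields (time, status) tuples; status is '' when group 2 did not match.
    for time, status in PATTERN.findall(log):
        results.append((time, status) if status else time)
    return results


print(extract(LOG))
```

`re.findall` always returns two-element tuples here, with an empty string standing in for a missing status, so the conditional expression is what turns status-less lines back into plain strings.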

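Differences

You will notice it differs from yours in a few key places:

  1. You want the times returned, so they have to be grouped in parentheses: (^[\d:]{8}).

  2. You only want digits and colons in the time, so the regex says so explicitly: [\d:] instead of [\d\W].

    Note: for (2) above, (^[\d:]+) works as well.

  3. It drops the group around the lookaheads. You don't need to group the lookahead, because you don't want that text returned to your Python code. So the extra parentheses are gone!

  4. It merges the two lookaheads into one: (?:Start|Finish) execution of

  5. It adds the status requirement you were missing to the lookahead: (?:.*\[([a-zA-Z]+))? The status should be captured, so you need the parentheses after the opening square bracket!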
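
The effect of points 1 to 3 can be checked with a short Python sketch. It assumes a three-line excerpt of the question's log, and the names `SAMPLE`, `ORIGINAL`, and `REVISED` are illustrative only.

```python
import re

# A short excerpt of the log from the question.
SAMPLE = """\
01:01:01 - Start executing steps 1-3
01:01:04 - Start execution of step [1]
01:02:12 - Finish execution of step [1] with status [ok]
"""

# The regex from the question: its only capturing group wraps lookaheads.
ORIGINAL = r"^[\d\W]{8}((?=.*Start execution of )|(?=.*Finish execution of))"
# The regex from the answer: the time and the status are the captured groups.
REVISED = r"(^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*\[([a-zA-Z]+))?)"

# Lookaheads consume no text, so the original group only ever captures ''.
original_hits = re.findall(ORIGINAL, SAMPLE, re.MULTILINE)
revised_hits = re.findall(REVISED, SAMPLE, re.MULTILINE)

# [\d\W] accepts any non-word character, [\d:] only digits and colons.
loose = re.fullmatch(r"[\d\W]{8}", "--:--:--") is not None
strict = re.fullmatch(r"[\d:]{8}", "--:--:--") is not None

print(original_hits)
print(revised_hits)
print(loose, strict)
```

The original pattern does find the right lines, but `findall` has nothing useful to return because the only group is zero-width; moving the parentheses onto the time is what makes the match visible.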

Other regexes that work

# Implicit status label, explicit letters for status
(^[\d:]+)(?=.*(?:Start|Finish) execution of (?:.*\[([a-zA-Z]+))?)
(^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*\[([a-zA-Z]+))?)
(^[\d\W]{8})(?=.*(?:Start|Finish) execution of (?:.*\[([a-zA-Z]+))?)

# Explicit status label, explicit letters for status
(^[\d:]+)(?=.*(?:Start|Finish) execution of (?:.*status \[([a-zA-Z]+))?)
(^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*status \[([a-zA-Z]+))?)
(^[\d\W]{8})(?=.*(?:Start|Finish) execution of (?:.*status \[([a-zA-Z]+))?)

# Explicit status label, implicit letters for status
(^[\d:]+)(?=.*(?:Start|Finish) execution of (?:.*status \[(.*?)\])?)
(^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*status \[(.*?)\])?)
(^[\d\W]{8})(?=.*(?:Start|Finish) execution of (?:.*status \[(.*?)\])?)

# NOTE: FAILS - Implicit status label and implicit letter for status
# (^[\d:]+)(?=.*(?:Start|Finish) execution of (?:.*\[(.*?)\])?)
# (^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*\[(.*?)\])?)
# (^[\d\W]{8})(?=.*(?:Start|Finish) execution of (?:.*\[(.*?)\])?)


# Answers from other posters

^([\d\W]{8})(?=(?=.*Start execution of )|(?=.*Finish execution of))(?=.*?status \[(.*?)\])?


# Customize

# If you prefer the split lookahead, then you can customize any of the above with the middle section
# For example...
(^[\d:]+)(?=(?=.*Start execution of )|(?=.*Finish execution of)(?:.*\[([a-zA-Z]+))?)
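
The group marked FAILS above goes wrong because, without either the `status ` label or the letters-only class, the optional group also matches the step number in square brackets. A small sketch to confirm this, using two lines from the question's log (the variable names are illustrative):

```python
import re

START_LINE = "01:01:04 - Start execution of step [1]"
FINISH_LINE = "01:02:12 - Finish execution of step [1] with status [ok]"

# Implicit label and implicit contents: any [...] on the line can be captured.
FAILING = r"(^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*\[(.*?)\])?)"
# Explicit label: only a bracket preceded by "status " is captured.
LABELLED = r"(^[\d:]{8})(?=.*(?:Start|Finish) execution of (?:.*status \[(.*?)\])?)"

failing_start = re.findall(FAILING, START_LINE, re.MULTILINE)
labelled_start = re.findall(LABELLED, START_LINE, re.MULTILINE)
failing_finish = re.findall(FAILING, FINISH_LINE, re.MULTILINE)

print(failing_start)    # the step number is wrongly reported as a status
print(labelled_start)   # no status on a Start line
print(failing_finish)   # the greedy .* still reaches the last bracket here
```

On the Finish line both variants agree, since the greedy `.*` backtracks from the end of the line and reaches the last bracket first; the difference only shows on lines whose sole bracket is the step number.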

About "python - Finding a second possible search group with a regular expression": we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57650375/
