Regex timing out

Repost. Author: 行者123. Updated: 2023-12-01 02:10:15

I am trying to match something like this:

foo: anything after the colon can be matched with (.*)+
foo.bar1.BAZ: balh5317{}({}(

This is the regex I am using:
/^((?:(?:(?:[A-Za-z_]+)(?:[0-9]+)?)+[\.]?)+)(?:\s)?(?:\:)(?:\s)?((?:.*)+)$/

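Please excuse the unmatched groups and extra parentheses; the pattern is compiled from a builder class.

This works for the examples. The problem appears when I feed it a string like this: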
foo.bar.baz.beef.stew.ect.and.forward

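I need to be able to check strings like this, but every time, after a certain number of "foo." segments, the regex engine times out or runs forever (as far as I can tell).

I am sure this is a logic problem I could work out, but unfortunately I am far from having mastered regex, and I am hoping a more experienced user can shed some light on how to make this more efficient.

In addition, here is a more detailed description of what I need to match: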
Property Name: can contain A-z, numbers, and underscores but can't start with a number

<Property Name>.<Property Name>.<Prop...:<Anything after the colon>

Thanks for your time!

Best answer

Starting with your regex:

^((?:(?:(?:[A-Za-z_]+)(?:[0-9]+)?)+[\.]?)+)(?:\s)?(?:\:)(?:\s)?((?:.*)+)$


^                           # Anchors to the beginning of the string.
(                           # Opens CG1 (capturing group 1)
    (?:                     # Opens NCG (non-capturing group)
        (?:                 # Opens NCG
            (?:             # Opens NCG
                [A-Za-z_]+  # Character class (any of the characters within)
            )               # Closes NCG
            (?:             # Opens NCG
                [0-9]+      # Character class (any of the characters within)
            )?              # Closes NCG
        )+                  # Closes NCG
        [\.]?               # Character class (any of the characters within)
    )+                      # Closes NCG
)                           # Closes CG1
(?:                         # Opens NCG
    \s                      # Token: \s (white space)
)?                          # Closes NCG
(?:                         # Opens NCG
    \:                      # Literal :
)                           # Closes NCG
(?:                         # Opens NCG
    \s                      # Token: \s (white space)
)?                          # Closes NCG
(                           # Opens CG2
    (?:                     # Opens NCG
        .*                  # . denotes any single character, except for newline
    )+                      # Closes NCG
)                           # Closes CG2
$                           # Anchors to the end of the string.
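
In engines that offer a free-spacing mode, a breakdown like the one above can live inside the pattern itself. The following is a minimal sketch in Python with re.VERBOSE. Python is an assumption here, since the question does not name a language, and ORIGINAL_VERBOSE is a name made up for illustration.

```python
import re

# The original pattern, unchanged, written in free-spacing mode.
# In re.VERBOSE, unescaped whitespace is ignored and # starts a comment.
ORIGINAL_VERBOSE = re.compile(r"""
    ^                       # start of the string
    (                       # CG1: the dotted property path
      (?:
        (?:
          (?: [A-Za-z_]+ )  # letters or underscores
          (?: [0-9]+ )?     # optional digits
        )+
        [\.]?               # optional dot
      )+
    )
    (?: \s )?               # optional whitespace
    (?: \: )                # literal colon
    (?: \s )?               # optional whitespace
    (                       # CG2: everything after the colon
      (?: .* )+
    )
    $                       # end of the string
""", re.VERBOSE)

match = ORIGINAL_VERBOSE.match("foo.bar1.BAZ: balh5317{}({}(")
print(match.group(1))
print(match.group(2))
```

Only a short, well-formed sample is matched here on purpose: this is still the pattern that times out on long inputs without a colon.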

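I converted [0-9] to \d, just to make it easier to read (both match the same thing on ASCII input). I also removed a lot of the non-capturing groups, since they were not really being used.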
^((?:(?:[A-Za-z_]+\d*)+\.?)+)\s?\:\s?((?:.*)+)$

I also merged the \s and .* into [\s\S]*, but seeing that it was followed by a + sign, I removed the group and just made it [\s\S]+.
^((?:(?:[A-Za-z_]+\d*)+\.?)+)\s?\:([\s\S]+)$
                      ^

Now, I am not sure what the + above the caret is supposed to do. We can remove it, and with it the non-capturing group around it.
^((?:[A-Za-z_]+\d*\.?)+)\s?\:([\s\S]+)$

Explanation:
^                   # Anchors to the beginning of the string.
(                   # Opens CG1
    (?:             # Opens NCG
        [A-Za-z_]+  # Character class (any of the characters within)
        \d*         # Token: \d (digit)
        \.?         # Literal .
    )+              # Closes NCG
)                   # Closes CG1
\s?                 # Token: \s (white space)
\:                  # Literal :
(                   # Opens CG2
    [\s\S]+         # Character class (any of the characters within)
)                   # Closes CG2
$                   # Anchors to the end of the string.
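
One way to gain confidence in a step-by-step simplification like this is to run every stage against the same samples. Below is a rough sketch in Python (an assumed language; STAGES, SAMPLES and name_part are illustrative names). It compares only whether each stage matches and what group 1 captures, because the tail of the pattern does change behaviour: once the optional \s after the colon is folded into [\s\S]+, group 2 keeps the leading space, and an empty value is no longer accepted.

```python
import re

# The four stages of the simplification, copied from the answer.
STAGES = {
    "original": r"^((?:(?:(?:[A-Za-z_]+)(?:[0-9]+)?)+[\.]?)+)(?:\s)?(?:\:)(?:\s)?((?:.*)+)$",
    "fewer groups": r"^((?:(?:[A-Za-z_]+\d*)+\.?)+)\s?\:\s?((?:.*)+)$",
    "merged tail": r"^((?:(?:[A-Za-z_]+\d*)+\.?)+)\s?\:([\s\S]+)$",
    "inner + removed": r"^((?:[A-Za-z_]+\d*\.?)+)\s?\:([\s\S]+)$",
}
# re.ASCII is an assumption; it keeps \d equivalent to [0-9].
COMPILED = {name: re.compile(p, re.ASCII) for name, p in STAGES.items()}

# Short samples only: long inputs without a colon can make these
# patterns backtrack for a very long time.
SAMPLES = [
    "foo: anything after the colon can be matched with (.*)+",
    "foo.bar1.BAZ: balh5317{}({}(",
    "foo.ba5r: x",
    "1foo: x",
    "foo.bar",
]

def name_part(pattern, text):
    """Return group 1 if the pattern matches, otherwise None."""
    m = pattern.match(text)
    return m.group(1) if m else None

for text in SAMPLES:
    print(repr(text), {name: name_part(p, text) for name, p in COMPILED.items()})

# The two differences introduced by the [\s\S]+ tail:
print(COMPILED["original"].match("foo:") is not None)
print(COMPILED["inner + removed"].match("foo:") is not None)
print(repr(COMPILED["original"].match("foo: x").group(2)))
print(repr(COMPILED["merged tail"].match("foo: x").group(2)))
```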

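Now, you may want to change [\s\S]+ back to .* if you are dealing with multiple lines. There are a few different options for this, but which one applies depends on the language you are using.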
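
To make the difference concrete, here is a small sketch in Python (an assumed language, since the question does not name one). The name part is shortened to [\w.]+ purely to keep the example readable; that shortcut is an assumption and not the pattern from this answer.

```python
import re

text = "foo: one\nbar.baz: two"

# [\s\S]+ also matches newlines, so one match swallows every following line.
whole = re.match(r"^([\w.]+):([\s\S]+)$", text)
print(repr(whole.group(2)))

# .* stops at a newline; with re.MULTILINE, ^ and $ anchor at each line,
# so every "name: value" line is matched separately.
per_line = re.findall(r"^([\w.]+):(.*)$", text, flags=re.MULTILINE)
print(per_line)
```

Other engines expose the same choice through different switches, for example a multiline flag or a "dot matches newline" flag.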

Honestly, I did this in steps, but the biggest issue was (?:.*)+, which tells the engine to match 0 or more characters, 1 or more times. That leads to catastrophic backtracking (as xufox linked to in the comments).
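
The effect can be observed on a small scale. The sketch below is in Python (an assumed language) and uses illustrative names (NESTED, FLAT, time_failure). It times a failed match on short inputs that contain no colon. NESTED is the name part from the second step above followed by a colon; its nested + quantifiers are another quantifier-inside-quantifier construct, and on a backtracking engine they are enough to trigger the blow-up for this kind of input. FLAT is an un-nested pattern for comparison. Inputs are kept short deliberately, because the time for NESTED is expected to grow very quickly with length.

```python
import re
import time

# Name part with a quantified group inside another quantified group.
NESTED = re.compile(r"^(?:(?:[A-Za-z_]+\d*)+\.?)+:")
# Un-nested comparison: each segment can only be split in one way.
FLAT = re.compile(r"^[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*:")

def time_failure(pattern, text):
    """Return (matched, seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start

for length in (6, 8, 10):
    text = "a" * length  # no colon, so both patterns must fail
    _, nested_s = time_failure(NESTED, text)
    _, flat_s = time_failure(FLAT, text)
    print(f"{length:2d} chars  nested={nested_s:.6f}s  flat={flat_s:.6f}s")
```

Exact timings depend on the machine and the engine, so none are shown here.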

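Both the resulting regex and your original regex allow a variable that ends with a dot (.). I would use something more like the following; your regex was really not far off.

This matches names like foo.ba5r, if that is acceptable. (Your previous regex accepts these as well, because its repeated group can match ba5 and r in two passes.)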
^([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\s?\:([\s\S]+)$

Explanation:
^                   # Anchors to the beginning of the string.
(                   # Opens CG1
    [A-Za-z_]       # Character class (any of the characters within)
    \w*             # Token: \w (a-z, A-Z, 0-9, _)
    (?:             # Opens NCG
        \.          # Literal .
        [A-Za-z_]   # Character class (any of the characters within)
        \w*         # Token: \w (a-z, A-Z, 0-9, _)
    )*              # Closes NCG
)                   # Closes CG1
\s?                 # Token: \s (white space)
\:                  # Literal :
(                   # Opens CG2
    [\s\S]+         # Character class (any of the characters within)
)                   # Closes CG2
$                   # Anchors to the end of the string.
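
As a usage sketch, here is the pattern from the breakdown above applied in Python (an assumed language; STRICT, LOOSE and parse are illustrative names). The re.ASCII flag is also an assumption: it limits \w to a-z, A-Z, 0-9 and _, which is what the breakdown describes. LOOSE is the earlier simplified pattern, included only to show the trailing-dot difference.

```python
import re

STRICT = re.compile(r"^([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\s?:([\s\S]+)$", re.ASCII)
LOOSE = re.compile(r"^((?:[A-Za-z_]+\d*\.?)+)\s?:([\s\S]+)$", re.ASCII)

def parse(line):
    """Split a line into (property names, value), or return None if invalid."""
    m = STRICT.match(line)
    if m is None:
        return None
    return m.group(1).split("."), m.group(2)

print(parse("foo.bar1.BAZ: balh5317{}({}("))
print(parse("foo.ba5r: x"))
print(parse("1foo: x"))
# The long colon-less input from the question now fails without nested repetition.
print(parse("foo.bar.baz.beef.stew.ect.and.forward"))
# A trailing dot is accepted by the earlier pattern but rejected by this one.
print(LOOSE.match("foo.: x") is not None, parse("foo.: x"))
```

Note that group 2 keeps the space after the colon; strip it afterwards if that is not wanted.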

On the topic of regex timeouts, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30011067/
