
java - Using capturing groups with reluctant, greedy, and possessive quantifiers

Reposted. Author: 太空宇宙. Updated: 2023-11-04 06:49:50

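I was practicing Java regular expressions with Oracle's tutorial. To better understand greedy, reluctant, and possessive quantifiers, I created some examples. My question is how these quantifiers work when applied to capturing groups. I don't understand quantifiers used this way; the reluctant quantifier, for example, looks as if it doesn't work at all. I also searched a lot online and only saw expressions like (.*?). Is there a reason why people usually use quantifiers with that syntax rather than something like "(.foo)??"?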

Here is the reluctant example:

Enter your regex: (.foo)??
Enter input string to search: xfooxxxxxxfoo
I found the text "" starting at index 0 and ending at index 0.
I found the text "" starting at index 1 and ending at index 1.
I found the text "" starting at index 2 and ending at index 2.
I found the text "" starting at index 3 and ending at index 3.
I found the text "" starting at index 4 and ending at index 4.
I found the text "" starting at index 5 and ending at index 5.
I found the text "" starting at index 6 and ending at index 6.
I found the text "" starting at index 7 and ending at index 7.
I found the text "" starting at index 8 and ending at index 8.
I found the text "" starting at index 9 and ending at index 9.
I found the text "" starting at index 10 and ending at index 10.
I found the text "" starting at index 11 and ending at index 11.
I found the text "" starting at index 12 and ending at index 12.
I found the text "" starting at index 13 and ending at index 13.

If it is reluctant, shouldn't it show "xfoo" for index 0 and 4? And here is the possessive one:

Enter your regex: (.foo)?+ 
Enter input string to search: afooxxxxxxfoo
I found the text "afoo" starting at index 0 and ending at index 4.
I found the text "" starting at index 4 and ending at index 4.
I found the text "" starting at index 5 and ending at index 5.
I found the text "" starting at index 6 and ending at index 6.
I found the text "" starting at index 7 and ending at index 7.
I found the text "" starting at index 8 and ending at index 8.
I found the text "xfoo" starting at index 9 and ending at index 13.
I found the text "" starting at index 13 and ending at index 13.

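And for possessive, shouldn't it try the input only once? I'm really confused, especially by this one, because it seems to try every possibility.

Thanks in advance!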

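Best answer

The regex engine checks (basically) every character of your string one by one, starting from the left, trying to make them fit your pattern. It returns the first match it finds.

A reluctant quantifier applied to a subpattern means that the regex engine will give priority to (as in, try first) the subpattern that follows it.

See what happens step by step with .*?b on aabab: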

aabab # we try to make '.*?' match zero '.', skipping it to try and
^     # ... match b: that doesn't work (we're on an 'a'), so we reluctantly
      # ... backtrack and match one '.' with '.*?'
aabab # again, we by default try to skip the '.' and go straight for b:
 ^    # ... again, doesn't work. We reluctantly match two '.' with '.*?'
aabab # FINALLY there's a 'b'. We can skip the '.' and move forward:
  ^   # ... the 'b' in '.*?b' matches, regex is over, 'aab' is the overall match
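
The trace above can be reproduced with java.util.regex. This is only a minimal sketch: the class name ReluctantVsGreedy and the helper firstMatch are made up for illustration. It contrasts the reluctant .*?b with the greedy .*b on the same input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class ReluctantVsGreedy {
    // Returns the text of the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Reluctant: '.*?' expands one character at a time until 'b' fits.
        System.out.println(firstMatch(".*?b", "aabab")); // aab
        // Greedy: '.*' grabs everything, then gives characters back until 'b' fits.
        System.out.println(firstMatch(".*b", "aabab"));  // aabab
    }
}
```

Both patterns start matching at index 0; they differ only in how much the quantified part is allowed to take before the following b is tried.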

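In your pattern, there is no equivalent of the b. The (.foo) is optional, and the engine gives priority to the part of the pattern that follows it.

That part is nothing, and nothing matches an empty string: an overall match is found, and it is always an empty string.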
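
To see the effect of "something following" the reluctant group, here is a small sketch. The class OptionalGroupDemo and the helper allMatches are hypothetical names, and the trailing $ is just one arbitrary choice of following subpattern, not something from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class OptionalGroupDemo {
    // Collects the text of every match, the way a find() loop reports them.
    static List<String> allMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Nothing follows the group: skipping it always succeeds,
        // so every position yields an empty match.
        System.out.println(allMatches("(.foo)??", input).size()); // 14
        // '$' follows the group: skipping fails everywhere except at the end,
        // so the engine reluctantly lets the group match "xfoo" at index 9.
        System.out.println(allMatches("(.foo)??$", input)); // [xfoo, ]
    }
}
```

This is presumably why expressions like (.*?) are usually seen with something after them: a reluctant quantifier only does visible work when the rest of the pattern forces it to expand.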


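As for possessive quantifiers, you are confused about what they do. They have no direct effect on the number of matches; a possessive quantifier only prevents the engine from backtracking into what the quantified subpattern has already matched. It is not clear which tool you use to apply your regex, but it looks for global matches (it keeps searching after each match), which is why it doesn't stop at the first one.

See http://www.regular-expressions.info/possessive.html for more information about them.

Also, as HamZa pointed out, https://stackoverflow.com/a/22944075 is becoming a great reference for regex-related questions.

Regarding "java - Using capturing groups with reluctant, greedy, and possessive quantifiers", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23474945/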
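
The two points can be separated in a short sketch. The class PossessiveDemo and the helper allMatches are hypothetical, and the pattern (.foo)?+.foo with the input "afoo" is an illustration chosen here, not taken from the question. The repeated find() calls are what produce many matches; possessiveness only shows up when the rest of the pattern would need the group to give something back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class PossessiveDemo {
    // Global search: keep calling find() until no further match exists.
    static List<String> allMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The loop, not the quantifier, explains the eight matches in the question.
        System.out.println(allMatches("(.foo)?+", "afooxxxxxxfoo").size()); // 8
        // Greedy: the group first takes "afoo", then backtracks to zero
        // repetitions so that the trailing '.foo' can match.
        System.out.println(allMatches("(.foo)?.foo", "afoo"));  // [afoo]
        // Possessive: the group keeps "afoo", nothing is left for '.foo'.
        System.out.println(allMatches("(.foo)?+.foo", "afoo")); // []
    }
}
```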
