
regex - Matching a^n b^n c^n for n > 0 with PCRE

Reposted. Author: 行者123. Updated: 2023-12-03 21:19:57

How would you match a^n b^n c^n for n > 0 with PCRE?

The following cases should match:

abc
aabbcc
aaabbbccc

The following cases should not match:
abbc
aabbc
aabbbccc

This is what I have "tried": /^(a(?1)?b)$/gmx, but this matches a^n b^n for n > 0:
ab
aabb
aaabbb
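
To see why the attempt stops at a^n b^n, the recursion in (a(?1)?b) can be unrolled into a small recursive function. This is a plain-Python sketch of what the pattern does on a single line, not a call into PCRE, and the function names are made up for illustration:

```python
from typing import Optional


def match_group1(s: str, i: int) -> Optional[int]:
    """Emulate group 1, a(?1)?b, starting at index i.

    Returns the index just past the match, or None if there is no match.
    """
    if i >= len(s) or s[i] != "a":      # the literal 'a'
        return None
    j = i + 1
    inner = match_group1(s, j)          # (?1)? : try the optional recursion first
    if inner is not None and inner < len(s) and s[inner] == "b":
        return inner + 1                # the literal 'b' after the nested match
    if j < len(s) and s[j] == "b":      # recursion skipped: plain "ab"
        return j + 1
    return None


def attempted_pattern_matches(s: str) -> bool:
    """Emulate ^(a(?1)?b)$ on one line (hypothetical helper)."""
    return match_group1(s, 0) == len(s)
```

Every nested call adds exactly one a on the left and one b on the right, so the a's and b's stay balanced, but nothing in the pattern ever looks at a c. That is why attempted_pattern_matches("aabb") is True while attempted_pattern_matches("aabbcc") is False.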

Online demo

Note: This question is the same as this one with the change in language.

Best answer

Qtax trick

(The mighty self-referencing capturing group)

^(?:a(?=a*(\1?+b)b*(\2?+c)))+\1\2$

This solution has also been named "the Qtax trick", as it uses the same technique as "vertical" regex matching in an ASCII "image" by Qtax.

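The problem in question boils down to a need to assert that three groups are matched with the same length. As a simplified version, to match:
xyz

where x, y and z are really just subpatterns with a variable matching length n of a, b and c. With an expression that uses lookaheads with self-referencing capturing groups, a character we specify is added on each repetition of the lookahead, which can effectively be used to "count":
aaabbbccc
^  ^  ^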

This is achieved by the following:

  • (?:a…)+ A character of subpattern a is matched. With (?=a*, we skip directly to the "counter".
  • (\1?+b) Capturing group 1 first re-consumes whatever it captured on the previous iteration, if anything. On the first iteration the group is not set yet, so \1?+ matches nothing. The ?+ is possessive, so the engine cannot backtrack and give that text up again. One more character of subpattern b is then matched and added to the capturing group, effectively "counting" one more b. If the counter goes out of sync, that is, if the b's run out before the a's do, this extra b cannot be matched and the match fails. With b*, we skip directly to the next "counter".
  • (\2?+c) Capturing group 2 re-consumes its previous capture in the same way and then adds one c. Because it works just like the previous group, the captures stay in sync in length. Assuming continuous sequences of a..b..c..:

(Excuse my art.)

First iteration:

| The first 'a' is matched by the 'a' in '^(?:a…)'.
| The pointer is stuck after it as we begin the lookahead.
v,- Matcher pointer
aaaa...bbbbbbbb...cccc...
 ^^^   |^^^       ^
skipped| skipped  Matched by c in (\2?+c);
by a*  | by b*      \2 was "nothing",
       |            now it is "c".
       Matched by b
       in (\1?+b).
       \1 was "nothing", now it is "b".

Second iteration:

 | The second 'a' is matched by the 'a' in '^(?:a…)'.
 | The pointer is stuck after it as we begin the lookahead.
 v,- Matcher pointer
aaaa...bbbbbbbb...cccc...
  ^^   /|^^^      |^
  |   / | skipped |Matched by c in (\2?+c);
  |  /  | by b*   |  \2 was "c", now it is "cc".
  | /   |         eaten by \2?+
  | |   Matched by b in (\1?+b);
  | |     \1 was "b", now it is "bb".
  | eaten by \1?+
  skipped by a*

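Since the three groups discussed above each "consume" one of a, b and c respectively, the characters are matched in round-robin fashion and "counted" by the (?:a…)+, (\1?+b) and (\2?+c) groups respectively. With the additional anchoring and by matching the captures again after the loop (the trailing \1\2$), we can assert that we match xyz (representing each group above), where x, y and z are a^n, b^n and c^n respectively.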
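
The counting described above can be replayed step by step outside a regex engine. The following is a plain-Python emulation of ^(?:a(?=a*(\1?+b)b*(\2?+c)))+\1\2$, written as a sketch for illustration rather than a call into PCRE (Python's standard re module, for one, rejects a backreference to a group that is still open). The function name emulate and the trace it returns are assumptions made for this example:

```python
from typing import List, Optional, Tuple


def emulate(s: str) -> Tuple[bool, List[Tuple[str, str]]]:
    """Return (matched, trace); trace holds (group 1, group 2) after each iteration."""
    n = len(s)
    g1: Optional[str] = None   # capture of group 1; None means "not set yet"
    g2: Optional[str] = None   # capture of group 2
    pos = 0                    # matcher pointer of the outer (?:a...)+ loop
    trace: List[Tuple[str, str]] = []

    while pos < n and s[pos] == "a":        # the literal 'a' of (?:a...)
        look = pos + 1                      # the lookahead starts right after it
        while look < n and s[look] == "a":  # a* skips the remaining a's
            look += 1

        start1 = look                       # group 1 opens here
        if g1 is not None and s.startswith(g1, look):
            look += len(g1)                 # \1?+ eats the old capture and keeps it
        if look >= n or s[look] != "b":     # the extra 'b' that bumps the counter
            break                           # lookahead failed, the loop stops
        look += 1
        new_g1 = s[start1:look]

        while look < n and s[look] == "b":  # b* skips the remaining b's
            look += 1

        start2 = look                       # group 2 opens here
        if g2 is not None and s.startswith(g2, look):
            look += len(g2)                 # \2?+ eats the old capture
        if look >= n or s[look] != "c":     # the extra 'c'
            break
        look += 1
        new_g2 = s[start2:look]

        g1, g2 = new_g1, new_g2
        trace.append((g1, g2))
        pos += 1                            # only the 'a' itself is consumed

    if g1 is None or g2 is None:            # the + needs at least one iteration
        return False, trace
    return s[pos:] == g1 + g2, trace        # the trailing \1\2$
```

For example, emulate("aabbcc") returns (True, [("b", "c"), ("bb", "cc")]), the same capture values as in the two diagrams. The emulation never backtracks; this relies on the observation that for this particular pattern any backtracking into a*, b* or the outer loop leaves the engine in front of a character that cannot match.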

As a bonus, to "count" more, one can do this:

Pattern: ^(?:a(?=a*(\1?+b)b*(\2?+c)))+\1{3}\2$
Matches: abbbc
aabbbbbbcc
aaabbbbbbbbbccc

Pattern: ^(?:a(?=a*(\1?+bbb)b*(\2?+c)))+\1\2$
Matches: abbbc
aabbbbbbcc
aaabbbbbbbbbccc
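
Both bonus patterns are meant to accept strings of the form a^n b^(3n) c^n. When trying them in an engine, a direct length check is a handy reference to compare against. A minimal sketch in plain Python, where the function name is_scaled and the parameter k are assumptions made for this example:

```python
def is_scaled(s: str, k: int = 3) -> bool:
    """Reference check for a^n b^(k*n) c^n with n > 0 (no regex involved)."""
    n = s.count("a")
    # Rebuild the only string of this shape with n a's and compare.
    return n > 0 and s == "a" * n + "b" * (k * n) + "c" * n
```

With k=1 the same function checks the original a^n b^n c^n language, so it can also serve as an oracle for the test cases listed in the question.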

Regarding regex - Matching a^n b^n c^n for n > 0 with PCRE, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29866345/
