python - Regex recurses too often

Reposted by: 行者123 Updated: 2023-11-28 16:48:37

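I am trying to write a regex that matches an optionally quoted value (the valid quote characters are ", ' and `). The rule is that two consecutive quote characters stand for one escaped quote.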

This is the regex I came up with:

(?P<quote>["'`])?(?P<value>(?(quote)((?!(?P=quote).)|((?=(?P=quote)).){2})*|[^\s;]*))(?(quote)(?P=quote)|)

Now in readable form (the comments show what I think it does):

(?P<quote>["'`])?                   #named group Quote (any quoting character?)

    (?P<value>                      #name this group "value", what I am interested in
        (?(quote)               #if quoted 
            ((?!(?P=quote).)|((?=(?P=quote)).){2})* #see below
                                    #match either anything that is not the quote
                                    #or match 2 quotes
        |
            [^\s;]*         #match anything that is not whitespace or ; (my separators if there are no quotes)
        )
    )

(?(quote)(?P=quote)|)               #if we had a leading quote we need to consume a closing quote

It works fine for unquoted strings, but crashes on quoted ones:

    match = re.match(regexValue, line)
  File "****/jython2.5.1/Lib/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
RuntimeError: maximum recursion depth exceeded

What am I doing wrong?

Edit: sample input => output (for the capture group "value", which is the one I need)

text    => text
'text'  => text
te xt   => te
'te''xt'=> te''xt   #quote=' => strreplace("''","'") => desired result: te'xt
'te xt' => te xt

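edit2: While looking at it I found an error, see below, but I believe the above still holds => it may be a Jython bug, but it still does not do what I want it to do :( The difference is very subtle: the dot has moved out of the lookahead group.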

new:(?P<quote>["'`])?(?P<value>(?(quote)((?!(?P=quote)).|((?=(?P=quote)).){2})*|[^\s;]*))(?(quote)(?P=quote)|)
old:(?P<quote>["'`])?(?P<value>(?(quote)((?!(?P=quote).)|((?=(?P=quote)).){2})*|[^\s;]*))(?(quote)(?P=quote)|)
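For reference, a minimal sketch of how the corrected (`new`) pattern can be checked against the sample inputs above. It assumes CPython's `re` module rather than Jython 2.5.1, so it says nothing about whether the Jython recursion error goes away, and `extract_value` is a hypothetical helper name. Note that in the `old` pattern the dot sits inside the negative lookahead, so that alternative consumes no characters at all, which may be what sends the `*` loop into runaway recursion.

```python
import re

# The "new" pattern from above, unchanged.
PATTERN = re.compile(
    r"""(?P<quote>["'`])?(?P<value>(?(quote)((?!(?P=quote)).|((?=(?P=quote)).){2})*|[^\s;]*))(?(quote)(?P=quote)|)"""
)


def extract_value(line):
    """Return the text captured by the "value" group at the start of line."""
    match = PATTERN.match(line)
    return match.group("value") if match else None


# Run the sample inputs from the question through the pattern.
for sample in ["text", "'text'", "te xt", "'te''xt'", "'te xt'"]:
    print("%-10s => %s" % (sample, extract_value(sample)))
```

The doubled quotes are still present in the captured value at this point; collapsing them is a separate string replacement step.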

Best answer

As suggested in the comments, I recommend being explicit and writing out all the possibilities:

r = r"""
    ([^"'`]+)
    |
    " ((?:""|[^"])*) "
    |
    ' ((?:''|[^'])*) '
    |
    ` ((?:``|[^`])*) `
"""

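When extracting the matches, you can rely on the fact that only one of the four groups is ever filled, and simply drop all the empty groups: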

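import re

r = re.compile(r, re.X)  # re.X: whitespace inside the pattern is ignored
for m in r.findall(''' "fo""o" and 'bar''baz' and `quu````x` '''):
    print(''.join(m))  # only one group per match is non-empty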
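To get from there to the result the question asks for, the doubled quotes still have to be collapsed. The following is a minimal sketch under a few assumptions: the unquoted branch uses the question's "stop at whitespace or ;" rule instead of the broader class above, the match is anchored at the start of the input, and `VALUE_RE` and `parse_value` are hypothetical names chosen for illustration.

```python
import re

# One alternative per quoting style, plus an unquoted fallback.
VALUE_RE = re.compile(r"""
      " ((?:""|[^"])*) "      # group 1: double-quoted value
    | ' ((?:''|[^'])*) '      # group 2: single-quoted value
    | ` ((?:``|[^`])*) `      # group 3: backtick-quoted value
    | ([^\s;"'`]+)            # group 4: unquoted, ends at whitespace or ;
""", re.X)


def parse_value(text):
    """Return the first value in text with escaped quotes collapsed."""
    m = VALUE_RE.match(text)
    if m is None:
        return None
    # Pair each quote character with the group that captures its style.
    for quote, group in zip('"\'`', m.groups()[:3]):
        if group is not None:
            return group.replace(quote * 2, quote)
    return m.group(4)


for sample in ["text", "'text'", "te xt", "'te''xt'", "'te xt'"]:
    print("%-10s => %s" % (sample, parse_value(sample)))
```

Because each quoting style has its own alternative, no backreference or conditional group is needed, so the pattern avoids the nested repetition used in the question.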

Regarding python - Regex recurses too often, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/10852873/
