
python - Why do Python regular expression strings sometimes work without using raw strings?

Reposted. Author: 太空狗. Updated: 2023-10-30 01:46:29

Python recommends using raw strings when defining regular expressions for the re module. From the Python documentation:

Regular expressions use the backslash character ('\') to indicate special forms or to allow special characters to be used without invoking their special meaning. This collides with Python’s usage of the same character for the same purpose in string literals; for example, to match a literal backslash, one might have to write '\\\\' as the pattern string, because the regular expression must be \\, and each backslash must be expressed as \\ inside a regular Python string literal.

However, in many cases this is not necessary, and you get the same result whether or not you use a raw string:

$ ipython

In [1]: import re

In [2]: m = re.search("\s(\d)\s", "a 3 c")

In [3]: m.groups()
Out[3]: ('3',)

In [4]: m = re.search(r"\s(\d)\s", "a 3 c")

In [5]: m.groups()
Out[5]: ('3',)

In some cases, however, that does not hold:

In [6]: m = re.search("\s(.)\1\s", "a 33 c")

In [7]: m.groups()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-12-84a8d9c174e2> in <module>()
----> 1 m.groups()

AttributeError: 'NoneType' object has no attribute 'groups'

In [8]: m = re.search(r"\s(.)\1\s", "a 33 c")

In [9]: m.groups()
Out[9]: ('3',)

And when not using a raw string, the special characters have to be escaped:

In [10]: m = re.search("\\s(.)\\1\\s", "a 33 c")

In [11]: m.groups()
Out[11]: ('3',)

My question is: why do unescaped, non-raw regular expression strings containing special characters work at all (as in command [2] above)?

Best answer

The example above works because \s and \d are not escape sequences in Python. According to the documentation:

Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. 

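By contrast, \1 is a recognized escape sequence (an octal escape for the character with code 1), so in a non-raw string the regex engine never sees the backreference, which is why command [6] fails to match.

Still, it is best to just use raw strings and not worry about what is or is not a Python escape, or about whether a later change to the regex will turn one into an escape.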
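The difference can be checked directly by looking at the characters each literal produces. The following is a minimal sketch, not part of the original answer: `parse_literal` is a hypothetical helper that evaluates the source text of a string literal with `ast.literal_eval`, with warnings silenced because newer Python versions warn about unrecognized escapes such as "\s" in non-raw literals.

```python
import ast
import re
import warnings


def parse_literal(source: str) -> str:
    """Hypothetical helper: evaluate string-literal source text as Python would."""
    with warnings.catch_warnings():
        # Newer Python versions warn about unrecognized escapes like "\s".
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


# What Python produces when you type "\s(\d)\s" without the r prefix:
plain_ok = parse_literal(r'"\s(\d)\s"')
# What Python produces when you type "\s(.)\1\s" without the r prefix:
plain_backref = parse_literal(r'"\s(.)\1\s"')
raw_backref = r"\s(.)\1\s"

# \s and \d are not Python escapes, so the backslashes survive unchanged.
print(plain_ok == r"\s(\d)\s")

# \1 is an octal escape, so it collapses to the single character chr(1).
print(repr(plain_backref))
print(plain_backref == raw_backref)

# The regex engine therefore looks for a literal chr(1), not a backreference.
print(re.search(plain_backref, "a 33 c"))
print(re.search(raw_backref, "a 33 c").groups())
```

Printing `repr(...)` of a pattern before handing it to `re` is a quick way to see what the regex engine actually receives.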

Regarding "python - Why do Python regular expression strings sometimes work without using raw strings?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28334871/
