
regex - How can the dot metacharacter match a newline?

Reposted. Author: 行者123. Updated: 2023-12-01 11:41:17

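I thought the dot . in a regex matches any character except line-ending characters.

However, in R I find that the dot matches anything, including the line-break characters \n, \r, and \r\n: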

grep(c("\r","\n","\r\n"),pattern=".")
[1] 1 2 3

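Can anyone explain this discrepancy?

Best answer

The page at http://www.regular-expressions.info/dot.html explains that the rule that the dot does not match line-break characters exists mainly for historical reasons: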

The first tools that used regular expressions were line-based. They would read a file line by line, and apply the regular expression separately to each line. The effect is that with these tools, the string could never contain line breaks, so the dot could never match them.



However,

Modern tools and languages can apply regular expressions to very large strings or even entire files. Except for JavaScript and VBScript, all regex flavors discussed here have an option to make the dot match all characters, including line breaks.



Apparently R is one such language: by default, the dot matches every character. (See Joshua's comment above, which suggests checking ?regex and the POSIX 1003.2 standard.)
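
To make the default behavior concrete, here is a minimal sketch that tests each string separately with grepl. It assumes R's default regex engine (not perl = TRUE); the variable names are made up for illustration. The negated bracket expression is one possible way to get the traditional "anything but a newline" meaning back in this mode, not something the linked page prescribes.

```r
x <- c("\r", "\n", "\r\n")

# Default engine: the dot matches in every element, including the lone "\n".
default_dot <- grepl(".", x)

# A negated bracket expression excludes the newline explicitly, so the
# element that consists only of "\n" no longer matches.
no_newline <- grepl("[^\n]", x)

print(default_dot)
print(no_newline)
```

The third element still matches the bracket expression because its first character, \r, is not a newline.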

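The page I linked above also mentions Perl and says that, in its default mode, the dot does not match a newline.

Note that R's grep function has a perl option. If you turn it on, you get different output: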
> grep(".", c("\r","\n","\r\n"), perl = TRUE)
[1] 1 3

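This tells me that \n is treated as a line break but \r is not. The third string, "\r\n", still matches because the dot can match its \r. Comparing the output of cat("\r") with that of cat("\n") should confirm this.

(In case it makes a difference, I'm on Mac OS.)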
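
The quoted page mentions an option that makes the dot match line breaks. With perl = TRUE the pattern is handled by PCRE, where that option can be written inline as (?s). The sketch below assumes that modifier and uses made-up variable names; it is an illustration, not part of the original answer.

```r
x <- c("\r", "\n", "\r\n")

# PCRE default: the dot does not match "\n", so element 2 drops out.
perl_default <- grep(".", x, perl = TRUE)

# With the inline (?s) modifier the dot also matches "\n".
perl_dotall <- grep("(?s).", x, perl = TRUE)

# In "\r\n" the match starts at position 1, i.e. on the "\r".
first_match <- as.integer(regexpr(".", "\r\n", perl = TRUE))

print(perl_default)
print(perl_dotall)
print(first_match)
```

The last call is a quick check that it is the \r, not the \n, that lets the third string match in Perl mode.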

Regarding "regex - How can the dot metacharacter match a newline?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20437151/
