
regex - Grep's "Invalid range end": bug or feature?

Reposted. Author: 行者123. Updated: 2023-12-01 16:34:57

I have these three files:

$ cat pattern-ok 
['\-]
$ cat pattern-buggy
[\-']
$ cat text
abc'def-ghi

Now, is the following a bug, or a regex feature that I am not aware of?

$ cat text | grep -f pattern-ok 
abc'def-ghi
$ cat text | grep -f pattern-buggy
grep: Invalid range end

I am using:

$ grep --version | head -n 1
grep (GNU grep) 2.20

Best answer

This happens because you placed the hyphen between other characters, so grep reads it as a range, and that range happens to be invalid.

You are basically doing

grep "[\-']" file

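grep interprets this as a range of characters to check, as in grep "[a-z]" file. But the range from \ to ' is not valid, because \ sorts after ' (in ASCII, \ is 92 and ' is 39), hence the error.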
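To see why this particular range is rejected, it can help to look at the character codes. The sketch below assumes GNU grep, a POSIX shell, and the C locale (forced with LC_ALL so that sort order is plain byte order); the variable names are only illustrative.

```shell
# Numeric codes of the two range endpoints. A leading quote in the
# argument makes printf print the code of the character that follows.
start=$(printf '%d' "'\\")   # backslash
end=$(printf '%d' "''")      # single quote
echo "start=$start end=$end"

# The range start sorts after the range end, so grep rejects the pattern.
# grep documents exit status 2 for errors such as this one.
status=0
echo "abc'def-ghi" | LC_ALL=C grep "[\-']" >/dev/null 2>&1 || status=$?
echo "grep exit status: $status"
```

Checking the exit status rather than the message text keeps the sketch independent of how a given grep build words the error.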

Why does the other one work, you may ask? Because what you are doing there is:

grep "['\-]" file

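In this case, you are looking for any of the characters ', \ or - in the file. The hyphen comes last in the bracket expression, so it is taken literally.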

Here is another example, in which I want to look for any of the characters a, - or 3 in a given string:

$ echo "23-2" | grep -o '[a-3]'
grep: Invalid range end
$ echo "23-2" | grep -o '[a3-]'
3
-
$ echo "23-2" | grep -o '[a3\-]'
3
-

So the basic problem is that you are using an expression of the form [ + some character + - + another character + ], and grep tries to read it as a range of characters between some character and another character.


How to solve this?

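If you want to match the character - along with others, just put it at the edge of the bracket expression: either as the first or as the last item.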
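As a small sketch of that fix, again assuming GNU grep in the C locale, both placements below treat the hyphen literally. The variable names are made up for illustration, and printf is used instead of echo only as a portability choice.

```shell
text="abc'def-ghi"

# Hyphen first: it cannot be the middle of a range, so it is literal.
first=$(printf '%s\n' "$text" | LC_ALL=C grep -o "[-']")

# Hyphen last: same effect, and the placement the man page recommends.
last=$(printf '%s\n' "$text" | LC_ALL=C grep -o "['-]")

printf '%s\n' "$first"
printf '%s\n' "$last"
```

Both commands should print the quote and the hyphen, one per line, in the order they occur in the text.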

From man grep:

Character Classes and Bracket Expressions

A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^ then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.

Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.

Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [[:alnum:]] means the character class of numbers and letters in the current locale. In the C locale and ASCII character set encoding, this is the same as [0-9A-Za-z]. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket expression.)

Most meta-characters lose their special meaning inside bracket expressions. To include a literal ] place it first in the list. Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.
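The rules quoted above can be tried out together. This is a minimal sketch assuming GNU grep; LC_ALL=C is set so that ranges and classes follow the traditional ASCII interpretation described in the man page, and the sample strings and variable names are invented for illustration.

```shell
# ']' first, '^' not first, '-' last: all three are literal here.
literals=$(printf '%s\n' 'a]b^c-d' | LC_ALL=C grep -o '[]^-]')
printf '%s\n' "$literals"

# In the C locale, [a-d] is equivalent to [abcd], so uppercase is skipped.
range=$(printf '%s\n' 'aBbCcDd' | LC_ALL=C grep -o '[a-d]')
printf '%s\n' "$range"

# A named class needs its own brackets inside the bracket expression.
alnum=$(printf '%s\n' 'x-7_Q' | LC_ALL=C grep -o '[[:alnum:]]')
printf '%s\n' "$alnum"
```

Without LC_ALL=C, the second command may match differently in locales that sort in dictionary order, which is the caveat the man page describes.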

On the topic of regex - Grep's "Invalid range end": bug or feature?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26754181/
