A Detailed Guide to Regular Expressions on Linux (Basic and Extended Regular Expression Command Examples)

Reposted by: qq735679552  Updated: 2022-09-29 22:32:09


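Preface

Regular expressions are widely used: most programming languages support them, and they are just as useful on Linux.

With a regular expression you can filter out exactly the text you need, then hand it to a tool or language that supports regular expressions to finish the task.

This post uses grep/egrep to run the regular expressions. Other tools such as sed would work too, but sed relies heavily on regular expressions, so this post comes first and the sed post follows; readers who need both can read the two together. Note that egrep is equivalent to grep -E, and recent GNU grep releases treat the egrep name as obsolescent, so grep -E is the safer spelling in new scripts.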

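Types of regular expressions

Regular expressions are implemented by a regular expression engine, the underlying software that interprets a pattern and uses it to match text.

The engines most commonly used on Linux are:

- The POSIX Basic Regular Expression (BRE) engine.

- The POSIX Extended Regular Expression (ERE) engine.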
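To make the difference between the two engines concrete, here is a minimal sketch. It assumes GNU grep, where plain grep uses BRE and grep -E uses ERE; the sample strings are made up for illustration.

```shell
#!/bin/sh
# In BRE an unescaped "+" is an ordinary character; in ERE it is a quantifier.
sample='aaa
a+'

# BRE: 'a+' means the letter a followed by a literal plus sign.
bre_hits=$(printf '%s\n' "$sample" | grep 'a+')

# ERE: 'a+' means one or more occurrences of the letter a.
ere_hits=$(printf '%s\n' "$sample" | grep -E 'a+')

printf 'BRE:\n%s\n' "$bre_hits"
printf 'ERE:\n%s\n' "$ere_hits"
```

With this input the BRE search should report only the line "a+", while the ERE search reports both lines.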

Basic usage of basic regular expressions

Preparing the sample text

    [root@service99 ~]# mkdir /opt/regular
    [root@service99 ~]# cd /opt/regular
    [root@service99 regular]# pwd
    /opt/regular
    [root@service99 regular]# cp /etc/passwd temp_passwd

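Plain text

Plain text matches the corresponding word exactly. Keep in mind that regular expression patterns are strictly case-sensitive.

    # grep --color highlights the matched text, which makes the result easier to see
    [root@service99 regular]# grep --color "root" temp_passwd
    root:x:0:0:root:/root:/bin/bash
    operator:x:11:0:operator:/root:/sbin/nologin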
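The case sensitivity mentioned above can be checked with a small sketch. The two input lines are hypothetical; -c (count matching lines) and -i (ignore case) are standard grep options.

```shell
#!/bin/sh
# Patterns are case-sensitive unless -i is given.
lines='Root login
root login'

# Counts only the line containing lowercase "root".
exact=$(printf '%s\n' "$lines" | grep -c 'root')

# -i ignores case, so both lines are counted.
nocase=$(printf '%s\n' "$lines" | grep -ci 'root')

echo "case-sensitive matches:   $exact"
echo "case-insensitive matches: $nocase"
```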

A regular expression is not limited to whole words: the pattern matches wherever the defined text appears in the data stream.

    [root@service99 regular]# ifconfig eth1 | grep --color "add"
    eth1   Link encap:Ethernet HWaddr 54:52:01:01:99:02
    inet addr:192.168.2.99 Bcast:192.168.2.255 Mask:255.255.255.0
    inet6 addr: fe80::5652:1ff:fe01:9902/64 Scope:Link

Nor is it limited to a single word: the text string may contain spaces and digits.

    [root@service99 regular]# echo "This is line number 1" | grep --color "ber 1"
    This is line number 1

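Special characters

There is one thing to watch for when using text strings in a regular expression pattern.

A few characters are exceptions: the regular expression engine gives them a special meaning, so using them as plain text may not produce the result you expect.

The special characters recognized by regular expressions:

    .*[]^${}+?|()

To use one of these as an ordinary text character, escape it: put a special character in front of it that tells the regular expression engine to treat the next character as plain text.

The character that does this is the backslash, "\". The backslash is itself special, so a literal backslash is written "\\". In basic regular expressions as implemented by GNU grep, { } + ? | ( ) are ordinary characters unless preceded by a backslash; in extended regular expressions they are special when left unescaped.

The shell has its own quoting rules, which the first four commands below illustrate before grep comes into play.

    [root@service99 regular]# echo "This cat is $4.99"    # double quotes do not protect $, so the shell expands $4 (the fourth positional parameter, unset here) to an empty string
    This cat is .99
    [root@service99 regular]# echo "This cat is \$4.99"   # escape $ with "\"
    This cat is $4.99
    [root@service99 regular]# echo 'This cat is \$4.99'   # single quotes keep everything literal, including the backslash
    This cat is \$4.99
    [root@service99 regular]# echo 'This cat is $4.99'
    This cat is $4.99
    [root@service99 regular]# cat price.txt
    This price is $4.99
    hello,world!
    $5.00
    #$#$
    This is "\".
    [root@service99 regular]# grep --color '\\' price.txt
    This is "\".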
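The sketch below searches for the literal text "$4.99" in two ways. The input lines are hypothetical. The first way escapes each metacharacter. The second uses grep -F, which treats the whole pattern as a fixed string.

```shell
#!/bin/sh
prices='This price is $4.99
This price is $4x99'

# 1) Escape the metacharacters: \$ is a literal dollar sign, \. a literal dot.
escaped=$(printf '%s\n' "$prices" | grep '\$4\.99')

# 2) -F (fixed strings) switches regular expression interpretation off.
fixed=$(printf '%s\n' "$prices" | grep -F '$4.99')

# Without escaping the dot, "." matches any character, so both lines match.
loose=$(printf '%s\n' "$prices" | grep -c '\$4.99')

echo "$escaped"
echo "$fixed"
echo "lines matched by the unescaped dot: $loose"
```

The patterns are single-quoted so that the shell passes the dollar sign and backslashes to grep untouched.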

Anchors

Starting at the beginning

The caret (^) anchors the pattern to the start of a line of text in the data stream.

    [root@service99 regular]# grep --color '^h' price.txt    # lines that start with the letter h
    hello,world!
    [root@service99 regular]# grep --color '^$' price.txt    # no output: the unescaped $ is the end-of-line anchor, so ^$ matches only empty lines, and the file has none
    [root@service99 regular]# grep --color '^\$' price.txt   # lines that start with a literal $
    $5.00
    [root@service99 regular]# echo "This is ^ test. " >> price.txt
    [root@service99 regular]# cat price.txt
    This price is $4.99
    hello,world!
    $5.00
    #$#$
    This is "\".
    This is ^ test.
    [root@service99 regular]# grep --color '^' price.txt     # a bare ^ matches the start of every line, so everything is printed
    This price is $4.99
    hello,world!
    $5.00
    #$#$
    This is "\".
    This is ^ test.
    [root@service99 regular]# grep --color '\^' price.txt    # a literal ^ on its own at the start of the pattern must be escaped
    This is ^ test.
    [root@service99 regular]# grep --color 'is ^' price.txt  # when ^ is not at the start of a basic regular expression it needs no escape
    This is ^ test.

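Finding the end

The dollar sign ($) is the end anchor: placed after a text pattern, it requires the line to end with that pattern.

    [root@service99 regular]# grep --color '\.$' price.txt    # "." is also special in regular expressions, so escape it (details below)
    This is "\".
    [root@service99 regular]# grep --color '\. $' price.txt   # matches only because the line appended earlier ends with a space; a space counts as a character in a regular expression, so watch for stray ones
    This is ^ test.
    [root@service99 regular]# grep --color '0$' price.txt
    $5.00
    [root@service99 regular]# grep --color '9$' price.txt
    This price is $4.99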
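Both anchors can be tried side by side with a short sketch; the four input lines are hypothetical.

```shell
#!/bin/sh
data='hello,world!
say hello
$5.00
total: 5.00'

# "hello" must sit at the start of the line.
starts=$(printf '%s\n' "$data" | grep '^hello')

# "hello" must sit at the end of the line.
ends=$(printf '%s\n' "$data" | grep 'hello$')

# A literal "$" as the first character: escape it so it is not read as the end anchor.
dollar=$(printf '%s\n' "$data" | grep '^\$')

echo "$starts"
echo "$ends"
echo "$dollar"
```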

Combining anchors

The most common combination is "^$", which matches an empty line.

Combine it with "^#", since # marks a comment line in most Linux configuration files, to print only the effective settings of a file.

    [root@service99 regular]# cat -n /etc/vsftpd/vsftpd.conf | wc -l
    121
    [root@service99 regular]# grep -vE '^#|^$' /etc/vsftpd/vsftpd.conf   # -v inverts the match; -E enables extended regular expressions; "|" is an extended operator, covered further down
    anonymous_enable=YES
    local_enable=YES
    write_enable=YES
    local_umask=022
    anon_upload_enable=YES
    anon_mkdir_write_enable=YES
    anon_other_write_enable=YES
    anon_umask=022
    dirmessage_enable=YES
    xferlog_enable=YES
    connect_from_port_20=YES
    xferlog_std_format=YES
    listen=YES
    pam_service_name=vsftpd
    userlist_enable=YES
    tcp_wrappers=YES
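The same filter can be tried without a vsftpd installation. This sketch writes a small hypothetical configuration file to a temporary path and strips its comment and empty lines.

```shell
#!/bin/sh
# Build a throwaway config file (contents are made up for illustration).
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Example config
listen=YES

# another comment
local_enable=YES
EOF

# -v inverts the match: drop lines that start with # or are empty.
active=$(grep -Ev '^#|^$' "$conf")

echo "$active"
rm -f "$conf"
```

If the file may contain indented comments or lines holding only whitespace, a pattern such as '^[[:space:]]*(#|$)' is one way to catch those as well.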

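Repetition counts

{n,m}: the preceding character appears n to m times.

{n,}: the preceding character appears n or more times.

{n}: the preceding character appears exactly n times.

In a basic regular expression the braces are written escaped, as \{ and \}.

    [root@service99 regular]# grep --color "12345\{0,1\}" price.txt
    1234556
    [root@service99 regular]# grep --color "12345\{0,2\}" price.txt
    1234556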
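Both commands above print the same line, so the effect of the interval is hard to see. As a sketch, the -o option (supported by GNU and BSD grep, though not required by POSIX) prints only the matched part:

```shell
#!/bin/sh
# "5" may appear 0 or 1 times after "1234".
one=$(echo '1234556' | grep -o '12345\{0,1\}')

# "5" may appear 0 to 2 times after "1234".
two=$(echo '1234556' | grep -o '12345\{0,2\}')

# Exactly two consecutive "5"s.
exact=$(echo '1234556' | grep -o '5\{2\}')

echo "$one"
echo "$two"
echo "$exact"
```

The first match should stop after one "5", while the second extends over both.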

The dot character

The dot matches any single character except a newline, but it must match a character: if there is no character at the dot's position, the pattern fails.

    [root@service99 regular]# grep --color ".s" price.txt
    This price is $4.99
    This is "\".
    This is ^ test.
    [root@service99 regular]# grep --color ".or" price.txt
    hello,world!
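A small sketch of the "must match a character" rule, using three hypothetical input lines:

```shell
#!/bin/sh
words='or
for
world'

# ".or" needs some character in front of "or", so the bare line "or" is skipped.
hits=$(printf '%s\n' "$words" | grep '.or')

echo "$hits"
```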

Character classes

A character class defines a set of characters to match at one position in the text pattern. If any one character of the class appears at that position in the data stream, the pattern matches. A character class is written with square brackets: enclose all the characters of the class in brackets, then use the whole class in the pattern like any other wildcard.

    [root@service99 regular]# grep --color "[abcdsxyz]" price.txt
    This price is $4.99
    hello,world!
    This is "\".
    This is ^ test.
    [root@service99 regular]# grep --color "[sxyz]" price.txt
    This price is $4.99
    This is "\".
    This is ^ test.
    [root@service99 regular]# grep --color "[abcd]" price.txt
    This price is $4.99
    hello,world!
    [root@service99 regular]# grep --color "Th[ais]" price.txt      # the character right after "Th" must be one of a, i, s
    This price is $4.99
    This is "\".
    This is ^ test.
    [root@service99 regular]# grep -i --color "th[ais]" price.txt   # -i ignores case
    This price is $4.99
    This is "\".
    This is ^ test.

When you are not sure about the case of a character, use a pattern like this:

    [root@service99 regular]# echo "Yes" | grep --color "[yY]es"   # the order of characters inside [] does not matter
    Yes
    [root@service99 regular]# echo "yes" | grep --color "[Yy]es"
    yes

A single expression can use several character classes:

    [root@service99 regular]# echo "Yes/no" | grep "[Yy][Ee]"
    Yes/no
    [root@service99 regular]# echo "Yes/no" | grep "[Yy].*[Nn]"   # the use of * in regular expressions is covered further down
    Yes/no

Character classes work for digits as well:

    [root@service99 regular]# echo "My phone number is 123456987" | grep --color "is [1234]"
    My phone number is 123456987
    [root@service99 regular]# echo "This is Phone1" | grep --color "e[1234]"
    This is Phone1
    [root@service99 regular]# echo "This is Phone1" | grep --color "[1]"
    This is Phone1

Another very common use of character classes is to catch words that may be misspelled:

    [root@service99 regular]# echo "regular" | grep --color "r[ea]g[ua]l[ao]"
    regular
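As a sketch of how each bracket pair constrains exactly one position, the following compares one character class against three; the input words are made up.

```shell
#!/bin/sh
answers='Yes
yes
YES
no'

# Only the first letter may vary in case.
first=$(printf '%s\n' "$answers" | grep '[Yy]es')

# Every letter may vary in case.
any=$(printf '%s\n' "$answers" | grep '[Yy][Ee][Ss]')

printf 'first letter only:\n%s\n' "$first"
printf 'every letter:\n%s\n' "$any"
```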

Negated character classes

To find characters that are not in a character class, put a caret (^) at the start of the class.

Even when negated, the character class must still match one character.

    [root@service99 regular]# cat price.txt
    This price is $4.99
    hello,world!
    $5.00
    #$#$
    This is "\".
    this is ^ test.
    cat
    car
    [root@service99 regular]# sed -n '/[^t]his/p' price.txt
    This price is $4.99
    This is "\".
    [root@service99 regular]# grep --color "[^t]his" price.txt
    This price is $4.99
    This is "\".
    [root@service99 regular]# grep --color "ca[tr]" price.txt
    cat
    car
    [root@service99 regular]# grep --color "ca[^r]" price.txt
    cat
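The rule that a negated class still has to consume a character is easy to overlook. In this sketch (hypothetical input), the line "ca" has nothing after the "a", so it does not match even though it contains no "r":

```shell
#!/bin/sh
words='cat
car
ca'

# "ca" followed by any one character that is not "r".
not_r=$(printf '%s\n' "$words" | grep 'ca[^r]')

echo "$not_r"
```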

Using ranges

When the characters to match are numerous and follow a regular order, use a range:

    [root@service99 regular]# cat price.txt
    This price is $4.99
    hello,world!
    $5.00
    #$#$
    This is "\".
    this is ^ test.
    cat
    car
    1234556
    911
    11806
    [root@service99 regular]# egrep --color '[a-z]' price.txt
    This price is $4.99
    hello,world!
    This is "\".
    this is ^ test.
    cat
    car
    [root@service99 regular]# egrep --color '[A-Z]' price.txt
    This price is $4.99
    This is "\".
    [root@service99 regular]# grep --color "[0-9]" price.txt
    This price is $4.99
    $5.00
    1234556
    911
    11806
    [root@service99 regular]# sed -n '/^[^a-Z]/p' price.txt
    $5.00
    #$#$
    1234556
    911
    11806
    [root@service99 regular]# grep --color "^[^a-Z]" price.txt
    $5.00
    #$#$
    1234556
    911
    11806
    [root@service99 regular]# echo $LANG   # [a-Z] depends on the locale, so check LANG when using it; if you change LANG, make sure the new value is a valid locale
    zh_CN.UTF-8
    [root@service99 regular]# LANG=en_US.UTF-8

Because the meaning of [a-Z] varies with the locale (in the C locale, for example, GNU grep typically rejects it as an invalid range), [a-zA-Z] or [[:alpha:]] is the more portable way to say "any letter".
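A locale-independent way to write the "does not start with a letter" search is sketched below. It pins LC_ALL=C for the grep call only and uses [a-zA-Z] or [[:alpha:]] instead of [a-Z]; the input lines are hypothetical.

```shell
#!/bin/sh
data='This
hello
$5.00
911'

# LC_ALL=C makes ranges follow plain byte order, independent of LANG.
non_alpha=$(printf '%s\n' "$data" | LC_ALL=C grep '^[^a-zA-Z]')

# The POSIX class [:alpha:] expresses the same idea without a range.
posix=$(printf '%s\n' "$data" | LC_ALL=C grep '^[^[:alpha:]]')

echo "$non_alpha"
echo "$posix"
```

Setting the variable in front of the command changes the locale for that one command and leaves the rest of the session untouched.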

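Special character classes

These match specific types of characters.

[[:blank:]] space and tab characters.

[[:cntrl:]] control characters.

[[:graph:]] visible characters, that is, printable characters other than the space.

[[:space:]] all whitespace characters.

[[:print:]] printable characters, including the space.

[[:xdigit:]] hexadecimal digits.

[[:punct:]] all punctuation characters.

[[:lower:]] lowercase letters.

[[:upper:]] uppercase letters.

[[:alpha:]] uppercase and lowercase letters.

[[:digit:]] digits.

[[:alnum:]] digits and uppercase and lowercase letters.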
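A few of these classes in use, as a minimal sketch with made-up input. Note the double brackets: the outer pair opens a bracket expression and the inner [:name:] selects the class.

```shell
#!/bin/sh
data='abc
ABC
123
a b
x,y'

upper=$(printf '%s\n' "$data" | grep '[[:upper:]]')   # lines containing an uppercase letter
digit=$(printf '%s\n' "$data" | grep '[[:digit:]]')   # lines containing a digit
blank=$(printf '%s\n' "$data" | grep '[[:blank:]]')   # lines containing a space or tab
punct=$(printf '%s\n' "$data" | grep '[[:punct:]]')   # lines containing punctuation

echo "$upper"
echo "$digit"
echo "$blank"
echo "$punct"
```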

The asterisk

An asterisk after a character means that the character may appear zero or more times in the text that matches the pattern.

    [root@service99 regular]# cat test.info
    goole
    go go go
    come on
    goooooooooo
    [root@service99 regular]# grep --color "o*" test.info    # zero occurrences also count as a match, so every line is printed
    goole
    go go go
    come on
    goooooooooo
    [root@service99 regular]# grep --color "go*" test.info
    goole
    go go go
    goooooooooo
    [root@service99 regular]# grep --color "w.*d" price.txt   # often used together with "."
    hello,world!
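Because "zero times" is allowed, a pattern made only of a starred character matches every line. The sketch below counts matching lines for three patterns; the input is hypothetical.

```shell
#!/bin/sh
lines='goole
come on
xyz'

# "o*" also matches the empty string, so all three lines count.
all=$(printf '%s\n' "$lines" | grep -c 'o*')

# "go*" needs a "g", followed by zero or more "o"s.
go=$(printf '%s\n' "$lines" | grep -c 'go*')

# "oo*" is the usual BRE idiom for "at least one o".
oo=$(printf '%s\n' "$lines" | grep -c 'oo*')

echo "o*:  $all"
echo "go*: $go"
echo "oo*: $oo"
```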

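Extended regular expressions

The question mark

The question mark means the preceding character may be absent or appear once. The question mark itself never consumes more than one occurrence, although a line with repeated characters can still match because grep only needs some part of the line to fit the pattern.

    [root@service99 regular]# egrep --color "91?" price.txt
    This price is $4.99
    911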
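The "at most once" limit becomes visible when a character follows the optional one. A minimal sketch with hypothetical words, using grep -E for extended syntax:

```shell
#!/bin/sh
words='color
colour
colouur'

# "u?" allows zero or one "u" between "colo" and "r"; two "u"s do not fit.
hits=$(printf '%s\n' "$words" | grep -E 'colou?r')

echo "$hits"
```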

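The plus sign

The plus sign means the preceding character may appear one or more times, but it must appear at least once; if the character is missing, the pattern does not match.

    [root@service99 regular]# egrep --color "9+" price.txt
    This price is $4.99
    911
    [root@service99 regular]# egrep --color "1+" price.txt
    1234556
    911
    11806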
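The difference between the plus sign and the asterisk is the zero-occurrence case, as this sketch with made-up words shows:

```shell
#!/bin/sh
words='gd
god
good'

# "o+" requires at least one "o", so "gd" is excluded.
plus=$(printf '%s\n' "$words" | grep -E 'go+d')

# "o*" also accepts zero "o"s, so "gd" is included.
star=$(printf '%s\n' "$words" | grep -E 'go*d')

printf 'go+d:\n%s\n' "$plus"
printf 'go*d:\n%s\n' "$star"
```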

Using braces

Braces put a limit on how often a regular expression may repeat; this is usually called an interval.

- {m}: the expression occurs exactly m times.

- {m,n}: the expression occurs at least m and at most n times.

    [root@service99 regular]# echo "This is test,test is file." | egrep --color "test{0,1}"
    This is test,test is file.
    [root@service99 regular]# echo "This is test,test is file." | egrep --color "is{1,2}"
    This is test,test is file.
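The sketch below applies both interval forms to hypothetical words and repeats the second search in basic syntax, where the braces must be escaped.

```shell
#!/bin/sh
words='gd
god
good
goood'

# Exactly two "o"s between g and d.
exact=$(printf '%s\n' "$words" | grep -E 'go{2}d')

# One or two "o"s between g and d.
range=$(printf '%s\n' "$words" | grep -E 'go{1,2}d')

# The same interval in a basic regular expression.
bre=$(printf '%s\n' "$words" | grep 'go\{1,2\}d')

echo "$exact"
echo "$range"
echo "$bre"
```

The d after the interval is what makes the upper bound visible; without a following character, a longer run of "o"s would still contain a shorter matching run.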

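Regular expression exercises

Here is a set of exercises for practicing basic regular expressions. The concepts and theory look simple enough on paper, but applying them in practice is harder; once mastered, though, they give a substantial boost in efficiency.

1. Find the lines in the downloaded file that contain the keyword "the".

    grep --color "the" regular_express.txt

2. Find the lines in the downloaded file that do not contain the keyword "the".

    grep --color -vn "the" regular_express.txt

3. Find the keyword "the" in the downloaded file regardless of case.

    grep --color -in "the" regular_express.txt

4. Find either of the words "test" or "taste".

    grep --color -En 'test|taste' regular_express.txt
    grep --color -i "t[ae]ste\{0,1\}" regular_express.txt   # looser: also matches "tast" and "teste", and ignores case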

5. Find the lines that contain "oo".

    grep --color "oo" regular_express.txt

6. Find "oo" that is not preceded by "g".

    grep --color "[^g]oo" regular_express.txt

7. Find "oo" that is not preceded by a lowercase letter.

    egrep --color "[^a-z]oo" regular_express.txt

8. Find the lines that contain a digit.

    egrep --color '[0-9]' regular_express.txt

9. Find the lines that start with "the".

    egrep --color '^the' regular_express.txt

10. Find the lines that start with a lowercase letter.

    egrep --color '^[a-z]' regular_express.txt

11. Find the lines that do not start with an English letter.

    egrep --color '^[^a-zA-Z]' regular_express.txt   # [a-zA-Z] instead of the locale-dependent [a-Z]

12. Find the lines that end with a period (.).

    egrep --color '\.$' regular_express.txt

13. Find the empty lines.

    egrep --color '^$' regular_express.txt

14. Find strings of the form g??d (g, any two characters, then d).

    egrep --color "g..d" regular_express.txt

15. Find strings with at least two consecutive "o"s.

    egrep --color "ooo*" regular_express.txt
    egrep --color 'o{2,}' regular_express.txt

16. Find strings that start and end with "g" and have nothing but one or more "o"s in between.

    egrep --color 'go{1,}g' regular_express.txt

17. Find the lines that contain any digit.

    egrep --color '[0-9]' regular_express.txt

18. Find strings with two "o"s.

    egrep --color "oo" regular_express.txt

19. Find strings where "g" is followed by 2 to 5 "o"s and then another "g".

    egrep --color 'go{2,5}g' regular_express.txt

20. Find strings where "g" is followed by 2 or more "o"s.

    egrep --color 'go{2,}' regular_express.txt
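For readers who do not have regular_express.txt at hand, a few of the exercises can be tried with the following sketch. The sample file contents and the count helper are assumptions made for illustration; they are not the original download.

```shell
#!/bin/sh
# Hypothetical stand-in for regular_express.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
the first line.
Open Source is a good mechanism
google is the best tools

I can't finish the test
Oh! The soup taste good
goooogle yes
2 gooogle
EOF

# Helper (hypothetical): number of lines matching an extended regular expression.
count() { grep -E -c -- "$1" "$sample"; }

n_the=$(count 'the')            # exercise 1: lines containing "the"
n_the_i=$(grep -ci 'the' "$sample")   # exercise 3: ignoring case
n_words=$(count 'test|taste')   # exercise 4: "test" or "taste"
n_dot=$(count '\.$')            # exercise 12: lines ending with a period
n_empty=$(count '^$')           # exercise 13: empty lines
n_goog=$(count 'go{2,5}g')      # exercise 19: g, 2 to 5 o's, g

echo "the=$n_the the(-i)=$n_the_i test|taste=$n_words"
echo "period=$n_dot empty=$n_empty go{2,5}g=$n_goog"
rm -f "$sample"
```

Patterns are single-quoted throughout so that the shell does not expand characters such as $, [ or { before grep sees them.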

That is all for this article. I hope it helps with your studies.

Original link: http://blog.csdn.net/ll845876425/article/details/53958083
