
regex - How do I use a regex with the diff utility (the "-I regex" option, to ignore specific lines in a file)?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:49:19

I am writing automated tests that compare HTML files. For the comparison I use the Linux diff utility.

The first HTML file, 1.html:

<!-- just example -->
<html>
<div id="userdata_hidden">bla bla bla</div>
<div id="something else" >bla bla bla</div>
<div id="waiver_id" >bla bla bla</div>
<html>

The second HTML file, 2.html:

<!-- just example -->
<html>
<div id="userdata_hidden">bla bla bla DIFFERENCE </div>
<div id="something else" >bla bla bla</div>
<div id="waiver_id" >bla bla bla DIFFERENCE </div>
<html>

The command used to compare the files:

diff -biw 1.html 2.html

Result:

3c3
< <div id="userdata_hidden">bla bla bla</div>
---
> <div id="userdata_hidden">bla bla bla DIFFERENCE </div>
5c5
< <div id="waiver_id" >bla bla bla</div>
---
> <div id="waiver_id" >bla bla bla DIFFERENCE </div>

The comparison works, but I need to ignore differences on lines that contain particular words: waiver_id and userdata_hidden.

The diff command has an -I option that ignores changes on lines matching a regular expression:

To ignore insertions and deletions of lines that match a grep-style regular expression, use the --ignore-matching-lines=regexp (-I regexp) option. You should escape regular expressions that contain shell metacharacters to prevent the shell from expanding them. For example, ‘diff -I '^[[:digit:]]'’ ignores all changes to lines beginning with a digit.

However, -I only ignores the insertion or deletion of lines that contain the regular expression if every changed line in the hunk—every insertion and every deletion—matches the regular expression. In other words, for each nonignorable change, diff prints the complete set of changes in its vicinity, including the ignorable ones.

You can specify more than one regular expression for lines to ignore by using more than one -I option. diff tries to match each line against each regular expression.

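So I should be able to use a regular expression to make diff skip the lines containing waiver_id and userdata_hidden. When the files do not differ, diff prints nothing to the console (empty output).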
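As a minimal sketch, assuming GNU diff: because -I takes a pattern for the lines to ignore (rather than for the lines to keep), one option is to pass each word as its own -I pattern. The scratch directory and the `out` variable below are only illustrative and not part of the original question.

```shell
# Recreate the two example files in a scratch directory (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > 1.html <<'EOF'
<!-- just example -->
<html>
<div id="userdata_hidden">bla bla bla</div>
<div id="something else" >bla bla bla</div>
<div id="waiver_id" >bla bla bla</div>
<html>
EOF

cat > 2.html <<'EOF'
<!-- just example -->
<html>
<div id="userdata_hidden">bla bla bla DIFFERENCE </div>
<div id="something else" >bla bla bla</div>
<div id="waiver_id" >bla bla bla DIFFERENCE </div>
<html>
EOF

# Each -I gives a pattern for lines whose changes may be ignored.
# Assumes GNU diff; "|| true" keeps the script going when diff reports differences.
out=$(diff -biw -I 'userdata_hidden' -I 'waiver_id' 1.html 2.html || true)

if [ -z "$out" ]; then
  echo "no relevant differences"
else
  printf '%s\n' "$out"
fi
```

Here the two changed lines sit in separate hunks, and every changed line in each hunk matches one of the patterns, so both hunks are suppressed. Per the rule quoted above, if a line that matters had changed right next to one of them, diff would print the whole hunk, ignorable lines included.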

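Questions:

  1. How do I write a regular expression that excludes strings containing the word waiver_id or userdata_hidden?

  2. What is the correct form of the diff command when using the -I option with a regular expression?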

P.S. Unfortunately, this variant does not work:

diff -biw -I '^(?!.*(?:userdata_hidden|waiver_id))' 1.html 2.html

Best answer

I need to check that string does not contain words waiver_id and userdata_hidden.

^(?!.*\bwaiver_id\b)(?!.*\buserdata_hidden\b)

If you do not want either of the strings to be present:

^(?!.*\b(?:userdata_hidden|waiver_id)\b)
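To see which lines this pattern selects, it can be tried with grep -P, as a rough sketch that assumes GNU grep built with PCRE support. The manual text quoted in the question describes -I as taking a grep-style regular expression, and that syntax does not include Perl-style lookaheads such as (?!...), which would be consistent with the variant in the question not working. The file name sample.html and the `matched` variable are made up for this demo.

```shell
# Build a small sample file in a scratch directory (illustrative setup).
workdir=$(mktemp -d)
cat > "$workdir/sample.html" <<'EOF'
<div id="userdata_hidden">bla bla bla</div>
<div id="something else" >bla bla bla</div>
<div id="waiver_id" >bla bla bla</div>
EOF

# -P switches grep to Perl-compatible regexes, which support (?!...) lookaheads.
# The pattern matches only lines that contain neither of the two words.
matched=$(grep -P '^(?!.*\b(?:userdata_hidden|waiver_id)\b)' "$workdir/sample.html" || true)

# Prints only the "something else" line.
printf '%s\n' "$matched"
```

The pattern therefore selects the lines to keep, which is the opposite of what -I expects: -I wants a pattern for the lines to ignore.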


Regarding "regex - How do I use a regex with the diff utility (the "-I regex" option, to ignore specific lines in a file)?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31695082/
