
macos - Why is fgrep behaving strangely?

Reposted. Author: 行者123. Updated: 2023-12-02 09:02:48

I am trying to use grep to pull the lines of file2 that match the words listed in file1, and to redirect them to an output file.

I have file1:

Acetoanaerobium sticklandii
Acetobacter pasteurianus

and so on, for more than 1000 species.

file2:

>WP_013360383.1 ATP-dependent Clp protease ATP-binding subunit [Acetoanaerobium sticklandii]
>WP_013360396.1 ATP-dependent Clp protease ATP-binding subunit [Acetoanaerobium sticklandii]
>WP_003623694.1 multidrug efflux RND transporter permease subunit [Acetobacter pasteurianus]
>WP_003624003.1 superoxide dismutase [Acetobacter pasteurianus]
>WP_003624029.1 UDP-galactopyranose mutase [Acetobacter pasteurianus]
>WP_003624540.1 mannose-1-phosphate guanylyltransferase/mannose-6-phosphate isomerase [Acetobacter pasteurianus]
>WP_077905956.1 effector protein [Salmonella enterica]
>WP_077905962.1 type III secretion system YopJ family effector AvrA [Salmonella enterica]
>WP_005544680.1 3-deoxy-8-phosphooctulonate synthase [Aggregatibacter actinomycetemcomitans]
>WP_005545812.1 MFS transporter [Aggregatibacter actinomycetemcomitans]
>WP_005546163.1 UTP--glucose-1-phosphate uridylyltransferase GalU [Aggregatibacter actinomycetemcomitans]

and so on.

When I run the commands

grep -f file1 file2 > output 

fgrep -f file1 file2 > output

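the output contains only the lines matching the last line of file1; grep ignores the rest of the input list. The output is the same even with the -w option.

Why does grep behave this way? Is something wrong with my shell? I am using a MacBook running macOS Mojave.

Please suggest an equivalent awk command.

I tried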

awk 'NR==FNR{a[$0];next}$NF in a{print}' file1 file2 > output

but the result is an empty file.

Best answer

First solution: a more generic and faster solution than my second one. It looks up the string between [ and ] in file2, without relying on a hardcoded field number.

awk '
{ gsub(/\r/,"") }
FNR==NR{
  array[$0]
  next
}
match($0,/\[[^]]*/){
  val=substr($0,RSTART+1,RLENGTH-1)
  if(val in array){ print }
}
' file1 file2

Explanation: the same program with detailed comments.

awk '                                     ##Start of the awk program.
{ gsub(/\r/,"") }                         ##Strip carriage returns from every input line.
FNR==NR{                                  ##FNR==NR is true only while file1 is being read.
  array[$0]                               ##Store the current line as an index of array.
  next                                    ##Skip the remaining rules for this line.
}
match($0,/\[[^]]*/){                      ##Match from [ up to, but not including, the next ].
  val=substr($0,RSTART+1,RLENGTH-1)       ##val is the matched text without the leading [.
  if(val in array){ print }               ##Print the line only if val was listed in file1.
}
' file1 file2                             ##Input file names.
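
The gsub(/\r/,"") rule hints at a likely cause of the grep behaviour. If file1 was saved with Windows-style CRLF line endings, every pattern whose line ends in a carriage return carries an invisible trailing \r and cannot match file2, which would leave only a final line without one. The question does not confirm this, so the following is a sketch under that assumption; the demo_file names and their contents are made up for illustration.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# Hypothetical demo data: the first pattern line ends in CRLF, the last in LF only.
printf 'Acetoanaerobium sticklandii\r\nAcetobacter pasteurianus\n' > demo_file1
printf '>WP_1 protein [Acetoanaerobium sticklandii]\n>WP_2 protein [Acetobacter pasteurianus]\n' > demo_file2

# Count the pattern lines that contain a carriage return.
cr=$(printf '\r')
cr_lines=$(grep -c "$cr" demo_file1)
echo "lines with CR: $cr_lines"

# With the CR in place, only the last pattern can match.
before=$(grep -F -f demo_file1 demo_file2)

# Strip the carriage returns and match again.
tr -d '\r' < demo_file1 > demo_file1.clean
after=$(grep -F -f demo_file1.clean demo_file2)

printf 'before:\n%s\nafter:\n%s\n' "$before" "$after"
```

If the count comes back non-zero for the real file1, cleaning it with tr -d '\r' before running grep -F -f should let every pattern match.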


Second solution: please try the following, written and tested with the samples shown. It is more general than matching on a fixed field, since no field number is hardcoded: it works wherever the name appears in the line.

awk '{ gsub(/\r/,"") } FNR==NR{array[$0];next} {for(i in array){if(index($0,i)){print;next}}}' file1 file2

Explanation: the same program with detailed comments.

awk '                    ##Start of the awk program.
{ gsub(/\r/,"") }        ##Strip carriage returns from every input line.
FNR==NR{                 ##FNR==NR is true only while file1 is being read.
  array[$0]              ##Store the current line as an index of array.
  next                   ##Skip the remaining rules for this line.
}
{
  for(i in array){       ##Loop over every name read from file1.
    if(index($0,i)){     ##index does a fixed-string search, like fgrep, for the name in the line.
      print              ##Print the current line.
      next               ##Move on to the next line after the first hit.
    }
  }
}
' file1 file2            ##Input file names.
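
As a side note on the attempt in the question: $NF is only the last whitespace-separated field, so for these lines it holds something like pasteurianus] and not the full species name, which on its own is enough to make the lookup fail. A minimal sketch of a field-based variant follows, assuming each file2 line has exactly one [...] pair at the end; the demo_file names, their contents, and the array name seen are illustrative only.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# Hypothetical demo data, including a CRLF ending in the pattern file.
printf 'Acetobacter pasteurianus\r\n' > demo_file1
printf '>WP_003624003.1 superoxide dismutase [Acetobacter pasteurianus]\n>WP_077905956.1 effector protein [Salmonella enterica]\n' > demo_file2

# With default field splitting, $NF is just the last word plus the bracket.
last_field=$(awk 'NR==1 { print $NF }' demo_file2)
echo "$last_field"    # pasteurianus]

# Split on [ and ] instead, so that $2 is the whole species name.
matched=$(awk -F'[][]' '
  { gsub(/\r/,"") }              # strip carriage returns first
  NR==FNR { seen[$0]; next }     # first file: remember each species name
  ($2 in seen)                   # second file: print when the bracketed name is known
' demo_file1 demo_file2)
echo "$matched"
```

This variant would break on lines whose description contains additional brackets, which is one reason the match-based first solution is the safer choice.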

Regarding "macos - Why is fgrep behaving strangely?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62086866/
