
linux - Keep unique IDs based on a column's value?

Reposted. Author: 行者123. Updated: 2023-12-03 09:58:45

I have a tab-delimited file:

GH76.hmm - 358 VENTURIA_I_00885.t1 - 411 7.50E-83 273.9 26.1 1 1 7.80E-85 2.30E-82 272.3 26.1 15 354 24 406 21 410 0.87
GH105.hmm - 332 VENTURIA_I_00885.t1 - 411 7.80E-10 33.7 5.3 1 2 8.80E-07 0.00026 15.5 1.9 63 153 159 250 131 260 0.78
GH105.hmm - 332 VENTURIA_I_00885.t1 - 411 7.80E-10 33.7 5.3 2 2 2.70E-07 7.90E-05 17.2 0.1 12 104 275 378 268 383 0.73
AA3_2.hmm - 570 VENTURIA_I_04612.t1 - 614 2.80E-98 324.9 0 1 1 3.70E-100 3.60E-98 324.5 0 2 566 34 608 33 610 0.87
AA3.hmm - 618 VENTURIA_I_04612.t1 - 614 7.50E-91 300.5 0 1 1 9.70E-93 9.50E-91 300.1 0 81 398 28 609 22 613 0.86
AA3_3.hmm - 591 VENTURIA_I_04612.t1 - 614 2.30E-57 189.7 0 1 2 5.00E-49 4.90E-47 155.6 0 3 463 36 508 34 515 0.81
AA3_3.hmm - 591 VENTURIA_I_04612.t1 - 614 2.30E-57 189.7 0 2 2 3.40E-11 3.30E-09 30.7 0 511 583 531 604 525 611 0.87

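For each ID in column 4, I want to keep a single row: the one with the smallest e-value in column 7. I tried the following command, but it produces no output: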
$cat ./file2 | sed '/#/d'| sed '/\n/d' | \
awk -F'[\t]' '$7 > smallest[$4] { smallest[$7]=$4; line[$1] = $0 };END { for (id in smallest) { print line[id] }}'

The output should look like this:
GH76.hmm - 358 VENTURIA_I_00885.t1 - 411 7.50E-83 273.9 26.1 1 1 7.80E-85 2.30E-82 272.3 26.1 15 354 24 406 21 410 0.87
AA3_2.hmm - 570 VENTURIA_I_04612.t1 - 614 2.80E-98 324.9 0 1 1 3.70E-100 3.60E-98 324.5 0 2 566 34 608 33 610 0.87

Thank you.

Best answer

Could you please try the following? It was tested with the samples shown.

awk '
{
  val=sprintf("%.100f",$7)
  a[$4]=a[$4]<val?a[$4]?a[$4]:val:val
  b[$4,val]=$0
}
END{
  for(i in a){
    print b[i,a[i]]
  }
}
' Input_file
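
As a cross-check on the script above, the same selection can probably be sketched with `sort` followed by a short awk filter. This is only an illustration under assumptions: it relies on the `g` (general numeric) key option, which GNU and BSD `sort` provide but POSIX does not require, and the temporary file and the variable names `sample`, `tab`, and `picked` are made up for the example. The rows are a subset of the sample from the question.

```shell
# Scratch copy of four sample rows; tr turns the spaces shown here into tabs.
sample=$(mktemp)
tr ' ' '\t' > "$sample" <<'EOF'
GH76.hmm - 358 VENTURIA_I_00885.t1 - 411 7.50E-83 273.9 26.1 1 1 7.80E-85 2.30E-82 272.3 26.1 15 354 24 406 21 410 0.87
GH105.hmm - 332 VENTURIA_I_00885.t1 - 411 7.80E-10 33.7 5.3 1 2 8.80E-07 0.00026 15.5 1.9 63 153 159 250 131 260 0.78
AA3.hmm - 618 VENTURIA_I_04612.t1 - 614 7.50E-91 300.5 0 1 1 9.70E-93 9.50E-91 300.1 0 81 398 28 609 22 613 0.86
AA3_2.hmm - 570 VENTURIA_I_04612.t1 - 614 2.80E-98 324.9 0 1 1 3.70E-100 3.60E-98 324.5 0 2 566 34 608 33 610 0.87
EOF

tab=$(printf '\t')

# Sort by ID (field 4), then by e-value (field 7) as a general number,
# so the smallest e-value comes first within each ID.
# The awk filter then keeps only the first row seen for each ID.
picked=$(LC_ALL=C sort -t "$tab" -k4,4 -k7,7g "$sample" | awk -F'\t' '!seen[$4]++')

printf '%s\n' "$picked"
rm -f "$sample"
```

Unlike the awk-only script, this prints the rows in sorted ID order rather than in an unspecified order.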

Explanation:
awk '                                  ##Start the awk program.
{                                      ##Main block, run once for every input line.
val=sprintf("%.100f",$7)               ##Format field 7 as a fixed-point string with 100 decimal places and save it in val, so that values can be compared.
a[$4]=a[$4]<val?a[$4]?a[$4]:val:val    ##a is an array indexed by $4: if a[$4] is already set and smaller than val, keep it; otherwise store the current val.
b[$4,val]=$0                           ##b is an array indexed by $4 and val; it holds the whole current line.
}                                      ##End of the main block.
END{                                   ##END block, run after all input has been read.
for(i in a){                           ##Loop over every index (ID) stored in array a.
print b[i,a[i]]                        ##Print the line saved in b for ID i and its smallest value a[i].
}
}
' Input_file                           ##Input_file is the name of the file to read.
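
Because `sprintf` returns a string, the `<` test in the script above compares text rather than numbers. That works for the sample, where every e-value formats to the same width, but it would presumably misorder values of 10 or more and collapse anything below 1e-100 to zero. A minimal sketch of a numeric variant follows. It is not the original answer's method: it assumes the file really is tab-delimited, and the temporary file and the names `best`, `row`, `order`, and `result` are made up for the illustration.

```shell
# Scratch copy of the sample rows; tr turns the spaces shown here into tabs.
sample=$(mktemp)
tr ' ' '\t' > "$sample" <<'EOF'
GH76.hmm - 358 VENTURIA_I_00885.t1 - 411 7.50E-83 273.9 26.1 1 1 7.80E-85 2.30E-82 272.3 26.1 15 354 24 406 21 410 0.87
GH105.hmm - 332 VENTURIA_I_00885.t1 - 411 7.80E-10 33.7 5.3 1 2 8.80E-07 0.00026 15.5 1.9 63 153 159 250 131 260 0.78
GH105.hmm - 332 VENTURIA_I_00885.t1 - 411 7.80E-10 33.7 5.3 2 2 2.70E-07 7.90E-05 17.2 0.1 12 104 275 378 268 383 0.73
AA3_2.hmm - 570 VENTURIA_I_04612.t1 - 614 2.80E-98 324.9 0 1 1 3.70E-100 3.60E-98 324.5 0 2 566 34 608 33 610 0.87
AA3.hmm - 618 VENTURIA_I_04612.t1 - 614 7.50E-91 300.5 0 1 1 9.70E-93 9.50E-91 300.1 0 81 398 28 609 22 613 0.86
AA3_3.hmm - 591 VENTURIA_I_04612.t1 - 614 2.30E-57 189.7 0 1 2 5.00E-49 4.90E-47 155.6 0 3 463 36 508 34 515 0.81
AA3_3.hmm - 591 VENTURIA_I_04612.t1 - 614 2.30E-57 189.7 0 2 2 3.40E-11 3.30E-09 30.7 0 511 583 531 604 525 611 0.87
EOF

result=$(awk -F'\t' '
  {
    e = $7 + 0                               # adding 0 forces a numeric value
    if (!($4 in best) || e < best[$4]) {     # first row for this ID, or a smaller e-value
      if (!($4 in best)) order[++n] = $4     # remember IDs in first-seen order
      best[$4] = e                           # smallest e-value so far for this ID
      row[$4]  = $0                          # the row that carries it
    }
  }
  END { for (i = 1; i <= n; i++) print row[order[i]] }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

When two rows for the same ID tie on the e-value, this sketch keeps the first one it reads, and it prints IDs in the order they first appear in the file.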

Regarding "linux - Keep unique IDs based on a column's value?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59466816/
