python - Sort lines in a text file between patterns

Reposted. Author: 行者123. Updated: 2023-12-04 15:17:01

I am trying to sort the lines between patterns in Bash or Python. I want to sort the lines on the second field, using "," as the delimiter.

Given the following text input file:

Sample1
T1,64,0.65 MEDIUM
T2,60,0.45 LOW
T3,301,0.68 MEDIUM
T4,65,0.75 HIGH
T5,59,0.72 MEDIUM
T6,51,0.82 HIGH
Sample2
T1,153,0.77 HIGH
T2,152,0.61 MEDIUM
T3,154,0.67 MEDIUM
T4,283,0.66 MEDIUM
T5,161,0.65 MEDIUM
Sample3
T1,147,0.71 MEDIUM
T2,154,0.63 MEDIUM
T3,45,0.63 MEDIUM
T4,259,0.77 HIGH

I would like this as output:
Sample1
T6,51,0.82 HIGH
T5,59,0.72 MEDIUM
T2,60,0.45 LOW
T1,64,0.65 MEDIUM
T4,65,0.75 HIGH
T3,301,0.68 MEDIUM
Sample2
T2,152,0.61 MEDIUM
T1,153,0.77 HIGH
T3,154,0.67 MEDIUM
T5,161,0.65 MEDIUM
T4,283,0.66 MEDIUM
Sample3
T3,45,0.63 MEDIUM
T1,147,0.71 MEDIUM
T2,154,0.63 MEDIUM
T4,259,0.77 HIGH

I tried to adapt this suggestion by glenn jackman, which I found in another post, but as far as I have tested it only works with 2 patterns:
gawk -v cmd="sort -t, -k2,2n" -v p=1 '
/^PATTERN2/ {            # when we see the 2nd marker:
    close(cmd, "to")     # close the write end so sort can start sorting
    while ((cmd |& getline line) > 0) print line
    p = 1
}
p  { print }             # if p is true, print the line
!p { print |& cmd }      # if p is false, send the line to sort
/^PATTERN1/ { p = 0 }    # when we see the first marker, turn off printing
' FILE

Best answer

You can do this with GNU awk in the following way:

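$ awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_asc"; FS=","}
       /PATTERN/{                # header line, e.g. /^Sample/ for the input above
           for(i in a) print i   # print the buffered lines, ordered by their value ($2)
           delete a              # start a new block
           print; next           # print the header line itself
       }
       { a[$0]=$2 }              # key: the whole line, value: the second field
       END{ for(i in a) print i }' file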

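With PROCINFO["sorted_in"]="@val_num_asc", we tell GNU awk to traverse the array so that the element values appear in ascending numeric order. The idea is to build an array whose keys are the complete lines and whose values are the second field. We do not use the second field as the key, because it may contain duplicates. That can still be made to work, though, by appending lines that share a key to the same element: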
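$ awk 'BEGIN{PROCINFO["sorted_in"]="@ind_num_asc"; FS=","}   # keys are $2 here, so order by index
       /PATTERN/{                              # header line, e.g. /^Sample/ for the input above
           for(i in a) print a[i]              # print the buffered lines in key order
           delete a
           print; next
       }
       ($2 in a){ a[$2]=a[$2] ORS $0; next }   # duplicate key: append the line
       { a[$2] = $0 }                          # first line seen for this key
       END{ for(i in a) print a[i] }' file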
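
Since the question also allows Python, the same buffer-and-flush idea can be sketched there. This is only an illustrative sketch and not part of the original answer: the function name `sort_sections`, its parameters, and the `^Sample` header regex are assumptions chosen to match the sample input, and every non-header line is assumed to carry a numeric second field.

```python
import re


def sort_sections(lines, header=r"^Sample", sep=",", field=1):
    """Sort the lines between header lines numerically on one field.

    Hypothetical helper: `header` is a regex marking section headers,
    `field` is the zero-based index of the sort column.
    """
    is_header = re.compile(header).search
    out, block = [], []

    def flush():
        # sorted() is stable, so rows with equal keys keep their input order
        out.extend(sorted(block, key=lambda row: float(row.split(sep)[field])))
        block.clear()

    for line in lines:
        if is_header(line):
            flush()           # emit the previous section, sorted
            out.append(line)  # then the header itself
        else:
            block.append(line)
    flush()                   # emit the last section
    return out


sample = """Sample1
T1,64,0.65 MEDIUM
T2,60,0.45 LOW
T3,301,0.68 MEDIUM
Sample2
T1,153,0.77 HIGH
T2,152,0.61 MEDIUM""".splitlines()

print("\n".join(sort_sections(sample)))
```

Because Python's sort is stable, duplicate values in the second field need no special handling here, unlike the keyed awk array above.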

Regarding "python - Sort lines in a text file between patterns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58854538/
