for-loop - How to calculate averages from two files using awk and grep

Reposted. Author: 行者123. Updated: 2023-12-02 22:12:53
I have the following 2 files:

points:

John,12
Joseph,14
Madison,15
Elijah,14
Theodore,15
Regina,18

teams:

Theodore,team1
Elijah,team2
Madison,team1
Joseph,team3
Regina,team2
John,team3

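I want to calculate the average score of each team. I came up with a solution that uses just 2 awk statements, but I would like to do it more efficiently (without the for loop and the if statement).

Here is what I did: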

#!/bin/bash

awk 'BEGIN { FS="," }
FNR==NR { a[FNR] = $1; b[FNR] = $2; next } { for(i = 0; i <= NR; ++i) { if(a[i] == $1) print b[i], $2 } }' teams points > output.txt

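In the first awk command, I separate the teams (team1, team2, team3) from the names and create a new file that contains only the team and the matching score for each member (hence the need for the for loop and the if statement).

Second: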

awk 'BEGIN { FS=" "; 
count_team1 = 0;
count_team2 = 0;
count_team3 = 0
average_team1 = 0;
average_team2 = 0;
average_team3 = 0 }

/team1/ { count_team1 = count_team1 + 1; average_team1 = average_team1 + $2 }
/team2/ { count_team2 = count_team2 + 1; average_team2 = average_team2 + $2 }
/team3/ { count_team3 = count_team3 + 1; average_team3 = average_team3 + $2 }


END { print "The average of team1 is: " average_team1 / count_team1;
print "The average of team2 is: " average_team2 / count_team2;
print "The average of team3 is: " average_team3 / count_team3 }' output.txt

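In the second awk command, I simply create variables that store the number of members on each team, plus variables that store each team's total score. This was easy because my new file output.txt contains only teams and scores.

This solution works, but as I said, I would like to do it without the for loop and the if statement. I thought about dropping FNR==NR and matching with grep -f instead, but I did not get any conclusive results.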

Best answer

Using awk only:

$ awk -F, '
NR==FNR {                  # process teams file
    a[$1]=$2               # hash to a: a[name]=team
    next
}
{                          # process points file
    b[a[$1]]+=$2           # add points to b, index on team: b[team]=pointsum
    c[a[$1]]++             # add count to c, index on team: c[team]=count
}
END {
    for(i in b)
        print i,b[i]/c[i]  # compute average
}' teams points
team1 15
team2 16
team3 13
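
For anyone who wants to try this without creating the files by hand, here is a minimal, self-contained sketch of the same two-array approach. The array names (team, sum, cnt), the scratch directory, and the result variable are illustrative choices, not part of the original answer. Two additions are assumptions on my part: the ($1 in team) guard skips any name that has no entry in the teams file, and the output is piped through sort because awk does not guarantee the iteration order of for (t in sum).

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/points" <<'EOF'
John,12
Joseph,14
Madison,15
Elijah,14
Theodore,15
Regina,18
EOF
cat > "$tmpdir/teams" <<'EOF'
Theodore,team1
Elijah,team2
Madison,team1
Joseph,team3
Regina,team2
John,team3
EOF

# First file (teams): remember team[name].
# Second file (points): accumulate the sum and count per team.
# The ($1 in team) guard is an added assumption: unknown names are skipped.
result=$(awk -F, '
NR==FNR      { team[$1] = $2; next }
($1 in team) { sum[team[$1]] += $2; cnt[team[$1]]++ }
END          { for (t in sum) print t, sum[t] / cnt[t] }
' "$tmpdir/teams" "$tmpdir/points" | sort)

echo "$result"
rm -rf "$tmpdir"
```

With the sample data this should print the same three lines as above, in team order thanks to the trailing sort.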

Edit: a solution without the for loop in END:

If the teams file is sorted by team, the for loop in END can be avoided. As a bonus, the teams are output in order:

$ awk -F, '
NR==FNR {                   # process the points file
    a[$1]=$2                # hash to a on name a[name]=points
    next
}
{                           # process the sorted teams file
    if($2!=p && FNR>1) {    # then the team changes
        print p,b/c         # its time to output team name and average
        b=c=0               # reset counters
    }
    c++                     # count
    b+=a[$1]                # sum of points for the team
    p=$2                    # p stores the team name for testing on the next round
}
END {                       # in the END
    print p,b/c             # print for the last team
}' points <(sort -t, -k2 teams)
team1 15
team2 16
team3 13
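
The question also mentions wanting to avoid FNR==NR. One possible way to do that, sketched here as a swapped-in technique that the answer itself does not use, is to let the standard sort and join utilities pair names with teams and leave only the averaging to awk. The file names with the .sorted suffix, the joined and averages variables, and the LC_ALL=C setting (so that sort and join agree on collation) are my own assumptions. The if (cnt) guard in END is also an addition, to avoid dividing by zero on empty input.

```shell
#!/bin/sh
# Keep sort and join on the same byte-wise collation.
LC_ALL=C
export LC_ALL

# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/points" <<'EOF'
John,12
Joseph,14
Madison,15
Elijah,14
Theodore,15
Regina,18
EOF
cat > "$tmpdir/teams" <<'EOF'
Theodore,team1
Elijah,team2
Madison,team1
Joseph,team3
Regina,team2
John,team3
EOF

# join needs both inputs sorted on the join field (the name).
sort -t, -k1,1 "$tmpdir/points" > "$tmpdir/points.sorted"
sort -t, -k1,1 "$tmpdir/teams"  > "$tmpdir/teams.sorted"

# Each joined line has the form name,points,team.
joined=$(join -t, "$tmpdir/points.sorted" "$tmpdir/teams.sorted")

# Group by team (field 3) and print an average whenever the team changes.
averages=$(printf '%s\n' "$joined" | sort -t, -k3,3 | awk -F, '
$3 != prev && NR > 1 { print prev, sum / cnt; sum = cnt = 0 }
                     { sum += $2; cnt++; prev = $3 }
END                  { if (cnt) print prev, sum / cnt }')

echo "$averages"
rm -rf "$tmpdir"
```

The awk part is the same streaming idea as the sorted solution above, only written as a pattern-action pair instead of an if inside the main block. Whether this is actually faster than the single awk call depends on the input size, since it adds three sorts.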

Regarding "for-loop - How to calculate averages from two files using awk and grep", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53495662/
