I have the following 2 files:
points:
John,12
Joseph,14
Madison,15
Elijah,14
Theodore,15
Regina,18
teams:
Theodore,team1
Elijah,team2
Madison,team1
Joseph,team3
Regina,team2
John,team3
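I want to compute the average score of each team.
I came up with a solution that uses only 2 awk statements, but I would like to do it more efficiently (without a for loop and an if statement).
Here is what I did: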
#!/bin/bash
awk 'BEGIN { FS="," }
FNR==NR { a[FNR] = $1; b[FNR] = $2; next } { for(i = 0; i <= NR; ++i) { if(a[i] == $1) print b[i], $2 } }' teams points > output.txt
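In the first awk command, I separate the teams (team1, team2, team3) from the names and create a new file that contains only the teams and the matching score for each player (hence the need for the for loop and the if statement).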
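The pairing step described above can also be written as a direct array lookup keyed by player name, which removes both the for loop and the if statement. The following is a minimal sketch, not the asker's code: the scratch directory, the array name team, and the shell variable pairs are assumptions made for illustration, and the sample data is copied from the two files shown earlier.

```shell
#!/bin/sh
# Hypothetical scratch directory so the sketch does not touch real files.
tmp=$(mktemp -d)
printf '%s\n' John,12 Joseph,14 Madison,15 Elijah,14 Theodore,15 Regina,18 > "$tmp/points"
printf '%s\n' Theodore,team1 Elijah,team2 Madison,team1 Joseph,team3 Regina,team2 John,team3 > "$tmp/teams"

# While reading the first file (teams), remember each team keyed by name.
# While reading the second file (points), look the team up directly
# instead of scanning every stored row.
pairs=$(awk -F, '
NR==FNR { team[$1] = $2; next }   # teams file: name -> team
{ print team[$1], $2 }            # points file: emit "team score"
' "$tmp/teams" "$tmp/points")

printf '%s\n' "$pairs"
rm -rf "$tmp"
```

Each points line costs one hash lookup here, whereas the loop version compares it against every stored team row.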
Second:
awk 'BEGIN { FS=" ";
count_team1 = 0;
count_team2 = 0;
count_team3 = 0
average_team1 = 0;
average_team2 = 0;
average_team3 = 0 }
/team1/ { count_team1 = count_team1 + 1; average_team1 = average_team1 + $2 }
/team2/ { count_team2 = count_team2 + 1; average_team2 = average_team2 + $2 }
/team3/ { count_team3 = count_team3 + 1; average_team3 = average_team3 + $2 }
END { print "The average of team1 is: " average_team1 / count_team1;
print "The average of team2 is: " average_team2 / count_team2;
print "The average of team3 is: " average_team3 / count_team3 }' output.txt
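In the second awk command, I simply create variables to store the number of members of each team, and other variables to store the total score of each team. This was easy to do because my new file output.txt contains only teams and scores.
This solution works, but as I said earlier, I would like to do this without a for loop and an if statement. I thought about dropping FNR==NR and using grep -f for the matching, but I did not get any conclusive results.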
Best answer
Using awk only:
$ awk -F, '
NR==FNR { # process teams file
a[$1]=$2 # hash to a: a[name]=team
next
}
{ # process points file
b[a[$1]]+=$2 # add points to b, index on team: b[team]=pointsum
c[a[$1]]++ # add count to c, index on team: c[team]=count
}
END {
for(i in b)
print i,b[i]/c[i] # compute average
}' teams points
team1 15
team2 16
team3 13
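One caveat about the output above: awk does not define the order in which for(i in b) visits array keys, so the teams may come out in a different order on another awk implementation. The sketch below is a self-contained variant of the same approach that pipes the result through sort to make the order stable. The scratch directory, the array names team, sum and cnt, and the shell variable averages are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory holding the sample data from the question.
tmp=$(mktemp -d)
printf '%s\n' John,12 Joseph,14 Madison,15 Elijah,14 Theodore,15 Regina,18 > "$tmp/points"
printf '%s\n' Theodore,team1 Elijah,team2 Madison,team1 Joseph,team3 Regina,team2 John,team3 > "$tmp/teams"

averages=$(awk -F, '
NR==FNR { team[$1] = $2; next }                   # teams file: name -> team
{ sum[team[$1]] += $2; cnt[team[$1]]++ }          # points file: accumulate per team
END { for (t in sum) print t, sum[t] / cnt[t] }   # iteration order is unspecified
' "$tmp/teams" "$tmp/points" | sort)              # sort makes the order deterministic

printf '%s\n' "$averages"
rm -rf "$tmp"
```

With the sample data, team1 sums to 30 over 2 players, team2 to 32 over 2, and team3 to 26 over 2, which matches the averages 15, 16 and 13 shown above.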
Edit: a solution without a for loop in END:
If the teams file is sorted by team, the for loop in END can be avoided. As a bonus, the teams are output in order:
$ awk -F, '
NR==FNR { # process the points file
a[$1]=$2 # hash to a on name a[name]=points
next
}
{ # process the sorted teams file
if($2!=p && FNR>1) { # then the team changes
print p,b/c # its time to output team name and average
b=c=0 # reset counters
}
c++ # count
b+=a[$1] # sum of points for the team
p=$2 # p stores the team name for testing on the next round
}
END { # in the END
print p,b/c # print for the last team
}' points <(sort -t, -k2 teams)
team1 15
team2 16
team3 13
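The <(...) process substitution used above is a bash feature (also available in ksh and zsh) and is not part of POSIX sh. As a hedged alternative, the sorted teams can be piped into awk and read through the - operand, which stands for standard input. This sketch also moves the team-change test into a pattern, so no if statement remains. The scratch directory and the names pts, prev, sum, n and averages are assumptions made for illustration, and the sketch assumes the teams file is not empty.

```shell
#!/bin/sh
# Hypothetical scratch directory holding the sample data from the question.
tmp=$(mktemp -d)
printf '%s\n' John,12 Joseph,14 Madison,15 Elijah,14 Theodore,15 Regina,18 > "$tmp/points"
printf '%s\n' Theodore,team1 Elijah,team2 Madison,team1 Joseph,team3 Regina,team2 John,team3 > "$tmp/teams"

# Sort teams by the second field, then let awk read points first and stdin second.
averages=$(sort -t, -k2,2 "$tmp/teams" | awk -F, '
NR==FNR { pts[$1] = $2; next }                              # points file: name -> points
$2 != prev && FNR > 1 { print prev, sum / n; sum = n = 0 }  # team changed: flush and reset
{ n++; sum += pts[$1]; prev = $2 }                          # accumulate the current team
END { print prev, sum / n }                                 # flush the last team
' "$tmp/points" -)

printf '%s\n' "$averages"
rm -rf "$tmp"
```

Because the input is grouped by team, only one running sum and one counter are held at a time instead of one array entry per team.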
A similar question about using awk and grep to compute the average from two files can be found on Stack Overflow: https://stackoverflow.com/questions/53495662/