for-loop - How to compute averages across two files using awk and grep

I have the following 2 files:

points:

John,12
Joseph,14
Madison,15
Elijah,14
Theodore,15
Regina,18

teams:

Theodore,team1
Elijah,team2
Madison,team1
Joseph,team3
Regina,team2
John,team3

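I want to compute the average score of each team. I came up with a solution that uses only 2 awk statements, but I would like to do it more efficiently (without the for loop and the if statement).

Here is what I did: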

#!/bin/bash

awk 'BEGIN { FS="," }
      FNR==NR { a[FNR] = $1; b[FNR] = $2; next } { for(i = 0; i <= NR; ++i) { if(a[i] == $1) print b[i], $2 } }' teams points > output.txt

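In the first awk command, I separate the teams (team1, team2, team3) from the names and create a new file that contains only each team paired with the matching score (hence the need for the for loop and the if statement).

Second: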

awk 'BEGIN { FS=" "; 
              count_team1 = 0; 
              count_team2 = 0; 
              count_team3 = 0;
              average_team1 = 0; 
              average_team2 = 0; 
              average_team3 = 0 } 

        /team1/  { count_team1 = count_team1 + 1; average_team1 = average_team1 + $2 }
        /team2/  { count_team2 = count_team2 + 1; average_team2 = average_team2 + $2 }
        /team3/  { count_team3 = count_team3 + 1; average_team3 = average_team3 + $2 }


      END { print "The average of team1 is: " average_team1 / count_team1;
            print "The average of team2 is: " average_team2 / count_team2; 
            print "The average of team3 is: " average_team3 / count_team3 }' output.txt

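In the second awk command, I simply create variables that store the number of members of each team, plus other variables that store each team's total points. That was easy because my new file output.txt contains only teams and scores.

This solution works, but as I said before, I would like to do it without the for loop and the if statement. I thought about dropping FNR==NR and matching with grep -f instead, but I did not get any conclusive results.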

Best answer

Using awk only:

$ awk -F, '
NR==FNR {                 # process teams file
    a[$1]=$2              # hash to a: a[name]=team
    next
}
{                         # process points file
    b[a[$1]]+=$2          # add points to b, index on team: b[team]=pointsum
    c[a[$1]]++            # add count to c, index on team: c[team]=count
}
END {
    for(i in b)           
        print i,b[i]/c[i] # compute average
}' teams points
team1 15
team2 16
team3 13
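
For anyone who wants to try this without creating the files by hand, here is a minimal self-contained sketch of the same idea. The scratch directory, the function name `team_averages`, and the array names `team`, `sum` and `cnt` are my own choices for illustration, not part of the original answer. Two small differences, both assumptions on my side: names in points that have no entry in teams are skipped via the `($1 in team)` pattern, and the result is piped through `sort`, because the iteration order of `for (t in sum)` is unspecified in awk.

```shell
#!/bin/bash
# Sketch only: rebuild the sample files in a scratch directory.
# Assumption: mktemp -d is available (it is on common Linux and macOS systems).
workdir=$(mktemp -d) || exit 1

printf '%s\n' John,12 Joseph,14 Madison,15 Elijah,14 Theodore,15 Regina,18 > "$workdir/points"
printf '%s\n' Theodore,team1 Elijah,team2 Madison,team1 Joseph,team3 Regina,team2 John,team3 > "$workdir/teams"

# team_averages is a hypothetical helper name used only in this sketch.
team_averages() {
    awk -F, '
    NR==FNR { team[$1] = $2; next }   # teams file: team[name] = team
    ($1 in team) {                    # points file: skip names that have no team
        sum[team[$1]] += $2           # sum[team] = total points
        cnt[team[$1]]++               # cnt[team] = number of members
    }
    END {
        for (t in sum)
            print t, sum[t] / cnt[t]  # average per team, in unspecified order
    }' "$workdir/teams" "$workdir/points" | sort
}

team_averages
```

With the sample data this should print the same three lines as the run above, in sorted order.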

Edit: a solution without the for loop in END:

If the teams file is sorted by team, the for loop in END can be avoided. As a bonus, the teams are output in order:

$ awk -F, '
NR==FNR {                # process the points file
    a[$1]=$2             # hash to a on name a[name]=points
    next
}
{                        # process the sorted teams file
    if($2!=p && FNR>1) { # then the team changes
        print p,b/c      # time to output the team name and average
        b=c=0            # reset counters
    }
    c++                  # count 
    b+=a[$1]             # sum of points for the team
    p=$2                 # p stores the team name for testing on the next round
}
END {                    # in the END
    print p,b/c          # print for the last team
}' points <(sort -t, -k2 teams)
team1 15
team2 16
team3 13
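
The `if` in the main block can also be expressed as an awk pattern, which gets closer to what the question asked for (no for loop, no if statement). The following is a sketch of that variation, not part of the original answer. The scratch directory, the function name `sorted_team_averages` and the array name `pts` are made up for illustration. It reads the sorted teams from standard input through the `-` operand instead of process substitution, and it assumes both files are non-empty and that every name in teams also appears in points.

```shell
#!/bin/bash
# Sketch only: rebuild the sample files in a scratch directory.
# Assumption: mktemp -d is available (it is on common Linux and macOS systems).
workdir=$(mktemp -d) || exit 1

printf '%s\n' John,12 Joseph,14 Madison,15 Elijah,14 Theodore,15 Regina,18 > "$workdir/points"
printf '%s\n' Theodore,team1 Elijah,team2 Madison,team1 Joseph,team3 Regina,team2 John,team3 > "$workdir/teams"

# sorted_team_averages is a hypothetical helper name used only in this sketch.
sorted_team_averages() {
    # Sort teams on the second comma-separated field, then stream them into awk.
    sort -t, -k2,2 "$workdir/teams" | awk -F, '
    NR==FNR        { pts[$1] = $2; next }        # points file: pts[name] = points
    FNR>1 && $2!=p { print p, b/c; b = c = 0 }   # team changed: print average, reset
                   { c++; b += pts[$1]; p = $2 } # accumulate for the current team
    END            { print p, b/c }              # flush the last team
    ' "$workdir/points" -
}

sorted_team_averages
```

The three rules run top to bottom for every line of the sorted teams input, so the "team changed" rule fires before the new line is added to the running sum, which mirrors the order of statements in the answer above.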

Regarding "for-loop - How to compute averages across two files using awk and grep", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53495662/
