bash - Transposing CSV data with awk (pivot transformation)

My CSV data looks like this:

Indicator;Country;Value
no_of_people;USA;500
no_of_people;Germany;300
no_of_people;France;200
area_in_km;USA;18
area_in_km;Germany;16
area_in_km;France;17
proportion_males;USA;5.3
proportion_males;Germany;7.9
proportion_males;France;2.4

I would like my data to look like this:

Country;no_of_people;area_in_km;proportion_males
USA;500;18;5.3
Germany;300;16;7.9
France;200;17;2.4

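There are more indicators and countries than listed here.

The file is fairly large (the line count has five digits). I looked around for transpose threads, but none of them matched my situation (and I am still new to awk, so I could not adapt the code I found to fit my data).

Thanks for your help. Regards, Ad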

Best answer

If the number of Ind fields is fixed (here they are assumed to be named Ind1, Ind2 and Ind3), you can do:

awk 'BEGIN{FS=OFS=";"}
     {a[$2,$1]=$3; count[$2]}
     END {for (i in count) print i, a[i,"Ind1"], a[i, "Ind2"], a[i, "Ind3"]}' file

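Explanation

BEGIN{FS=OFS=";"} sets the input and output field separators to a semicolon.
{a[$2,$1]=$3; count[$2]} collects the list of countries in the count[] array and stores the value of each Ind in the a["country","Ind"] array.
END {for (i in count) print i, a[i,"Ind1"], a[i, "Ind2"], a[i, "Ind3"]} prints a summary of the values, one line per country.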
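The two-index notation in the explanation can be illustrated with a small sketch. It assumes three made-up input lines in the Ind1/Ind2 naming used by this answer. It shows that merely referencing count[$2] creates the entry, and that a[i,j] is stored under the single key i SUBSEP j.

```shell
# Three made-up lines for illustration: indicator;country;value
out=$(printf '%s\n' 'Ind1;USA;500' 'Ind2;USA;18' 'Ind1;France;200' |
awk 'BEGIN { FS = ";" }
     { a[$2, $1] = $3; count[$2] }   # referencing count[$2] is enough to create the entry
     END {
         n = 0
         for (c in count) n++        # number of distinct countries
         print "countries: " n
         # a[i, j] is shorthand for a[i SUBSEP j]
         if (("USA" SUBSEP "Ind2") in a)
             print "USA/Ind2 = " a["USA", "Ind2"]
         if (("France", "Ind2") in a)
             print "France/Ind2 is present"
         else
             print "France/Ind2 is missing"
     }')
printf '%s\n' "$out"
```

A missing country/indicator pair simply yields an empty string when printed, so gaps in the data show up as empty fields in the pivoted result.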

Output

$ awk 'BEGIN{FS=OFS=";"} {a[$2,$1]=$3; count[$2]} END {for (i in count) print i, a[i,"Ind1"], a[i, "Ind2"], a[i, "Ind3"]}' file
France;200;17;2.4
Germany;300;16;7.9
USA;500;18;5.3
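For the sample data in the question, the same fixed-column approach might look like the sketch below. It makes a few assumptions that are not part of the answer: the three indicator names are hard-coded, the first input line is a header that should be skipped, a header line is printed, and pivot_fixed is only a hypothetical helper name.

```shell
# pivot_fixed is a hypothetical helper; it reads from the given files or from stdin.
pivot_fixed() {
    awk 'BEGIN { FS = OFS = ";" }
         NR == 1 { next }                  # assumption: the first line is a header
         { a[$2, $1] = $3; count[$2] }     # value per (country, indicator); remember each country
         END {
             print "Country", "no_of_people", "area_in_km", "proportion_males"
             for (i in count)
                 print i, a[i, "no_of_people"], a[i, "area_in_km"], a[i, "proportion_males"]
         }' "$@"
}

# Usage with the data from the question
pivot_fixed <<'EOF'
Indicator;Country;Value
no_of_people;USA;500
no_of_people;Germany;300
no_of_people;France;200
area_in_km;USA;18
area_in_km;Germany;16
area_in_km;France;17
proportion_males;USA;5.3
proportion_males;Germany;7.9
proportion_males;France;2.4
EOF
```

The order in which for (i in count) visits the countries is not specified by awk, so the data rows may come out in any order; pipe the result through sort if a stable order matters.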

Update

"Unfortunately, the number of indicators is not fixed. Also, they are not named like "Ind1", "Ind2" etc. but are just strings. I clarified my question."

$ awk -v FS=";" '{a[$2,$1]=$3; count[$2]; indic[$1]} END {for (j in indic) printf "%s ", j; printf "\n"; for (i in count) {printf "%s ", i; for (j in indic) printf "%s ", a[i,j]; printf "\n"}}' file
proportion_males no_of_people area_in_km 
France 2.4 200 17 
Germany 7.9 300 16 
USA 5.3 500 18

To get ; as the separator, replace each space with ;:

$ awk -v FS=";" '{a[$2,$1]=$3; count[$2]; indic[$1]} END {for (j in indic) printf "%s ", j; printf "\n"; for (i in count) {printf "%s ", i; for (j in indic) printf "%s ", a[i,j]; printf "\n"}}' file | tr ' ' ';'
proportion_males;no_of_people;area_in_km;
France;2.4;200;17;
Germany;7.9;300;16;
USA;5.3;500;18;
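The tr step leaves a trailing ; on every line and would also rewrite any space inside a value, and for (j in indic) visits the indicators in an unspecified order. A possible variant that avoids both is sketched below. It is not part of the answer: pivot_csv is a hypothetical helper name, the first input line is assumed to be a header, and indicators and countries are emitted in the order they first appear.

```shell
# pivot_csv is a hypothetical helper; it reads from the given files or from stdin.
pivot_csv() {
    awk 'BEGIN { FS = OFS = ";" }
         NR == 1 { next }                                        # assumption: skip the header line
         {
             if (!($1 in seen_ind)) { seen_ind[$1]; inds[++ni] = $1 }   # indicators, first-seen order
             if (!($2 in seen_cty)) { seen_cty[$2]; ctys[++nc] = $2 }   # countries, first-seen order
             val[$2, $1] = $3
         }
         END {
             line = "Country"
             for (j = 1; j <= ni; j++) line = line OFS inds[j]
             print line
             for (i = 1; i <= nc; i++) {
                 line = ctys[i]
                 for (j = 1; j <= ni; j++) line = line OFS val[ctys[i], inds[j]]
                 print line
             }
         }' "$@"
}

# Usage with the data from the question
pivot_csv <<'EOF'
Indicator;Country;Value
no_of_people;USA;500
no_of_people;Germany;300
no_of_people;France;200
area_in_km;USA;18
area_in_km;Germany;16
area_in_km;France;17
proportion_males;USA;5.3
proportion_males;Germany;7.9
proportion_males;France;2.4
EOF
```

Everything is held in memory until the END block, which should be fine for a file with a five-digit line count. A country that lacks some indicator gets an empty field in that column.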

Regarding bash - Transposing CSV data with awk (pivot transformation), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23764454/
