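I have the following dataset and need to transpose it. I am struggling with the script, and any help would be greatly appreciated. All columns and values are dynamic.
File format: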
ID FieldName FieldValue
1 Rooms Required? Yes
1 Country of Meeting US
2 Rooms Required?
2 Country of Meeting
3 Rooms Required? Yes
3 Country of Meeting US
4 Rooms Required? No
4 Country of Meeting BL
Desired output:
ID Rooms Required? Country of Meeting
1 Yes US
2
3 Yes US
4 No BL
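Please help.
Best answer
Here is an awk-only solution, assuming the fields are separated by tabs ('\t'). It needs GNU awk (gawk) 4.0 or later, because arrays of arrays and PROCINFO["sorted_in"] are gawk extensions: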
awk 'BEGIN { FS = "\t"; PROCINFO["sorted_in"] = "@ind_num_asc" } { if ( $1 !~ /^[0-9]+$/ ) next; A[$1][$2] = $3; H[$2] } END { printf "ID"; for (h in H) printf "\t%s", h; for (i in A) { printf "\n%s", i; for (j in A[i]) printf "\t%s", A[i][j] } print "" }' filename
Broken down:
awk 'BEGIN {
    FS = "\t"                                  # Use the tab as the field separator
    PROCINFO["sorted_in"] = "@ind_num_asc"     # Traverse arrays in ascending numeric index order (gawk only)
}
{
    if ( $1 !~ /^[0-9]+$/ )                    # Skip every row without a numeric ID (including the header)
        next
    A[$1][$2] = $3                             # Store the value in an array of arrays: A[ID][FieldName]
    H[$2]                                      # Record the field name as a column header
}
END {
    printf "ID"
    for (h in H)                               # Print every header that was found
        printf "\t%s", h
    for (i in A) {                             # Print each ID followed by the values stored for it
        printf "\n%s", i
        for (j in A[i])
            printf "\t%s", A[i][j]
    }
    print ""                                   # Terminate the last line
}' filename
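Let me know if you need any further explanation. This works for any number of IDs and fields, supplied in any order. If the records do not all have the same fields, the output may look ragged: each row prints only the fields it actually has, so values can end up under the wrong header.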
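The ragged output mentioned above could be avoided by printing one cell per known header for every ID, leaving the cell empty when a field is missing. The following is only a minimal sketch of that idea, not the answer's own method: it swaps gawk's arrays of arrays for POSIX awk's comma (SUBSEP) subscripts and keeps IDs and headers in first-seen order instead of sorted order. The names pivot, sample_rows, seen_id, seen_col, ids, cols and val are hypothetical, and the sample rows are made up so that ID 2 has no "Rooms Required?" entry.

```shell
#!/bin/sh
# Sketch only: pivot, sample_rows and all array names are hypothetical.

# Reads tab-separated "ID<TAB>FieldName<TAB>FieldValue" rows from the
# given files (or stdin) and prints one line per ID.
pivot() {
    awk '
    BEGIN { FS = "\t" }
    $1 !~ /^[0-9]+$/ { next }          # skip the header and rows without a numeric ID
    {
        # remember IDs and field names in the order they first appear
        if (!($1 in seen_id))  { seen_id[$1] = 1;  ids[++n_ids] = $1 }
        if (!($2 in seen_col)) { seen_col[$2] = 1; cols[++n_cols] = $2 }
        val[$1, $2] = $3               # comma subscript emulates a 2-D array
    }
    END {
        printf "ID"
        for (c = 1; c <= n_cols; c++) printf "\t%s", cols[c]
        printf "\n"
        for (i = 1; i <= n_ids; i++) {
            printf "%s", ids[i]
            # one cell per header; a missing field prints as an empty cell
            for (c = 1; c <= n_cols; c++) printf "\t%s", val[ids[i], cols[c]]
            printf "\n"
        }
    }' "$@"
}

# Made-up sample input: ID 2 has no "Rooms Required?" row.
sample_rows() {
    printf '%s\t%s\t%s\n' \
        'ID' 'FieldName'          'FieldValue' \
        '1'  'Rooms Required?'    'Yes' \
        '1'  'Country of Meeting' 'US' \
        '2'  'Country of Meeting' 'BL'
}

sample_rows | pivot
```

With this sample, the row for ID 2 keeps an empty cell under "Rooms Required?" so that BL stays under "Country of Meeting". On a real file the call would be pivot filename.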
This is based on a similar question on Stack Overflow about a Linux script for converting rows to columns: https://stackoverflow.com/questions/52598682/