GNU awk supports multidimensional arrays:
q[1][1] = "dog"
q[1][2] = 999
q[2][1] = "mouse"
q[2][2] = 777
q[3][1] = "bird"
q[3][2] = 888
I want to sort q by its "second column", so that I am left with:
q[1][1] = "mouse"
q[1][2] = 777
q[2][1] = "bird"
q[2][2] = 888
q[3][1] = "dog"
q[3][2] = 999
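As you can see, the "first column" values moved along with their second-column values. I see that GNU Awk offers an asort function, but it does not appear to support multidimensional arrays. If it helps, here is a working Ruby example: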
q = [["dog", 999], ["mouse", 777], ["bird", 888]]
q.sort_by{|z|z[1]}
=> [["mouse", 777], ["bird", 888], ["dog", 999]]
I ended up using a regular array indexed by the sort value, with duplicates separated by a newline:
q[777] = "mouse"
q[999] = "dog" RS "fish"
q[888] = "bird"
for (z in q) {
    print q[z]
}
FWIW, here is a workaround "sort_by()" function:
$ cat tst.awk
BEGIN {
    a[1][1] = "dog"
    a[1][2] = 999
    a[2][1] = "mouse"
    a[2][2] = 777
    a[3][1] = "bird"
    a[3][2] = 888

    print "\n############################\nBefore:"
    for (i=1; i in a; i++)
        for (j=1; j in a[i]; j++)
            printf "a[%d][%d] = %s\n",i,j,a[i][j]
    print "############################"

    sort_by(a,2)

    print "\n############################\nAfter:"
    for (i=1; i in a; i++)
        for (j=1; j in a[i]; j++)
            printf "a[%d][%d] = %s\n",i,j,a[i][j]
    print "############################"
}

# Sort the rows of arr[] by the value in column "key".
# keys, vals, i and j are local variables.
function sort_by(arr,key,    keys,vals,i,j)
{
    for (i=1; i in arr; i++) {
        keys[i] = arr[i][key]
        # Flatten row i into one SUBSEP-separated string, indexed by its sort key.
        for (j=1; j in arr[i]; j++)
            vals[keys[i]] = vals[keys[i]] (j==1?"":SUBSEP) arr[i][j]
    }

    asort(keys)

    # Rebuild arr[] row by row, in sorted key order.
    for (i=1; i in keys; i++)
        split(vals[keys[i]],arr[i],SUBSEP)

    return (i - 1)
}
$ gawk -f tst.awk
############################
Before:
a[1][1] = dog
a[1][2] = 999
a[2][1] = mouse
a[2][2] = 777
a[3][1] = bird
a[3][2] = 888
############################
############################
After:
a[1][1] = mouse
a[1][2] = 777
a[2][1] = bird
a[2][2] = 888
a[3][1] = dog
a[3][2] = 999
############################
It works by first converting this:
a[1][1] = "dog"
a[1][2] = 999
a[2][1] = "mouse"
a[2][2] = 777
a[3][1] = "bird"
a[3][2] = 888
to this:
keys[1] = 999
vals[999] = dog SUBSEP 999
keys[2] = 777
vals[777] = mouse SUBSEP 777
keys[3] = 888
vals[888] = bird SUBSEP 888
then asort()ing keys[] to get:
keys[1] = 777
keys[2] = 888
keys[3] = 999
and then looping over the keys array, using its elements as indices into the vals array, to repopulate the original array.
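In case anyone is wondering why I didn't just use the values we want to sort on as indices and then call asorti(), since that would make the code slightly shorter, here is why: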
$ cat tst.awk
BEGIN {
    a[1] = 888
    a[2] = 9
    a[3] = 777

    b[888]
    b[9]
    b[777]

    print "\n\"a[]\" sorted by content:"
    asort(a,A)
    for (i=1; i in A; i++)
        print "\t" A[i]

    print "\n\"b[]\" sorted by index:"
    asorti(b,B)
    for (i=1; i in B; i++)
        print "\t" B[i]
}
$ awk -f tst.awk
"a[]" sorted by content:
9
777
888
"b[]" sorted by index:
777
888
9
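Note that asorti(), by default, treats "9" as a higher value than "888". That is because asorti() sorts array indices, and all array indices are strings (even if they look like numbers), and alphabetically the first character of the string "9" is higher than the first character of the string "888". asort(), on the other hand, sorts the contents of the array, and array contents can be strings or numbers, so the normal awk comparison rules apply: anything that looks like a number is treated as a number, and the number 9 is less than the number 888, which in this case is IMHO the desired result.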