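bash - Formatting data into a table

Tags: bash

I'm new to coding, and I'd like to know the simplest way to build a table from grep count data.

My grep count output file looks like this: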

AAR34355.1
./006D_id70.m8:0
./20D_id70.m8:0
./28D_id70.m8:0
AAR38850.1
./006D_id70.m8:0
./20D_id70.m8:2
./28D_id70.m8:4
A13520.1
./006D_id70.m8:0
./20D_id70.m8:0
./28D_id70.m8:0

I need output that looks more like this:

            ./006D_id70.m8    ./20D_id70.m8    ./28D_id70.m8
AAR34355.1         0                0                 0
AAR38850.1         0                2                 4
A13520.1           0                0                 0

Or at least a delimited equivalent.

Please excuse my description, as I'm new to this.

Is there a relatively simple way to format the data like this?

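Best answer

You can do all of this in awk, with no need to reshape grep's output. Assume the patterns to search for are listed one per line in a file named patterns, and the files to search are file1, file2, and file3. Note that each pattern is matched as a regular expression, so the . in an ID such as AAR34355.1 matches any character. Copy the following code and save it to a file named tst.awk,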

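# First file (patterns): store each pattern in input order.
NR == FNR {
  pat[++np] = $0
  next
}

# First line of each data file: remember the file name in argument order.
FNR == 1 {
  fil[++nf] = FILENAME
}

# Count the lines of the current file that match each pattern.
{
  for (i = 1; i <= np; i++)
    if ($0 ~ pat[i])
      mat[FILENAME, pat[i]]++
}

END {
  # Header row: one tab-separated column per data file.
  # Indexed loops are used because "for (i in array)" has no guaranteed order.
  for (j = 1; j <= nf; j++)
    printf "\t%s", fil[j]

  print ""

  # One row per pattern; %d prints 0 for combinations that never matched.
  for (i = 1; i <= np; i++) {
    printf "%s", pat[i]

    for (j = 1; j <= nf; j++)
      printf "\t%d", mat[fil[j], pat[i]]

    print ""
  }
}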

and run

awk -f tst.awk patterns file1 file2 file3

Demo:

$ seq 5 > file1
$ seq 3 7 > file2
$ seq 5 9 > file3
$ seq 3 2 7 > patterns
$ awk -f tst.awk patterns file1 file2 file3
        file1   file2   file3
3       1       1       0
5       1       1       1
7       0       1       1
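
If the grep count file from the question already exists and rerunning the search is not convenient, it can probably also be reshaped directly. The following is only a sketch under some assumptions: the file name counts.txt and the output name table.tsv are made up for illustration, every count line is assumed to end in ":<digits>", and every other non-empty line is assumed to be an ID that starts a new row.

```shell
#!/bin/sh
# Recreate the sample from the question (counts.txt is a hypothetical name).
cat > counts.txt <<'EOF'
AAR34355.1
./006D_id70.m8:0
./20D_id70.m8:0
./28D_id70.m8:0
AAR38850.1
./006D_id70.m8:0
./20D_id70.m8:2
./28D_id70.m8:4
A13520.1
./006D_id70.m8:0
./20D_id70.m8:0
./28D_id70.m8:0
EOF

awk '
# Assumption: a line ending in ":<digits>" is a "file:count" line from grep -c.
/:[0-9]+$/ {
  n = split($0, parts, ":")
  count = parts[n]                                   # text after the last colon
  file = substr($0, 1, length($0) - length(count) - 1)
  if (!(file in seen)) { seen[file] = 1; files[++nf] = file }
  cnt[id, file] = count
  next
}
# Assumption: any other non-empty line is an ID that starts a new row.
NF { id = $0; ids[++ni] = id }
END {
  # Header row, then one row per ID, in the order first seen in the input.
  for (j = 1; j <= nf; j++) printf "\t%s", files[j]
  print ""
  for (i = 1; i <= ni; i++) {
    printf "%s", ids[i]
    for (j = 1; j <= nf; j++) printf "\t%d", cnt[ids[i], files[j]]
    print ""
  }
}' counts.txt > table.tsv

cat table.tsv
```

The result is tab-separated, with an empty first header cell, so it should open cleanly in a spreadsheet. An ID that itself ends in a colon followed by digits would break the first assumption and be mistaken for a count line.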

This post on "bash - Formatting data into a table" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/60736570/
