bash - Reformat a text file with awk and cut in a one-liner

Data:

CHR SNP BP A1 TEST NMISS BETA SE L95 U95 STAT P 
1   chr1:1243:A:T 1243 T ADD 16283 -6.124 0.543 -1.431 0.3534 -1.123 0.14

Desired output:

MarkerName P-Value 
  chr1:1243  0.14

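The actual file is 1.2G of lines like the one above.
I need to strip the second column from its second colon onward, then print it next to column 12 (the last column) and give the result a new header.
I have tried: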

awk '{print $2, $12}' | cut -d: -f1-2

But this removes everything on the line after the second colon, and I want to keep the "P" column.
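A small illustration of why the P column disappears, using one line as the `awk '{print $2, $12}'` step would emit it. This is only a sketch: the variable name `line` is made up for the demonstration. `cut` treats the whole line as colon-delimited text, so the P value ends up inside the fourth field and is dropped by `-f1-2`.

```shell
# One sample line as produced by: awk '{print $2, $12}'
line='chr1:1243:A:T 0.14'

# cut splits the entire line on ":", giving the fields
#   "chr1", "1243", "A", "T 0.14"
# so keeping fields 1-2 also throws away the P value.
printf '%s\n' "$line" | cut -d: -f1-2
```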
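I wrote that output to a new file and then used awk to paste it next to the P-value column, but I was wondering whether there is a one-liner way to do this?
Many thanks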

Best answer

My comment in a more readable form:

$ awk '
BEGIN {
    print "MarkerName P-Value"          # output header
}
NR>1 {                                  # skip the header line (first record)
    split($2,a,/:/)                     # split the SNP column on ":"
    printf "%s:%s %s\n",a[1],a[2],$12   # printf allows easier output formatting
}' file

Output:

MarkerName P-Value
chr1:1243 0.14
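As a possible variation on the same idea, the marker can also be cut with `match()` and `substr()` instead of `split()`, which keeps everything before the second colon no matter how many colon-separated parts follow. This is a sketch rather than part of the original answer: the function name `marker_pvalue` is hypothetical, and it assumes whitespace-separated input with the P value in column 12, as in the sample data.

```shell
# Hypothetical helper: reads the association table on stdin and
# prints "MarkerName P-Value" followed by one line per record.
marker_pvalue() {
    awk '
    BEGIN { print "MarkerName P-Value" }      # output header
    NR > 1 {                                  # skip the input header line
        # keep only the text before the second colon, e.g. chr1:1243;
        # fall back to the whole field if it contains no colon
        marker = match($2, /^[^:]*:[^:]*/) ? substr($2, RSTART, RLENGTH) : $2
        print marker, $12
    }'
}

# Run it on the two sample lines from the question
marker_pvalue <<'EOF'
CHR SNP BP A1 TEST NMISS BETA SE L95 U95 STAT P
1   chr1:1243:A:T 1243 T ADD 16283 -6.124 0.543 -1.431 0.3534 -1.123 0.14
EOF
```

On a real file the same function would be called as `marker_pvalue < file > newfile`, which processes the input in a single pass without an intermediate file.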

Source: a similar question on Stack Overflow, https://stackoverflow.com/questions/64821030/
