I want to remove duplicate entries in column 1, keeping only the first occurrence, while leaving the remaining columns unchanged.
Input
444444 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
444444 116,118,124-125,120,122-123,126,132.
444444 25-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
444444 110,118,124-125,120,122-123,126,132.
111111 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
111111 116,118,124-125,120,122.
111111 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
232323 20-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
232323 116,118,124-125,120,122-123,126,132.
Output
444444 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
116,118,124-125,120,122-123,126,132.
25-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
110,118,124-125,120,122-123,126,132.
111111 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
116,118,124-125,120,122.
21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
232323 20-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
116,118,124-125,120,122-123,126,132.
I tried:
awk '!NF {print;next}; !($1 in a) {a[$1];print}' file
I also tried splitting the file into two parts:
file 1: the first column with duplicates removed, keeping the first occurrence > output1
file 2: the second column > file2
paste output1 file2 > file-output
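The two-step idea above can be sketched roughly as follows. This is only a minimal sketch, assuming the columns are separated by a single space. The function name `split_and_paste` and the temporary files are illustrative and not part of the original attempt. Note that `paste` joins the two parts with a tab by default.

```shell
# split_and_paste: hypothetical helper illustrating the split-then-paste idea.
# Assumes column 1 and the rest of the line are separated by one space.
split_and_paste() (
  in=$1
  col1=$(mktemp)
  rest=$(mktemp)
  # Column 1, printed only the first time each value appears; empty otherwise.
  awk '{ print (seen[$1]++ ? "" : $1) }' "$in" > "$col1"
  # Everything after the first space-separated column.
  cut -d ' ' -f 2- "$in" > "$rest"
  # Join the two parts line by line (tab-separated by default).
  paste "$col1" "$rest"
  rm -f "$col1" "$rest"
)
```

It would be called as `split_and_paste file > file-output`.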
Is there a way to do this in a simple awk one-liner?
Best answer
This awk command may work for you:
awk 'seen[$1]++{$1="\t\t"} 1' file
444444 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
116,118,124-125,120,122-123,126,132.
111111 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
116,118,124-125,120,122.
232323 21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117,
116,118,124-125,120,122-123,126,132.
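To spell out how this works: `seen[$1]++` is false the first time a key appears and true afterwards, so only repeated keys get their first field overwritten, and the trailing `1` prints every line. The attempt in the question, `!($1 in a) {a[$1];print}`, prints a line only when its key is new, so later lines with the same key are dropped entirely. Assigning to `$1` makes awk rebuild the record with the output field separator, so continuation lines start with two tabs followed by a space.

If column alignment matters, a variant that pads with spaces equal to the key's width may be preferable. The following is a minimal sketch, not part of the original answer; the function name `blank_dup_keys` is hypothetical.

```shell
# blank_dup_keys: hypothetical helper, a variant of the one-liner above.
# Prints every line of the given file; when column 1 has already been seen,
# it is replaced by the same number of spaces so the rest stays aligned.
blank_dup_keys() {
  awk '
    {
      key = $1
      if (seen[key]++) {
        # Build a run of spaces as wide as the key, e.g. "%6s" for "444444".
        pad = sprintf("%" length(key) "s", "")
        # Replace the first run of non-blank characters (column 1) in place,
        # leaving the original spacing of the rest of the line untouched.
        sub(/[^ \t]+/, pad)
      }
      print
    }
  ' "$1"
}
```

It would be called as `blank_dup_keys file`. Because it edits the line with `sub` instead of assigning to `$1`, the remaining columns are not re-joined with the output field separator.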
Regarding "bash - Remove duplicate records in the first column without modifying the remaining columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50427878/