I have a source file containing more than 2 million lines of text, like this:
388708091|347|||||0010.60|N01/2012|
388708101|348|||||0011.60|N01/2012|
388708101|349|||||0012.60|N01/2012|
388719001|348|||||0010.38|M05/2013|
388719001|349|||||0011.38|M05/2013|
I want to look up and replace the second column (values such as 347, 348, 349, and so on) using a map like the following:
346 309
347 311
348 312
349 313
350 314
351 315
352 316
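Note that although the map has just two columns, it has more than 100 rows.
What is the most efficient command-line way to replace the data in the second column of the source file with the mapped target values?
Best answer
awk seems to be the tool for this job: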
awk 'NR == FNR { a[$1] = $2; next } FNR == 1 { FS = "|"; OFS = FS; $0 = $0 } { $2 = a[$2] } 1' mapfile datafile
The code works as follows:
NR == FNR { # while processing the first file (mapfile)
a[$1] = $2 # remember the second field by the first
next # do nothing else
}
FNR == 1 { # at the first line of the second file (datafile):
FS = "|" # start splitting by | instead of whitespace
OFS = FS # delimit output the same way as the input
$0 = $0 # force resplitting of this first line
}
{ # for all lines in the second file:
$2 = a[$2] # replace the 2nd field with the remembered value for that key
}
1 # print the line
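To see the one-liner in action, here is a minimal sketch that runs it on a few rows taken from the question. The temporary directory and the file names `mapfile` and `datafile` are only illustrative, and the sample is trimmed to three rows.

```shell
# Create small sample inputs in a throwaway directory (names are illustrative).
tmpdir=$(mktemp -d)
cat > "$tmpdir/mapfile" <<'EOF'
347 311
348 312
349 313
EOF
cat > "$tmpdir/datafile" <<'EOF'
388708091|347|||||0010.60|N01/2012|
388708101|348|||||0011.60|N01/2012|
388719001|349|||||0011.38|M05/2013|
EOF

# Same awk program as in the answer: load the map first, then rewrite field 2.
result=$(awk 'NR == FNR { a[$1] = $2; next } FNR == 1 { FS = "|"; OFS = FS; $0 = $0 } { $2 = a[$2] } 1' "$tmpdir/mapfile" "$tmpdir/datafile")
printf '%s\n' "$result"
# prints:
# 388708091|311|||||0010.60|N01/2012|
# 388708101|312|||||0011.60|N01/2012|
# 388719001|313|||||0011.38|M05/2013|

rm -rf "$tmpdir"
```

The map is held entirely in memory, which is cheap for a few hundred entries, while the large data file is streamed through once.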
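Caveat: this assumes that every value in the second column of the data file has a corresponding entry in the map file; any value without one is replaced with an empty string. If that behavior is undesirable, replace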
{ $2 = a[$2] }
with
{ if($2 in a) { $2 = a[$2] } else { $2 = "something else" } }
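As one possible way to fill in the else branch, the sketch below leaves unmapped values untouched and reports them on standard error. This policy is an assumption, since the question does not say what should happen; the file names, the sample rows, and the `unmapped.log` file are likewise made up for illustration.

```shell
# Sample inputs: key 999 is deliberately missing from the map.
tmpdir=$(mktemp -d)
printf '347 311\n348 312\n' > "$tmpdir/mapfile"
printf '1|347|x|\n2|999|y|\n' > "$tmpdir/datafile"

# Replace field 2 only when the key exists; otherwise keep it and log a note.
result=$(awk '
NR == FNR { a[$1] = $2; next }              # first file: load the map
FNR == 1  { FS = "|"; OFS = FS; $0 = $0 }   # second file: switch to | fields
{
    if ($2 in a) {
        $2 = a[$2]                          # known key: substitute
    } else {
        # unknown key: leave the line as is and report it on stderr
        print "no mapping for " $2 " at line " FNR > "/dev/stderr"
    }
}
1' "$tmpdir/mapfile" "$tmpdir/datafile" 2> "$tmpdir/unmapped.log")

unmapped=$(cat "$tmpdir/unmapped.log")
printf '%s\n' "$result"      # prints 1|311|x| then 2|999|y|
printf '%s\n' "$unmapped"    # prints: no mapping for 999 at line 2
rm -rf "$tmpdir"
```

The `$2 in a` test checks for the key without creating an empty entry in the array, which a plain `a[$2]` lookup would do.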
It is not clear to me what should happen in that case.
Regarding "macos - Command-line function or tool like Excel's vlookup?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30270440/