file - Compare two files and output the differences between them (including line numbers and content)

Tags: file awk compare

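I am trying to get the differences between two files, with line numbers and content, written to another file or to standard output. I tried the following but could not get the exact output I need. Please see below.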

File contents:

File 1:

Col1,Col2,Col3
Text1,text1,text1
Text2,text2,Rubbish

File 2:

Col1,Col2,Col3
Text1,text1,text1
Text2,text2,text2
Text3,text3,text3

I have tried the code below, but it does not give the exact output I need: it only shows the difference from the first file and not the extra line in file2.

sort file1 file2 | uniq | awk 'FNR==NR{ a[$1]; next } !($1 in a) {print FNR": "$0}' file2 file1

Output

3: Text2,text2,Rubbish

Desired output

3: Text2,text2,Rubbish (File1)
3: Text2,text2,text2 (File2)
4: Text3,text3,text3 (File2)

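I do not want to use diff/sdiff/comm because of their output format: I cannot add line numbers and arrange the data side by side so that it is easy to read. The real files will have more than 1000 lines, which makes diff/sdiff output even harder to read.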

Best answer

With your shown samples, please try the following awk code. Written and tested in GNU awk.

awk '
BEGIN { OFS=": " }
FNR==1{ next     }
FNR==NR{
  arr[$0]=FNR
  next
}
!($0 in arr){
  print FNR,$0" ("FILENAME")"
  next
}
{
  arr1[$0]
}
END{
  for(key in arr){
    if(!(key in arr1)){
      print arr[key],key" ("ARGV[1]")"
    }
  }
}
' file1 file2
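
To see what the script prints for the sample data, here is a small self-contained run. It is only a sketch: the scratch directory and the variable names `tmp` and `out` are made up for illustration, and the awk program is the one above, condensed. It recreates the two files from the question and captures the output.

```shell
#!/bin/sh
# Recreate the two sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Col1,Col2,Col3' 'Text1,text1,text1' 'Text2,text2,Rubbish' > "$tmp/file1"
printf '%s\n' 'Col1,Col2,Col3' 'Text1,text1,text1' 'Text2,text2,text2' 'Text3,text3,text3' > "$tmp/file2"

# Run inside that directory so FILENAME is just "file1" or "file2".
out=$(cd "$tmp" && awk '
BEGIN { OFS=": " }
FNR==1 { next }
FNR==NR { arr[$0]=FNR; next }
!($0 in arr) { print FNR, $0 " (" FILENAME ")"; next }
{ arr1[$0] }
END {
  for (key in arr)
    if (!(key in arr1))
      print arr[key], key " (" ARGV[1] ")"
}
' file1 file2)

printf '%s\n' "$out"
rm -rf "$tmp"
# Prints:
# 3: Text2,text2,text2 (file2)
# 4: Text3,text3,text3 (file2)
# 3: Text2,text2,Rubbish (file1)
```

The lines unique to file2 come out first, as they are read, and the lines unique to file1 follow from the END block. The file names appear exactly as given on the command line, and when several file1 lines are reported their order depends on how awk traverses the array, so the output may need sorting if order matters.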

Explanation: Adding a detailed explanation of the above.

awk '                                   ##Start of the awk program.
BEGIN { OFS=": " }                      ##Set the output field separator to colon plus space.
FNR==1{ next     }                      ##Skip the first (header) line of each file.
FNR==NR{                                ##True only while the first file (file1) is being read.
  arr[$0]=FNR                           ##Store the whole line as an index of arr, with its line number as value.
  next                                  ##Skip the remaining rules for this line.
}
!($0 in arr){                           ##Line of file2 that does NOT occur anywhere in file1.
  print FNR,$0" ("FILENAME")"           ##Print line number, line and file name, as the OP requested.
  next                                  ##Skip the remaining rules for this line.
}
{
  arr1[$0]                              ##Record file2 lines that also occur in file1 as indexes of arr1.
}
END{                                    ##END block, runs after both files have been read.
  for(key in arr){                      ##Loop over all lines stored from file1.
    if(!(key in arr1)){                 ##If this file1 line was never seen in file2.
      print arr[key],key" ("ARGV[1]")"  ##Print its line number, the line and the first file name.
    }
  }
}
' file1 file2                           ##Input file names.
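
The script above matches lines by content, so a line that merely moved to a different position is not reported. If the goal is to compare the two files position by position, which is what the desired output in the question looks like, a minimal sketch could be the following. This is a different technique from the answer, not a fix of it. It assumes file1 is not empty (otherwise `FNR==NR` would also be true while reading file2), and the names `a`, `n`, `m`, `tmp` and `out` are made up for illustration.

```shell
#!/bin/sh
# Recreate the two sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'Col1,Col2,Col3' 'Text1,text1,text1' 'Text2,text2,Rubbish' > "$tmp/file1"
printf '%s\n' 'Col1,Col2,Col3' 'Text1,text1,text1' 'Text2,text2,text2' 'Text3,text3,text3' > "$tmp/file2"

out=$(cd "$tmp" && awk '
FNR==NR { a[FNR]=$0; n=FNR; next }     # first file: remember every line by its number
{
  m = FNR                               # lines read so far from the second file
  if (FNR > n)                          # extra line, present only in the second file
    print FNR ": " $0 " (" FILENAME ")"
  else if ($0 != a[FNR]) {              # same position, different content
    print FNR ": " a[FNR] " (" ARGV[1] ")"
    print FNR ": " $0 " (" FILENAME ")"
  }
}
END {
  for (i = m + 1; i <= n; i++)          # lines present only in the first file
    print i ": " a[i] " (" ARGV[1] ")"
}
' file1 file2)

printf '%s\n' "$out"
rm -rf "$tmp"
# Prints:
# 3: Text2,text2,Rubbish (file1)
# 3: Text2,text2,text2 (file2)
# 4: Text3,text3,text3 (file2)
```

Because lines are paired by number, a single inserted or deleted line makes every following line show up as different. For files with such shifts, the content-based matching of the answer is likely the better fit.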

Regarding "file - Compare two files and output the differences between them (including line numbers and content)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68467871/
