linux - Counting duplicates across multiple files

I have five files that contain some duplicated strings.

File 1:

File 2:

File 3:

a
b

File 4:

File 5:

So I used awk 'NR==FNR{A[$0];next}$0 in A' file1 file2 file3 file4 file5

It prints a, but as you can see, the string b is repeated 3 times in the other files, yet only a is printed.
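
A note on why this happens: NR==FNR is true only while awk reads the first file, so the array A holds the lines of file1 alone, and every later file is compared against file1 only. The sketch below reproduces that behavior. The file contents are hypothetical sample data (the originals are not shown in the question), and first_file_matches is a helper name introduced here for illustration.

```shell
# Hypothetical sample data; the question's actual file contents are not shown.
dir=$(mktemp -d)
printf 'a\n'    > "$dir/file1"
printf 'b\n'    > "$dir/file2"
printf 'a\nb\n' > "$dir/file3"
printf 'b\n'    > "$dir/file4"
printf 'c\n'    > "$dir/file5"

# Illustrative helper wrapping the command from the question.
first_file_matches() {
  # NR==FNR holds only for the first file: its lines become keys of A.
  # Lines from the remaining files are printed only if they are keys of A.
  awk 'NR==FNR { A[$0]; next } $0 in A' "$@"
}

first_file_matches "$dir"/file[1-5]
```

With this sample data, b never appears in file1, so it is never stored in A and never printed, no matter how often it occurs in the other files.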

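So how can I use a one-line command that analyzes/compares every file and returns all the duplicated strings (a b)? Also, how can I get the number of times each element is repeated?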

Best answer

I suggest using GNU sort and uniq:

sort file[1-5] | uniq -dc

Output:

2 a
3 b

From man uniq:

-d: only print duplicate lines

-c: prefix lines by the number of occurrences
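
If an awk-only approach is preferred, the same counts can probably be obtained by tallying every line across all files and printing those seen more than once. This is a minimal sketch rather than part of the original answer. The sample files are hypothetical, count_dups is an illustrative helper name, and the trailing sort is only there because awk's for-in loop has no guaranteed order.

```shell
# Hypothetical sample data; the question's actual file contents are not shown.
dir=$(mktemp -d)
printf 'a\n'    > "$dir/file1"
printf 'b\n'    > "$dir/file2"
printf 'a\nb\n' > "$dir/file3"
printf 'b\n'    > "$dir/file4"
printf 'c\n'    > "$dir/file5"

# Illustrative helper: count each distinct line over all given files,
# then print "count line" for lines that occur more than once.
count_dups() {
  awk '{ count[$0]++ }
       END { for (s in count) if (count[s] > 1) print count[s], s }' "$@" |
    sort -k2   # order by the string, since awk iteration order is unspecified
}

count_dups "$dir"/file[1-5]
```

Like sort | uniq -dc, this counts total occurrences, so a string that appears twice within a single file is also reported as a duplicate.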

Regarding linux - Counting duplicates across multiple files, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/41278672/