cat file1.txt
abc bcd abc ...
abcd bcde cdef ...
abcd bcde cdef ...
abcd bcde cdef ...
efg fgh ...
efg fgh ...
hig ...
My expected result is as follows:
abc bcd abc ...
abcd bcde cdef ...
<!!! pay attention, above sentence has repeated 3 times !!!>
efg fgh ...
<!!! pay attention, above sentence has repeated 2 times !!!>
hig ...
I found a way to do this, but my code is rather noisy:
cat file1.txt | uniq -c | sed -e 's/ \+/ /g' -e 's/^.//g' | awk '{print $0," ",$1}'| sed -e 's/^[2-9] /\n/g' -e 's/^[1] //g' |sed -e 's/[^1]$/\n<!!! pay attention, above sentence has repeated & times !!!> \n/g' -e 's/[1]$//g'
abc bcd abc ...
abcd bcde cdef ...
<!!! pay attention, above sentence has repeated 3 times !!!>
efg fgh ...
<!!! pay attention, above sentence has repeated 2 times !!!>
hig ...
Is there a more efficient way to achieve this? Thank you very much.
Best answer
sort + uniq + sed solution:
sort file1.txt | uniq -c | sed -E 's/^ +1 (.+)/\1\n/;
s/^ +([2-9]|[0-9]{2,}) (.+)/\2\n<!!! pay attention, the above sentence has repeated \1 times !!!>\n/'
Output:
abc bcd abc ...
abcd bcde cdef ...
<!!! pay attention, the above sentence has repeated 3 times !!!>
efg fgh ...
<!!! pay attention, the above sentence has repeated 2 times !!!>
hig ...
Alternatively, using awk:
sort file1.txt | uniq -c | awk '{ n=$1; sub(/^ +[0-9]+ +/,"");
printf "%s\n%s",$0,(n==1? ORS:"<!!! pay attention, the above sentence has repeated "n" times !!!>\n\n") }'
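
Both pipelines above rely on sort, which reorders the file and also merges duplicates that are not adjacent. As a rough sketch of a single-pass alternative, assuming only consecutive repeats should be counted (as in the question's own uniq -c pipeline), the whole job could be done in one awk program. The shell function name annotate_repeats and the awk helper emit are hypothetical names chosen for illustration, not part of the original answer, and this version prints no extra blank lines.

```shell
#!/bin/sh
# Hypothetical helper (not from the original thread): collapse runs of
# identical adjacent lines and report the run length when it exceeds one.
annotate_repeats() {
    awk '
        # Print the finished run: the line itself, then a notice if it repeated.
        function emit() {
            print prev
            if (count > 1)
                printf "<!!! pay attention, the above sentence has repeated %d times !!!>\n", count
        }
        # Same as the previous line: extend the current run.
        # Appending "" forces a string comparison even for numeric-looking lines.
        NR > 1 && ($0 "") == prev { count++; next }
        # A different line: output the previous run first.
        NR > 1 { emit() }
        # Start a new run with the current line.
        { prev = $0 ""; count = 1 }
        # Output the final run, unless the input was empty.
        END { if (NR > 0) emit() }
    ' "$@"
}
```

It can be called as annotate_repeats file1.txt, or used at the end of a pipe. Because nothing is sorted, the original line order is preserved, and only the current line and its counter are held in memory.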
Regarding "linux - How to count repeated sentences in Shell", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47884428/