Bash: counting occurrences of multiple strings in a large file

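I am trying to use a bash command to count how many times each of several strings occurs in a large txt file.

For example, I want to count the strings "pig", "horse" and "cat" and get the output "pig: 7, horse: 3, cat: 5". Because the file is very large, I want a method that reads it only once (I don't want to search the whole file for "pig", then go back and search it again for "horse", and so on).

Any help with a command would be appreciated. Thanks!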

Best answer

grep -Eo 'pig|horse|cat' txt.file | sort | uniq -c | awk '{print $2": "$1}'

Breaking it down:

grep -Eo 'pig|horse|cat'  Print only the matched text (-o), one match per
                          line, using an extended (-E) regex
sort                      Sort the resulting words so that identical
                          words are adjacent
uniq -c                   Output unique values (of sorted input)
                          with the count (-c) of each value
awk '{print $2": "$1}'    For each line, print the second field (the word),
                          then a colon and a space, and then the first
                          field (the count).
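
As a quick check of the pipeline above, here is a minimal sketch that runs it on a small throwaway file. The sample data, the temporary file and the variable names (sample, counts, joined) are assumptions made for this demo, not part of the original answer. The last step uses paste and sed to join the lines into the single comma-separated line the question asks for.

```shell
# Hypothetical sample data written to a temporary file (demo only).
sample=$(mktemp)
printf '%s\n' 'pig horse pig' 'cat pig cat' 'horse cat' > "$sample"

# grep reads the file once; sort and uniq only see the matched words.
counts=$(grep -Eo 'pig|horse|cat' "$sample" | sort | uniq -c | awk '{print $2": "$1}')
printf '%s\n' "$counts"
# cat: 3
# horse: 2
# pig: 3

# Join the per-word lines into one line, separated by ", ".
joined=$(printf '%s\n' "$counts" | paste -sd, - | sed 's/,/, /g')
printf '%s\n' "$joined"
# cat: 3, horse: 2, pig: 3

rm -f "$sample"
```

Note that the words come out in sorted order, not in the order they appear in the pattern, and a word with no matches is simply missing from the output. Also, -o matches substrings, so "pigeon" would count as a "pig"; adding -w to grep restricts matches to whole words if that matters for your data.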

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/28145077/
