perl - grep a variable and give informative output

I want to see how many times specific words are mentioned in a file, and on how many lines.

My dummy example looks like this:

cat words
blue
red 
green
yellow 

cat text
TEXTTEXTblueTEXTTEXTblue
TEXTTEXTgreenblueTEXTTEXT
TEXTTEXyeowTTEXTTEXTTEXT

This is what I am doing:

for i in $(cat words); do grep "$i" text | wc >> output; done

cat output
  2       2      51
  0       0       0
  1       1      26
  0       0       0

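But what I actually want to get is:
1. The word that was used as the variable;
2. The number of lines in which the word was found (in addition to the number of hits in the text).

The preferred output would look like this: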

blue    3   2
red     0   0 
green   1   1
yellow  0   0

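$1 - the variable that was grepped
$2 - how many times the variable was found in the text
$3 - on how many lines the variable was found

I hope someone can help me do this with grep, awk, or sed, since they are fast enough for a large data set, but a Perl one-liner would help me too.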

Edit

I tried this:

   for i in $(cat words); do grep "$i" text > out_${i}; done && wc out*

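It looked promising, but some of the words are more than 300 letters long, so I cannot create files named after the word.

Best answer

You can use grep's -o option, which prints only the matched parts of a matching line, with each match on a separate output line. Counting those output lines gives the number of hits, while grep -c gives the number of matching lines.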
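To make the difference between the two counts concrete, here is a small sketch run against the sample text from the question. The scratch directory and the variable names hits and lines are only illustrative. The tr step is there because some wc implementations pad their output with spaces.

```shell
# Recreate the sample text file from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
    'TEXTTEXTblueTEXTTEXTblue' \
    'TEXTTEXTgreenblueTEXTTEXT' \
    'TEXTTEXyeowTTEXTTEXTTEXT' > "$tmpdir/text"

# -o prints every match on its own line, so wc -l counts total occurrences.
hits=$(grep -o 'blue' "$tmpdir/text" | wc -l | tr -d ' ')

# -c counts matching lines, no matter how many matches each line holds.
lines=$(grep -c 'blue' "$tmpdir/text")

echo "blue: $hits hits on $lines lines"
```

The first line of the sample contains blue twice, which is why the two numbers differ.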

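while IFS= read -r line; do
    # -o prints one match per line, so wc -l gives the number of hits
    wordcount=$(grep -o "$line" text | wc -l)
    # -c counts the lines that contain at least one match
    linecount=$(grep -c "$line" text)
    echo "$line" "$wordcount" "$linecount"
done < words | column -t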

You can put everything on a single line to turn it into a one-liner.
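The loop starts two grep processes per word, which may become slow when the word list is long. As an alternative sketch (not part of the original answer), a single awk process can read the word list first and then count matches while reading the text once. It assumes the words are literal strings rather than regular expressions, and it counts non-overlapping occurrences. The function name count_words_awk and the scratch files are hypothetical.

```shell
count_words_awk() {
    # $1 = file with one word per line, $2 = text file to search
    awk '
        # First file: remember each non-empty word in input order.
        NR == FNR { if ($0 != "") words[++n] = $0; next }
        # Second file: count literal occurrences of every word in this line.
        {
            for (i = 1; i <= n; i++) {
                rest = $0; found = 0
                while ((pos = index(rest, words[i])) > 0) {
                    found++
                    rest = substr(rest, pos + length(words[i]))
                }
                hits[i] += found
                if (found > 0) lines[i]++
            }
        }
        # Print: word, total hits, number of lines with at least one hit.
        END {
            for (i = 1; i <= n; i++)
                printf "%s %d %d\n", words[i], hits[i], lines[i]
        }
    ' "$1" "$2"
}

# Sample data from the question, written to a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' blue red green yellow > "$tmpdir/words"
printf '%s\n' \
    'TEXTTEXTblueTEXTTEXTblue' \
    'TEXTTEXTgreenblueTEXTTEXT' \
    'TEXTTEXyeowTTEXTTEXTTEXT' > "$tmpdir/text"

count_words_awk "$tmpdir/words" "$tmpdir/text"
```

This keeps the whole word list in memory and still checks every word against every line, so whether it is actually faster than the grep loop depends on the data.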

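If column gives a "column too long" error for the loop above, you can use printf instead, provided you know the maximum number of characters. Use the line below in place of echo and remove the pipe to column: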

printf "%-20s %-2s %-2s\n" "$line" "$wordcount" "$linecount"

Replace 20 with your maximum word length, and adjust the other numbers if needed.
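If the maximum word length is not known in advance, it could be measured from the word list before the loop runs. The following is a minimal sketch under that assumption. The function name count_table is hypothetical, and the -F flag is an addition that treats each word as a literal string rather than a regular expression.

```shell
count_table() {
    # $1 = file with one word per line, $2 = text file to search
    words_file=$1
    text_file=$2

    # The longest line in the word list decides the width of the first column.
    width=$(awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$words_file")
    fmt="%-${width}s %s %s\n"

    while IFS= read -r word; do
        [ -n "$word" ] || continue   # skip empty lines
        # grep exits with status 1 when nothing matches; "|| true" keeps that harmless under set -e.
        wordcount=$(grep -o -F -- "$word" "$text_file" | wc -l | tr -d ' ') || true
        linecount=$(grep -c -F -- "$word" "$text_file") || true
        printf "$fmt" "$word" "$wordcount" "$linecount"
    done < "$words_file"
}

# Sample data from the question, written to a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' blue red green yellow > "$tmpdir/words"
printf '%s\n' \
    'TEXTTEXTblueTEXTTEXTblue' \
    'TEXTTEXTgreenblueTEXTTEXT' \
    'TEXTTEXyeowTTEXTTEXTTEXT' > "$tmpdir/text"

count_table "$tmpdir/words" "$tmpdir/text"
```

With the sample data the first column is padded to six characters, the length of yellow, so no call to column is needed.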

Regarding "perl - grep a variable and give informative output", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14536051/
