linux - How to grep within a grep

I have a large number of text files, each about 100MB.

I want to use grep to find the entries that contain "INDIANA JONES":

$ grep -ir 'INDIANA JONES' ./

Then I want to find the entries where the word PORTUGAL appears within 5,000 characters of the term INDIANA JONES. How can I do that?

# in pseudocode
grep -ir 'INDIANA JONES' ./ | grep 'PORTUGAL' within 5000 char

Best answer

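Use grep's -o flag to print only the match together with up to 5000 characters on either side of it, then search that output for the second string. The bounds are written as {0,5000} because a bare {5000} demands exactly 5000 characters on each side and would miss matches near the start or end of a line. Note that grep works line by line, so the window never extends past the current line. For example:

grep -ioE ".{0,5000}INDIANA JONES.{0,5000}" file.txt | grep "PORTUGAL"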
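
To see how the windowed match behaves, here is a minimal sketch on throwaway data with the window shrunk to 20 characters. The sample lines and the `window` variable are assumptions made up for illustration; they are not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, one record per line.
sample=$(mktemp)
printf '%s\n' \
  'In 1936 Indiana Jones flew to PORTUGAL.' \
  'Indiana Jones stayed home.' \
  'PORTUGAL is a long way from this padding text ............ Indiana Jones' \
  > "$sample"

window=20  # illustrative only; the question asks for 5000

# -i ignores case, -o prints only the matched window, -E enables {m,n} bounds.
# The second grep keeps only the windows that also contain PORTUGAL.
hits=$(grep -ioE ".{0,${window}}INDIANA JONES.{0,${window}}" "$sample" | grep 'PORTUGAL')

printf '%s\n' "$hits"   # prints: In 1936 Indiana Jones flew to PORTUGAL.
rm -f "$sample"
```

The third sample line contains both terms, but PORTUGAL sits more than 20 characters before the phrase, so it falls outside the window and is dropped. With a bound as large as 5000, grep may become noticeably slower or use more memory, so it is worth timing on one file first.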

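If you need the original lines rather than the extracted windows, add the -n flag to the first grep so that each window is prefixed with its line number in file.txt (adding it to the second grep would only number the lines of the piped output), then pipe the result of the second grep into:

cut -f1 -d: > line_numbers.txt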

You can then use awk to print those lines:

awk 'FNR==NR { a[$0]; next } FNR in a' line_numbers.txt file.txt
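
The awk idiom above reads the first file into an array keyed by line number, then prints each line of the second file whose line number is one of those keys. A minimal sketch with two made-up files (the names `nums` and `data` and their contents are assumptions for illustration):

```shell
#!/usr/bin/env bash
# Hypothetical inputs: a list of line numbers and a small data file.
nums=$(mktemp)
data=$(mktemp)
printf '%s\n' 2 4 > "$nums"
printf '%s\n' alpha beta gamma delta > "$data"

# FNR==NR is true only while the first file is being read:
# store each line number as an array key and skip to the next record.
# For the second file, print a line when its number (FNR) is a stored key.
picked=$(awk 'FNR==NR { a[$0]; next } FNR in a' "$nums" "$data")

printf '%s\n' "$picked"   # prints beta, then delta
rm -f "$nums" "$data"
```

Duplicate line numbers in the first file are harmless, because storing the same key twice leaves the array unchanged.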

To avoid the temporary file, use bash process substitution:

awk 'FNR==NR { a[$0]; next } FNR in a' <(grep -inoE ".{0,5000}INDIANA JONES.{0,5000}" file.txt | grep "PORTUGAL" | cut -f1 -d:) file.txt

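For multiple files, use find and a bash loop. Reading NUL-delimited names keeps the loop safe for file names that contain spaces:

find . -type f -print0 | while IFS= read -r -d '' i; do
    # Print the lines of "$i" where PORTUGAL is within 5000 characters of INDIANA JONES
    awk 'FNR==NR { a[$0]; next } FNR in a' <(grep -inoE ".{0,5000}INDIANA JONES.{0,5000}" "$i" | grep "PORTUGAL" | cut -f1 -d:) "$i"
done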
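
Because grep matches one line at a time, the commands above only find PORTUGAL when it is on the same line as INDIANA JONES. If the two terms may sit on different lines, one possible workaround (an assumption on my part, not something the original answer proposes) is to flatten newlines to spaces with tr before matching. The sketch below uses made-up sample text and a small illustrative `window`:

```shell
#!/usr/bin/env bash
# Hypothetical sample where the two terms sit on different lines.
sample=$(mktemp)
printf '%s\n' 'Indiana Jones left the museum.' 'He sailed for PORTUGAL.' > "$sample"

window=60  # illustrative only; the question asks for 5000

# Line-based search: no single line holds both terms, so the count is 0.
# "|| true" absorbs grep's non-zero exit status when nothing matches.
same_line=$(grep -ioE ".{0,${window}}INDIANA JONES.{0,${window}}" "$sample" | grep -c 'PORTUGAL' || true)

# Flatten newlines to spaces first, so the window can span line breaks.
flattened=$(tr '\n' ' ' < "$sample" | grep -ioE ".{0,${window}}INDIANA JONES.{0,${window}}" | grep -c 'PORTUGAL' || true)

printf 'same line: %s, flattened: %s\n' "$same_line" "$flattened"   # prints: same line: 0, flattened: 1
rm -f "$sample"
```

Flattening discards line numbers, so this variant can only report the matching windows, not the original lines, and turning a 100MB file into a single line may cost grep considerable time and memory.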

For "linux - How to grep within a grep", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19870863/
