regex - Delete the matching line and the previous line

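I need to use grep, awk, sed, or some other tool to delete, from a stream, every line containing "not a dynamic executable" together with the line before it. My current working solution is to run the whole stream through tr to strip the newlines, then use sed to replace the newline preceding my match with something else, then use tr to put the newlines back, and finally use grep -v. I am somewhat wary of the artifacts this approach can introduce, but I don't see another way at the moment: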

tr '\n' '|' | sed 's/|\tnot a dynamic executable/__MY_REMOVE/g' | tr '|' '\n'
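For reference, here is a minimal sketch of the full four-step pipeline described above, including the final grep -v that the command line leaves out. The sample data is a made-up stand-in for ldd output, and the function names sample and strip_with_tr are hypothetical. A literal tab is used instead of \t so the sed pattern does not depend on GNU escapes, and the approach still assumes that no input line contains a "|" character.

```shell
#!/bin/sh
# Hypothetical stand-in for `xargs ldd` output; all paths are made up.
tab=$(printf '\t')
sample() {
  printf '%s\n' \
    '/usr/bin/foo:' \
    "${tab}libc.so.6 => /lib/libc.so.6" \
    '/etc/hosts:' \
    "${tab}not a dynamic executable" \
    '/usr/bin/bar:' \
    "${tab}libm.so.6 => /lib/libm.so.6"
}

# Join all lines with '|', glue each marker line onto the line before it,
# split the lines again, then drop every glued line.
strip_with_tr() {
  tr '\n' '|' |
    sed "s/|${tab}not a dynamic executable/__MY_REMOVE/g" |
    tr '|' '\n' |
    grep -v '__MY_REMOVE'
}

sample | strip_with_tr
```

The /etc/hosts: line and its marker line are merged into one line by the sed step, which is why a single grep -v removes both.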

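Edit:

The input is a mixed list of files piped into xargs ldd. Basically, I want to ignore all output about non-library files, since it is irrelevant to what I do next. I don't want to use a lib*.so mask, because the names could conceivably differ.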

Best answer

This is simplest with pcregrep in multi-line mode:

pcregrep -vM '\n\tnot a dynamic executable' filename

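If pcregrep is not available to you, awk or sed can do the same thing by reading one line ahead and skipping the previous line whenever the marker line appears.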

With awk you can be boring (but sane):

awk '/^\tnot a dynamic executable/ { flag = 1; next } !flag && NR > 1 { print lastline; } { flag = 0; lastline = $0 } END { if(!flag) print }' filename

That is:

/^\tnot a dynamic executable/ {  # in lines that start with the marker
  flag = 1                       # set a flag
  next                           # and do nothing (do not print the last line)
}
!flag && NR > 1 {                # if the last line was not flagged and
                                 # is not the first line
  print lastline                 # print it
}
{                                # and if you got this far,
  flag = 0                       # unset the flag
  lastline = $0                  # and remember the line to be possibly
                                 # printed.
}
END {                            # in the end
  if(!flag) print                # print the last line if it was not flagged
}
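As a quick check of the awk program, the sketch below runs it on made-up data that imitates ldd output, with one marker in the middle and one on the very last line. The paths and the function names sample and drop_pairs_awk are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for `xargs ldd` output; all paths are made up.
sample() {
  printf '/usr/bin/foo:\n\tlibc.so.6 => /lib/libc.so.6\n'
  printf '/etc/hosts:\n\tnot a dynamic executable\n'
  printf '/usr/bin/bar:\n\tlibm.so.6 => /lib/libm.so.6\n'
  printf '/etc/passwd:\n\tnot a dynamic executable\n'
}

# The awk program from the answer, reading from standard input.
drop_pairs_awk() {
  awk '
    /^\tnot a dynamic executable/ { flag = 1; next }
    !flag && NR > 1 { print lastline }
    { flag = 0; lastline = $0 }
    END { if (!flag) print }
  '
}

sample | drop_pairs_awk
```

Only the foo and bar entries and their library lines should remain. The trailing marker is handled by the END block, which prints nothing while the flag is still set.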

But sed is fun:

sed ':a; $! { N; /\n\tnot a dynamic executable/ d; P; s/.*\n//; ba }' filename

Explanation:

:a                                  # jump label

$! {                                # unless we reached the end of the input:

  N                                 # fetch the next line, append it

  /\n\tnot a dynamic executable/ d  # if the result contains a newline followed
                                    # by "\tnot a dynamic executable", discard
                                    # the pattern space and start at the top
                                    # with the next line. This effectively
                                    # removes the matching line and the one
                                    # before it from the output.

                                    # Otherwise:
  P                                 # print the pattern space up to the newline
  s/.*\n//                          # remove the stuff we just printed from
                                    # the pattern space, so that only the
                                    # second line is in it

  ba                                # and go to a
}
                                    # and at the end, drop off here to print
                                    # the last line (unless it was discarded).
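The one-liner relies on conveniences such as \t in the pattern and a brace group written on a single line, which a non-GNU sed may not accept. The sketch below is the same loop split into -e fragments with a literal tab, as one possible portable spelling. The sample data and the function names sample and drop_pairs_sed are hypothetical.

```shell
#!/bin/sh
# Literal tab, for seds that do not understand \t in a pattern.
tab=$(printf '\t')

# Hypothetical stand-in for `xargs ldd` output; all paths are made up.
sample() {
  printf '/usr/bin/foo:\n\tlibc.so.6 => /lib/libc.so.6\n'
  printf '/etc/hosts:\n\tnot a dynamic executable\n'
  printf '/usr/bin/bar:\n\tlibm.so.6 => /lib/libm.so.6\n'
  printf '/etc/passwd:\n\tnot a dynamic executable\n'
}

# Same commands as the one-liner: N appends the next line, d drops the
# pair when the appended line is the marker, otherwise P prints the first
# line and s/.*\n// keeps only the second one before looping back to :a.
drop_pairs_sed() {
  sed -e ':a' \
      -e '$!{' \
      -e 'N' \
      -e "/\\n${tab}not a dynamic executable/d" \
      -e 'P' \
      -e 's/.*\n//' \
      -e 'ba' \
      -e '}'
}

sample | drop_pairs_sed
```

On this input, both marker lines and the file-name lines before them are removed, including the pair at the very end of the stream.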

Or, if the file is small enough to be held in memory in its entirety:

sed ':a $!{N;ba}; s/[^\n]*\n\tnot a dynamic executable[^\n]*\n//g' filename

where

:a $!{ N; ba }                                  # read the whole file into
                                                # the pattern space
s/[^\n]*\n\tnot a dynamic executable[^\n]*\n//g # and cut out the offending bit.
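A small sketch of how the whole-file variant behaves, assuming GNU sed. The entries a: and b: and the function name strip_in_memory are made up for illustration, and the label is written as ":a;" here. One thing worth noticing is that the pattern requires a newline after the marker line, so a marker on the very last line of the input is left in place.

```shell
#!/bin/sh
# Assumes GNU sed: \t and \n inside a bracket expression are GNU extensions.
strip_in_memory() {
  sed ':a; $!{N;ba}; s/[^\n]*\n\tnot a dynamic executable[^\n]*\n//g'
}

# Marker in the middle of the input: the pair is cut out.
printf 'a:\n\tnot a dynamic executable\nb:\n\tlibc.so.6\n' | strip_in_memory

# Marker on the final line: no newline follows it in the pattern space,
# so the substitution does not match and both lines are kept.
printf 'b:\n\tlibc.so.6\na:\n\tnot a dynamic executable\n' | strip_in_memory
```

If a trailing marker can occur in practice, the line-by-line awk or sed versions are the safer choice, since they handle the end of input explicitly.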

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/28566616/