regex - Grep only the second part after a space

I have a parser in a shell script.

This is the input file to be parsed (input.txt):

input.txt:
system.switch_cpus.commit.swp_count                 0                       # Number of s/w prefetches committed
  system.switch_cpus.commit.refs                2682887                       # Number of memory references committed
  system.switch_cpus.commit.loads               1779328                       # Number of loads committed                                                                                                                                                                                                                
  system.switch_cpus.commit.membars                   0                       # Number of memory barriers committed
  system.switch_cpus.commit.branches             921830                       # Number of branches committed
  system.switch_cpus.commit.vec_insts                 0                       # Number of committed Vector instructions.
  system.switch_cpus.commit.fp_insts                  0                       # Number of committed floating point instructions.
  system.switch_cpus.commit.int_insts          10000000                       # Number of committed integer instructions.

The script does the following:

 $ cpu1_name="system.switch_cpus"
 $ echo "$(grep "${cpu1_name}.commit.loads" ./input.txt |grep -Eo '[0-9]+')"
 correct expected output: 1779328

But in another file, the variable "cpu1_name" is changed to "system.switch_cpus_1". Running the same script now gives me 2 values:

New input file:
system.switch_cpus_1.commit.swp_count               0                       # Number of s/w prefetches committed
 system.switch_cpus_1.commit.refs              2682887                       # Number of memory references committed
 system.switch_cpus_1.commit.loads             1779328                       # Number of loads committed                                                                                                                                                                                                               
 system.switch_cpus_1.commit.membars                 0                       # Number of memory barriers committed
 system.switch_cpus_1.commit.branches           921830                       # Number of branches committed
 system.switch_cpus_1.commit.vec_insts               0                       # Number of committed Vector instructions.
 system.switch_cpus_1.commit.fp_insts                0                       # Number of committed floating point instructions.   


Modified Script line:
$ cpu1_name="system.switch_cpus_1"
$ echo "$(grep "${cpu1_name}.commit.loads" ./new_input.txt |grep -Eo '[0-9]+')"
1
1779328

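As you can see, the piped grep searches for any number and reports an extra "1" because of the changed variable name.

Is there a way to select only the second part of the numbers (i.e. only 1779328)? I know I can use awk '{print $2}', but that would mean changing a lot of lines in the script. So I was wondering whether there is a simpler trick for the existing script lines.

Thanks in advance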

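Best answer

Since _ is considered a word character, there is no word boundary between _ and 1, whereas the expected number has word boundaries on both sides.

So, all you need to do is use a pattern with word boundaries. You can use the -w option to match whole words, or choose between \b and \</\>, whichever your grep supports: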

grep -Ewo '[0-9]+'
grep -Eo '\b[0-9]+\b'
grep -Eo '\<[0-9]+\>'

See the online demo.

Note that you can also use sed to extract the second non-whitespace chunk from a line:
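The difference between the original pattern and the whole-word variant can be checked on a single line. The following is a minimal sketch, assuming a grep that supports -E, -o and -w (GNU grep does); the variable names line, plain and whole are only illustrative, and the sample line is copied from the question's new input file.

```shell
# One line copied from the question's new input file (hypothetical stand-in for the file)
line=' system.switch_cpus_1.commit.loads             1779328                       # Number of loads committed'

# Without word boundaries, the "1" in "switch_cpus_1" is matched as well
plain=$(printf '%s\n' "$line" | grep -Eo '[0-9]+')

# With -w, a match may not be adjacent to a word character (letter, digit or _),
# so the "1" that directly follows "_" is skipped
whole=$(printf '%s\n' "$line" | grep -Ewo '[0-9]+')

printf 'plain matches:\n%s\n' "$plain"
printf 'whole-word match: %s\n' "$whole"
```

In the question's script this would amount to adding w to the options of the second grep, leaving the rest of the line unchanged.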

sed -E 's/^\s*\S+\s+(\S+).*/\1/'

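See this demo.

Details

^ - start of the line
\s* - 0+ whitespace characters
\S+ - 1+ non-whitespace characters
\s+ - 1+ whitespace characters
(\S+) - 1+ non-whitespace characters (Group 1, which is what the \1 placeholder in the replacement pattern keeps)
.* - the rest of the line.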
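Note that \s and \S are GNU extensions, so they may not work in every sed. As a hedged sketch of the same idea, the pattern can be spelled with POSIX bracket expressions instead. This assumes a sed that accepts -E; the variable names line and value are only illustrative, and the sample line is copied from the question's new input file.

```shell
# One line copied from the question's new input file (hypothetical stand-in for the file)
line=' system.switch_cpus_1.commit.loads             1779328                       # Number of loads committed'

# Same structure as the answer's pattern, with [[:space:]] in place of \s
# and [^[:space:]] in place of \S; group 1 captures the second field
value=$(printf '%s\n' "$line" |
  sed -E 's/^[[:space:]]*[^[:space:]]+[[:space:]]+([^[:space:]]+).*/\1/')

printf '%s\n' "$value"
```

For this sample line the command prints 1779328, the second whitespace-separated field.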

Regarding regex - Grep only the second part after a space, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49188503/
