linux - Grep for multiple patterns in a file

I want to count the XML nodes in my XML file (with grep or some other way).

....
<countryCode>GBR</countryCode>
<countryCode>USA</countryCode>
<countryCode>CAN</countryCode>
...
<countryCode>CAN</countryCode>
<someNode>USA</someNode>
<countryCode>CAN</countryCode>
<someNode>Otherone</someNode>
<countryCode>GBR</countryCode>
...

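How can I get a count for each country, e.g. CAN = 3, USA = 1, GBR = 2? I would like to do this without passing in the country names, since there may be more countries.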

Update:

There are other nodes besides countryCode.

Best answer

My simple suggestion is to use sort and uniq -c:

$ echo '<countryCode>GBR</countryCode>
<countryCode>USA</countryCode>
<countryCode>CAN</countryCode>
<countryCode>CAN</countryCode>
<countryCode>CAN</countryCode>
<countryCode>GBR</countryCode>' | sort | uniq -c
      3 <countryCode>CAN</countryCode>
      2 <countryCode>GBR</countryCode>
      1 <countryCode>USA</countryCode>

In practice you would pipe in the output of grep rather than echo. A more robust solution is to use XPath. If your XML file looks like

<countries>
  <countryCode>GBR</countryCode>
  <countryCode>USA</countryCode>
  <countryCode>CAN</countryCode>
  <countryCode>CAN</countryCode>
  <countryCode>CAN</countryCode>
  <countryCode>GBR</countryCode>
</countries>

then you can use:

$ xpath -q -e '/countries/countryCode/text()'  countries.xml  | sort | uniq -c
      3 CAN
      2 GBR
      1 USA

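I say it is more robust because tools designed for parsing flat text are inherently fragile when applied to XML. Depending on the context of the original XML file, a different XPath query may work better. This one matches countryCode nodes anywhere in the document: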

$ xpath -q -e '//countryCode/text()'  countries.xml  | sort | uniq -c
      3 CAN
      2 GBR
      1 USA
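
For the plain-text route mentioned earlier in the answer (piping grep instead of echo), a minimal sketch might look like the following. The function name count_country_codes and the inline sample data are illustrative and not from the original answer. It assumes a grep that supports -o (GNU and BSD grep do), and that each countryCode element sits on a single line with no attributes.

```shell
#!/bin/sh
# count_country_codes is a hypothetical helper: it reads XML-like text on
# stdin and prints one "count value" line per distinct countryCode value.
count_country_codes() {
    # -o prints only the matched part, so nodes such as <someNode> are ignored.
    grep -o '<countryCode>[^<]*</countryCode>' |
        # Strip the opening and closing tags, leaving just the code.
        sed 's/<[^>]*>//g' |
        # uniq -c only merges adjacent duplicates, so sort first.
        sort |
        uniq -c
}

# Sample input modelled on the question's snippet (assumed layout).
count_country_codes <<'EOF'
<countries>
  <countryCode>GBR</countryCode>
  <countryCode>USA</countryCode>
  <countryCode>CAN</countryCode>
  <countryCode>CAN</countryCode>
  <someNode>USA</someNode>
  <countryCode>CAN</countryCode>
  <someNode>Otherone</someNode>
  <countryCode>GBR</countryCode>
</countries>
EOF
```

On this sample the counts come out as 3 for CAN, 2 for GBR and 1 for USA; the column padding of uniq -c differs between implementations. For a real file, the call would be count_country_codes < countries.xml. This stays a text-matching shortcut, so the XPath queries above remain the safer choice when the XML layout is not under your control.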

For "linux - Grep for multiple patterns in a file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/9587310/
