ruby - Regex in Bash returns different results than in Ruby

I am trying to extract match groups from a string. I used Rubular to come up with a pattern:

\[(.*?)\]

In Ruby, this appears to extract the expected groups from the following string:

1547156981784 : Served [ Code128 ] with [ this_is_a_test ] in [ 12ms ] size [ 385B ] using [ http://barcodeapi.org/index.html ] for [ 1.2.3.4 ] via [ 5.6.7.8 ]

1: Code128
2: this_is_a_test
3: 12ms
4: 385B
5: http://barcodeapi.org/index.html
6: 1.2.3.4
7: 5.6.7.8

The problem is that I am trying to implement this regex in a Bash script to parse a log file:

reg='\[(.*?)\]'
while IFS= read -r line; do
  if [[ $line =~ $reg ]]; then
    echo "${BASH_REMATCH[1]}"
  fi
done < "$log"

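But the result differs from Ruby/Rubular. In Bash, match group #1 contains everything between the first opening bracket and the last closing bracket; for the same log line, Bash returns only a single match: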

1: Code128 ] with [ this_is_a_test ] in [ 12ms ] size [ 385B ] using [ http://barcodeapi.org/index.html ] for [ 1.2.3.4 ] via [ 5.6.7.8

The question is:

Why do the two engines give different results, and how can I separate the groups correctly in Bash?

Best answer

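A few issues:

There is no global match in Bash;
You need to loop over multiple matches by hand in Bash and manage the string index yourself;
ERE, the regex flavor Bash uses, has no non-greedy quantifiers, so .*? does not work the way it does in Ruby.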
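To make the third point concrete, here is a minimal sketch (the sample line and the variable names are made up for illustration) comparing a greedy .* with a negated bracket expression, which is the usual ERE stand-in for a non-greedy match:

```bash
#!/usr/bin/env bash
# Illustrative sample line, shortened from the log line in the question.
line='Served [ Code128 ] with [ this_is_a_test ]'

greedy='\[(.*)\]'       # greedy: runs from the first [ to the last ]
bounded='\[([^]]*)\]'   # negated bracket expression: stops at the first ]

[[ $line =~ $greedy ]]  && greedy_match=${BASH_REMATCH[1]}
[[ $line =~ $bounded ]] && bounded_match=${BASH_REMATCH[1]}

printf 'greedy:  <%s>\n' "$greedy_match"
printf 'bounded: <%s>\n' "$bounded_match"
```

The greedy pattern captures everything up to the last closing bracket, while the bounded one stops at the first. Neither gives you more than one match per call, which is why the loop is still needed.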

You can use this as a starting point:

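while IFS= read -r line; do
    # Each pass matches the text up to and including the next [ ... ] pair.
    while [[ $line =~ ([^\[]*)\[([^\]]*)\] ]]; do
        i=${#BASH_REMATCH}          # length of the whole match
        line=${line:i}              # drop the part already consumed
        echo "${BASH_REMATCH[2]}"   # text between the brackets
    done
done < file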

Prints:

 Code128 
 this_is_a_test 
 12ms 
 385B 
 http://barcodeapi.org/index.html 
 1.2.3.4 
 5.6.7.8
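Note that each printed value still carries the spaces that sit inside the brackets. If you want them trimmed and collected in an array, something along these lines may work. This is only a sketch: extract_fields and the global array fields are hypothetical names, not part of the original answer.

```bash
#!/usr/bin/env bash
# Hypothetical helper: collect every [ ... ] field of one line into the
# global array "fields", with surrounding whitespace trimmed.
extract_fields() {
    local rest=$1 m
    local re='[^[]*\[([^]]*)\]'   # skip to the next [, capture up to the first ]
    fields=()
    while [[ $rest =~ $re ]]; do
        m=${BASH_REMATCH[1]}
        m=${m#"${m%%[![:space:]]*}"}       # trim leading whitespace
        m=${m%"${m##*[![:space:]]}"}       # trim trailing whitespace
        fields+=("$m")
        rest=${rest:${#BASH_REMATCH[0]}}   # drop everything consumed so far
    done
}

extract_fields 'Served [ Code128 ] with [ this_is_a_test ] in [ 12ms ]'
printf '%s\n' "${fields[@]}"
```

Keeping the regex in a variable sidesteps the quoting rules for special characters inside [[ ... =~ ... ]].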

You will have far fewer headaches if you just use Perl/GNU grep/Ruby/etc. to create the list of matches and then loop over that list in Bash:

while IFS= read -r m; do echo "Match: $m"; done < <(ggrep -oP '(?<=\[)(.*?)(?=\])' file) # GNU grep is ggrep here

If your code must be POSIX, use awk:

$ awk -v RS=[ -v FS=] 'NR>1{print $1}' file
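The awk one-liner works because each [ starts a new record and each ] ends the first field of that record; NR>1 skips the text before the first bracket. A small sketch of the same idea on an inline sample (the sample line, the result variable, and the added gsub that trims spaces are my own additions, not part of the answer):

```bash
#!/usr/bin/env bash
# Illustrative sample line; in practice awk would read the log file instead.
sample='Served [ Code128 ] with [ 12ms ]'

# RS=[ splits the input into records at every [,
# FS=] makes $1 the text before the first ] in each record.
result=$(printf '%s\n' "$sample" |
    awk -v 'RS=[' -v 'FS=]' 'NR > 1 { gsub(/^ +| +$/, "", $1); print $1 }')

printf '%s\n' "$result"
```

Quoting RS=[ and FS=] keeps the shell from ever treating the brackets as glob characters.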

Source: a similar question, "Regex in Bash returns different results than in Ruby", on Stack Overflow: https://stackoverflow.com/questions/54189421/