regex - Does order matter inside [] in extended regular expressions?

I am trying to understand the [] syntax in grep with extended regular expressions.

The following two patterns are equivalent:

$ echo "foo_bar" | grep -E "[a-z_]+$"     
foo_bar
$ echo "foo_bar" | grep -E "[_a-z]+$" 
foo_bar

However, these two are not:

$ echo "foobar[]" | grep -E "[a-z_\[\]]+$" 
foobar[]
$ echo "foobar[]" | grep -E "[a-z\[\]_]+$"

Why is that? Is this documented anywhere? I could not find anything about it in man grep.

Best answer

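Be careful when combining double quotes " and backslashes \, because Bash processes some backslash sequences inside double quotes before grep ever sees the pattern. Here \[ and \] happen to reach grep unchanged (inside double quotes a backslash is only special before $, `, ", \ and newline), but single quotes avoid the question entirely. The real issue lies elsewhere, and for the rest of this answer I assume you used single quotes.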
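
One way to see what the shell actually hands to grep is to print the argument instead of running grep. This is a minimal sketch, assuming Bash or another POSIX shell; the variable names dq and sq are only illustrative.

```shell
# Hypothetical check: store the pattern under each quoting style and compare.
dq="[a-z_\[\]]+$"   # double quotes: \[ and \] are kept as written
sq='[a-z_\[\]]+$'   # single quotes: nothing is interpreted at all
printf 'double: %s\nsingle: %s\n' "$dq" "$sq"
```

Both lines should show the same pattern, backslashes included, so the difference between the two commands in the question has to come from how grep parses the brackets.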

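In the first case, you have the bracket expression [a-z_\[\], which matches the characters a-z, _, \ and [. The trailing \] does not add ] to the set: inside brackets a backslash is an ordinary character, so this is a second \ followed by the closing bracket of the bracket expression. What is left over, ]+$, then requires one or more literal ] at the end of the line, which foobar[] provides. In the second case the same kind of set, [a-z\[\], is followed by _]+$, and foobar[] does not end in _], so nothing is printed. Notice how: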

$ echo "foobar[]" | grep -E '[a-z\[\]+\]+$'
foobar[]
$ echo '\' | grep -E '[\]$'
\
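
To make the parsing visible, the bracket expression can be run on its own against a sample containing every character in question. This is a sketch that assumes a grep with the -o option (GNU and BSD grep have it; it is not part of POSIX); the variable name matched is illustrative.

```shell
# Print each character of the sample that [a-z_\[\] accepts, one per line.
# LC_ALL=C keeps the a-z range limited to ASCII lowercase letters.
matched=$(printf '%s\n' 'a_\[]' | LC_ALL=C grep -oE '[a-z_\[\]')
printf '%s\n' "$matched"
```

The sample's final ] should be the only character missing from the output, since the set contains a-z, _, \ and [ but not ].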

If you want to include ] in the set, you have to list it first; that is, []] matches a single ].

$ echo "]" | grep -E '[]]$'
]
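
Applied to the pattern from the question, this might look like the sketch below: with ] moved to the front and the backslashes dropped, the remaining members can be listed in either order. The variable names first and second are illustrative.

```shell
# Same set written two ways: ] comes first, the other members in any order.
first=$(echo 'foobar[]' | grep -E '[]a-z_[]+$')
second=$(echo 'foobar[]' | grep -E '[][_a-z]+$')
printf '%s\n%s\n' "$first" "$second"
```

Both commands should print foobar[], unlike the second command in the question.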

For reference, see man grep:

To include a literal ] place it first in the list. Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.
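
The three placement rules can be combined in a single bracket expression. The following is a small sketch, not taken from the man page, that builds a set containing exactly ], ^ and - and filters a few one-character lines through it; the variable name hits is illustrative.

```shell
# ] is first, ^ is not first, - is last, so all three are literal members.
hits=$(printf '%s\n' ']' '^' '-' 'x' | grep -E '^[]^-]$')
printf '%s\n' "$hits"
```

Only the line containing x should be filtered out.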

and https://www.regular-expressions.info/charclass.html:

In most regex flavors, the only special characters or metacharacters inside a character class are the closing bracket ], the backslash \, the caret ^, and the hyphen -. The usual metacharacters are normal characters inside a character class, and do not need to be escaped by a backslash. To search for a star or plus, use [+*]. Your regex will work fine if you escape the regular metacharacters inside a character class, but doing so significantly reduces readability.

A few more test cases to check [\s] (which is the same as [s\] and not the same as [[:space:]]):

$ echo 'a ' | grep -E 'a[\s]$'
$ echo 's' | grep -E '[\s]$'
s
$ echo '\' | grep -E '[\s]$'
\
$ echo 'a ' | grep -E 'a[[:space:]]$'
a

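So the takeaway: when listing the characters of a bracket expression, order does not matter, except when it does. The exceptions are ], ^ and -, whose position decides whether they are literal.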

A similar question on "regex - Does order matter inside [] in extended regular expressions?" can be found on Stack Overflow: https://stackoverflow.com/questions/52938561/