regex - How to extract regular expression comments

I have a regular expression like this:

(?<!(\w/))$#Cannot end with a word and slash

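I want to extract the comment from the end. The example above doesn't show it, but a regular expression may itself match a literal hash character, for instance: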

\##value must be a hash

What regular expression would extract the comment, and stay safe when the pattern it is applied to contains # characters that are not comments?

Best answer

Here is a .Net-flavored regex that partially parses .Net-flavored patterns. It should get pretty close:

\A
(?>
    \\.         # Match an escaped character
    |           # OR
    \[\^?       # a character class
        (?:\\.|[^\]])*    # which may also contain escaped characters
    \]
    |           # OR
    \(\?(?# inline comment!)\#      
        (?<Comment>[^)]*)
    \)
    |           # OR
    \#(?<Comment>.*$)   # a common comment!
    |           # OR
    [^\[\\#]    # match any other character - not #, [ or \
)*
\z

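Luckily, in .Net every capturing group remembers all of its captures, not just the last one, so we can find all captures of the Comment group in a single parse. The regex more or less parses a regular expression, but far from completely: it parses just enough to find the comments.
Here is how you use the result: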

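// Regex.Match(input, pattern, options): both arguments are `pattern` here,
// so the parser scans its own text. Pass a different first argument to
// scan another regex.
Match parsed = Regex.Match(pattern, pattern,
                           RegexOptions.IgnorePatternWhitespace |
                           RegexOptions.Multiline);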
if (parsed.Success)
{
    foreach (Capture capture in parsed.Groups["Comment"].Captures)
    {
        Console.WriteLine(capture.Value);
    }
}

Working example: http://ideone.com/YP3yt
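To try the parser on the two patterns from the question, a self-contained sketch along these lines may help. The class name `RegexComments` and the method `Extract` are illustrative names, not part of the original answer, and the pattern is a copy of the one above with slightly reworded comments.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexComments
{
    // Copy of the partial parser above, stored as a verbatim string.
    private const string CommentParser = @"
\A
(?>
    \\.                     # an escaped character
    |
    \[\^?                   # a character class
        (?:\\.|[^\]])*      # which may also contain escaped characters
    \]
    |
    \(\?(?# inline comment!)\#
        (?<Comment>[^)]*)
    \)
    |
    \#(?<Comment>.*$)       # an end-of-line comment
    |
    [^\[\\#]                # any other character - not #, [ or \
)*
\z";

    // Hypothetical helper: returns every comment found in `pattern`,
    // assuming the whole pattern is written in IgnorePatternWhitespace mode.
    public static List<string> Extract(string pattern)
    {
        var comments = new List<string>();
        Match parsed = Regex.Match(pattern, CommentParser,
            RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);
        if (parsed.Success)
        {
            // Both (?<Comment>...) groups share one capture list.
            foreach (Capture capture in parsed.Groups["Comment"].Captures)
            {
                comments.Add(capture.Value.Trim());
            }
        }
        return comments;
    }

    public static void Main()
    {
        // prints: Cannot end with a word and slash
        foreach (string c in Extract(@"(?<!(\w/))$#Cannot end with a word and slash"))
            Console.WriteLine(c);

        // prints: value must be a hash (the escaped \# is skipped)
        foreach (string c in Extract(@"\##value must be a hash"))
            Console.WriteLine(c);
    }
}
```

`Extract` returns an empty list when the parse fails, for example on an unclosed character class, so callers may want to treat an empty result as "no comments found or pattern not understood".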

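One last word of warning - this regex assumes the whole pattern is in IgnorePatternWhitespace mode. When that flag is not set, every # is matched literally. Keep in mind that the flag can change several times within a single pattern. Take (?-x)#(?x)#comment as an example: regardless of IgnorePatternWhitespace, the first # is matched literally, (?x) turns the IgnorePatternWhitespace flag back on, and the second # is ignored as the start of a comment.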
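The flag-switching case can be checked directly against the .Net engine. The sketch below is only an illustration; `InlineFlagDemo` and `FirstMatch` are made-up names. If the inline options behave as described, the pattern should match a single literal # whether or not IgnorePatternWhitespace is passed, because the second # only starts a comment.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineFlagDemo
{
    // The example pattern from the warning above.
    public const string Pattern = "(?-x)#(?x)#comment";

    // Returns the text matched in `input`, or null when nothing matches.
    public static string FirstMatch(string input, RegexOptions options)
    {
        Match m = Regex.Match(input, Pattern, options);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // The inline options override the flag passed in, so both calls
        // are expected to report the same single-character match.
        Console.WriteLine(FirstMatch("a#b", RegexOptions.None));
        Console.WriteLine(FirstMatch("a#b", RegexOptions.IgnorePatternWhitespace));

        // No # in the input, so no match is expected here.
        Console.WriteLine(FirstMatch("comment", RegexOptions.None) == null);
    }
}
```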

If you want a robust solution, you can use a regex-language parser.
You can probably adapt the .Net source code and extract its parser.
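Short of a real parser, a small hand-written scanner can at least track the x flag while it walks the pattern. The following is a simplified sketch, not the .Net parser: `CommentScanner` and `Scan` are hypothetical names, option groups such as (?x) or (?i-x) are assumed to apply to the rest of the pattern, and scoped forms such as (?x:...) or options being restored at the end of a group are not handled.

```csharp
using System;
using System.Collections.Generic;

public static class CommentScanner
{
    // Walks the pattern once, tracking whether x-mode is currently on.
    // `ignoreWhitespace` is the initial state (RegexOptions.IgnorePatternWhitespace).
    public static List<string> Scan(string pattern, bool ignoreWhitespace)
    {
        var comments = new List<string>();
        bool xMode = ignoreWhitespace;
        int i = 0;
        while (i < pattern.Length)
        {
            char c = pattern[i];
            if (c == '\\')
            {
                i += 2; // skip the backslash and the escaped character
            }
            else if (c == '[')
            {
                i++;
                if (i < pattern.Length && pattern[i] == '^') i++;
                // Skip to the closing bracket, honoring escapes.
                while (i < pattern.Length && pattern[i] != ']')
                    i += pattern[i] == '\\' ? 2 : 1;
                i++; // skip ']'
            }
            else if (string.CompareOrdinal(pattern, i, "(?#", 0, 3) == 0)
            {
                // Inline comment: recognized regardless of x-mode.
                int close = pattern.IndexOf(')', i + 3);
                if (close < 0) close = pattern.Length;
                comments.Add(pattern.Substring(i + 3, close - i - 3));
                i = close + 1;
            }
            else if (c == '(' && i + 1 < pattern.Length && pattern[i + 1] == '?')
            {
                // Possible option group such as (?x), (?-x) or (?i-x).
                int j = i + 2;
                bool turningOn = true;
                bool? newX = null;
                while (j < pattern.Length && (char.IsLetter(pattern[j]) || pattern[j] == '-'))
                {
                    if (pattern[j] == '-') turningOn = false;
                    else if (pattern[j] == 'x') newX = turningOn;
                    j++;
                }
                if (j > i + 2 && j < pattern.Length && pattern[j] == ')')
                {
                    if (newX.HasValue) xMode = newX.Value;
                    i = j + 1;
                }
                else
                {
                    i++; // some other group construct; treat '(' as ordinary
                }
            }
            else if (c == '#' && xMode)
            {
                // End-of-line comment: runs to the next newline or end of pattern.
                int end = pattern.IndexOf('\n', i);
                if (end < 0) end = pattern.Length;
                comments.Add(pattern.Substring(i + 1, end - i - 1));
                i = end;
            }
            else
            {
                i++;
            }
        }
        return comments;
    }

    public static void Main()
    {
        // Only the second # starts a comment; prints: comment
        foreach (string c in Scan("(?-x)#(?x)#comment", true))
            Console.WriteLine(c);
    }
}
```

Tracking the flag as a single boolean keeps the sketch short. Handling group-scoped options would need a stack of saved flag values pushed at each opening parenthesis and popped at its close.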

Regarding regex - how to extract regular expression comments, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/5073826/
