python - An incorrect Word() string gives a misleading error location

Tags: python pyparsing

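In my very deep PyParsing grammar (132 keywords), I ran into something strange. It may be my use of the logic. But then again, it may not be.

ISC Bind9 configuration files have clauses (somewhat like INI sections):

  • One clause is mandatory (options)
  • The other clauses are ZeroOrMore()

Any attempt to add parser complexity to the mandatory options clause breaks the logic above.

I had to strip out the unaffected parser logic until it started working, then shake the code back and forth until I found the exact breakage, which was caused by introducing this pyparsing code: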

print("Using 'example1' as a Word() to inside 'options{ };':")
clauses_mandatory_complex = (
        Keyword('options')
        + Literal('{')
        + Word('[a-zA-Z0-9]')
        + Literal(';')
        + Literal('}')
        + Literal(';')
)

As a standalone ParserElement, this clauses_mandatory_complex works fine.

That is, until I tried to introduce the clause logic:

    # Exactly one parse_element ('options' clause)
    # and any number of other clauses
    clauses_all_and = (
        clauses_mandatory_complex
        & ZeroOrMore(clauses_zero_or_more)
    )

The clause logic then starts to fail.

If I take out the Word(), like this:

print("Using 'example1' as a Literal() to inside 'options{ };':")
clauses_mandatory_simple = (
        Keyword('options')
        + Literal('{')
        + Literal('example1')
        + Literal(';')
        + Literal('}')
        + Literal(';')
)

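my clause logic starts working again as expected.

This seemed too strange to me, so I am posting it here.

Below is a working standalone test program that demonstrates the difference described above: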

#!/usr/bin/env python3
from pyparsing import ZeroOrMore, Word, Keyword, Literal
from pprint import PrettyPrinter

pp = PrettyPrinter(width=81, indent=4)

clauses_zero_or_more = (
        (Keyword('acl') + ';')
        | (Keyword('server') + ';')
        | (Keyword('view') + ';')
        | (Keyword('zone') + ';')
    )

def test_me(parse_element, test_data, fail_assert):
    # Exactly one parse_element ('options' clause)
    # and any number of other clauses
    clauses_all_and = (
        parse_element
        & ZeroOrMore(clauses_zero_or_more)
    )
    result = clauses_all_and.runTests(test_data, parseAll=True, printResults=True,
                                      failureTests=fail_assert)
    pp.pprint(result)
    return result

def print_all_results(pass_result, fail_result):
    print("Purposely passed test: {}. ".format(pass_result[0]))
    print("Purposely failed test: {}. ".format(fail_result[0]))
    print('\n')

passing_test_data = """
options { example1; };
acl; options { example1; };
options { example1; }; acl;
options { example1; }; server;
server; options { example1; };
acl; options { example1; }; server;
acl; server; options { example1; };
options { example1; }; acl; server;
options { example1; }; server; acl;
server; acl; options { example1; };
server; options { example1; }; acl;
"""
failing_test_data = """
acl;
acl; acl;
server; acl;
server;
acl; server;
options { example1; }; options { example1; };
"""


print("Using 'example1' as a Literal() to inside 'options{ };':")
clauses_mandatory_simple = (
        Keyword('options')
        + Literal('{')
        + Literal('example1')
        + Literal(';')
        + Literal('}')
        + Literal(';')
)
pass_result = test_me(clauses_mandatory_simple, passing_test_data, False)
fail_result = test_me(clauses_mandatory_simple, failing_test_data, True)
print_all_results(pass_result, fail_result)

# Attempting to introduce more qualifiers into 'options' fails
print("Using 'example1' as a Word() to inside 'options{ };':")
clauses_mandatory_complex = (
        Keyword('options')
        + Literal('{')
        + Word('[a-zA-Z0-9]')
        + Literal(';')
        + Literal('}')
        + Literal(';')
)
pass_result = test_me(clauses_mandatory_complex, passing_test_data, False)
fail_result = test_me(clauses_mandatory_complex, failing_test_data, True)
print_all_results(pass_result, fail_result)

The output of the test run is as follows:

/work/python/parsing/isc_config2/how-bad.py
Using 'example1' as a Literal() to inside 'options{ };':

options { example1; };
['options', '{', 'example1', ';', '}', ';']

acl; options { example1; };
['acl', ';', 'options', '{', 'example1', ';', '}', ';']

options { example1; }; acl;
['options', '{', 'example1', ';', '}', ';', 'acl', ';']

options { example1; }; server;
['options', '{', 'example1', ';', '}', ';', 'server', ';']

server; options { example1; };
['server', ';', 'options', '{', 'example1', ';', '}', ';']

acl; options { example1; }; server;
['acl', ';', 'options', '{', 'example1', ';', '}', ';', 'server', ';']

acl; server; options { example1; };
['acl', ';', 'server', ';', 'options', '{', 'example1', ';', '}', ';']

options { example1; }; acl; server;
['options', '{', 'example1', ';', '}', ';', 'acl', ';', 'server', ';']

options { example1; }; server; acl;
['options', '{', 'example1', ';', '}', ';', 'server', ';', 'acl', ';']

server; acl; options { example1; };
['server', ';', 'acl', ';', 'options', '{', 'example1', ';', '}', ';']

server; options { example1; }; acl;
['server', ';', 'options', '{', 'example1', ';', '}', ';', 'acl', ';']
(   True,
    [   (   'options { example1; };',
            (['options', '{', 'example1', ';', '}', ';'], {})),
        (   'acl; options { example1; };',
            (['acl', ';', 'options', '{', 'example1', ';', '}', ';'], {})),
        (   'options { example1; }; acl;',
            (['options', '{', 'example1', ';', '}', ';', 'acl', ';'], {})),
        (   'options { example1; }; server;',
            (['options', '{', 'example1', ';', '}', ';', 'server', ';'], {})),
        (   'server; options { example1; };',
            (['server', ';', 'options', '{', 'example1', ';', '}', ';'], {})),
        (   'acl; options { example1; }; server;',
            (['acl', ';', 'options', '{', 'example1', ';', '}', ';', 'server', ';'], {})),
        (   'acl; server; options { example1; };',
            (['acl', ';', 'server', ';', 'options', '{', 'example1', ';', '}', ';'], {})),
        (   'options { example1; }; acl; server;',
            (['options', '{', 'example1', ';', '}', ';', 'acl', ';', 'server', ';'], {})),
        (   'options { example1; }; server; acl;',
            (['options', '{', 'example1', ';', '}', ';', 'server', ';', 'acl', ';'], {})),
        (   'server; acl; options { example1; };',
            (['server', ';', 'acl', ';', 'options', '{', 'example1', ';', '}', ';'], {})),
        (   'server; options { example1; }; acl;',
            (['server', ';', 'options', '{', 'example1', ';', '}', ';', 'acl', ';'], {}))])

acl;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

acl; acl;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

server; acl;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)

server;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)

acl; server;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

options { example1; }; options { example1; };
                       ^
FAIL: Expected end of text, found 'o'  (at char 23), (line:1, col:24)
(   True,
    [   (   'acl;',
            Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'acl; acl;',
            Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'server; acl;',
            Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)),
        (   'server;',
            Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)),
        (   'acl; server;',
            Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'options { example1; }; options { example1; };',
            Expected end of text, found 'o'  (at char 23), (line:1, col:24))])
Purposely passed test: True. 
Purposely failed test: True. 


Using 'example1' as a Word() to inside 'options{ };':
/usr/local/lib/python3.7/site-packages/pyparsing.py:3161: FutureWarning: Possible nested set at position 1
  self.re = re.compile(self.reString)

options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)

acl; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

options { example1; }; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)

options { example1; }; server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)

server; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)

acl; options { example1; }; server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

acl; server; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

options { example1; }; acl; server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)

options { example1; }; server; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)

server; acl; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)

server; options { example1; }; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)
(   False,
    [   (   'options { example1; };',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)),
        (   'acl; options { example1; };',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'options { example1; }; acl;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)),
        (   'options { example1; }; server;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)),
        (   'server; options { example1; };',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)),
        (   'acl; options { example1; }; server;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'acl; server; options { example1; };',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'options { example1; }; acl; server;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)),
        (   'options { example1; }; server; acl;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)),
        (   'server; acl; options { example1; };',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)),
        (   'server; options { example1; }; acl;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1))])

acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

acl; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

server; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)

server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)

acl; server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)

options { example1; }; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1)
(   True,
    [   (   'acl;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'acl; acl;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'server; acl;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)),
        (   'server;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's'  (at char 0), (line:1, col:1)),
        (   'acl; server;',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a'  (at char 0), (line:1, col:1)),
        (   'options { example1; }; options { example1; };',
            Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o'  (at char 0), (line:1, col:1))])
Purposely passed test: False. 
Purposely failed test: True. 

Edit: I found the error here:

Word('[a-zA-Z0-9]')

It should be:

Word(srange('[a-zA-Z0-9]'))
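To see why the original expression could never match example1, it may help to compare the two forms side by side. This is only a minimal sketch, assuming pyparsing is installed; the helper matches() and the variable names are made up for this illustration. Word() treats its argument as a plain string of allowed characters, whereas srange() expands regex-style range notation into such a string.

```python
from pyparsing import ParseException, Word, srange

# Word() takes a string of allowed characters, not a regex character class,
# so this only allows the characters that literally appear in the string.
literal_chars = Word('[a-zA-Z0-9]')

# srange() expands the range notation into the full string of characters.
expanded_chars = Word(srange('[a-zA-Z0-9]'))


def matches(expr, text):
    """Hypothetical helper: True if expr matches the whole of text."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


print(srange('[a-e]'))                       # the expanded character string
print(matches(literal_chars, 'example1'))    # 'e' is not an allowed character
print(matches(expanded_chars, 'example1'))   # every character is in the range
print(matches(literal_chars, 'az09'))        # these characters appear literally
```

The unexpanded form still accepts input such as az09, which is part of what makes the mistake easy to overlook.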

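Is there a way to improve the placement of the caret "^" for this error, so that it points at the test data "example1" rather than at the keyword? That would have saved a lot of time.

Best Answer

The basic answer to this kind of problem is usually to replace one or a few '+' operators with '-' operators. '-' tells pyparsing to disable backtracking if an error is found in the subsequent matches.

For example, if your grammar has a keyword that is not used anywhere else, then you can reasonably expect any parse error after that keyword to be a true error, not just a mismatched alternative. Following this keyword with '-' is a good way to get the parser to indicate the specific error location, instead of merely flagging that none of a higher-level set of alternatives matched.

You do have to be careful with '-', and not simply replace every '+' with '-', since that defeats all backtracking and may prevent your parser from matching legitimate alternative expressions.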
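The difference between '+' and '-' can be sketched with a toy grammar. Everything here (the set statement, the helper first_error(), the sample input) is hypothetical and unrelated to the Bind9 grammar. The sketch wraps a statement in ZeroOrMore() so that an ordinary mismatch is silently backtracked over, and then compares where the error is reported.

```python
from pyparsing import (Keyword, ParseBaseException, ParseSyntaxException,
                       Word, ZeroOrMore, alphas, nums)

name = Word(alphas)
number = Word(nums)

# With '+', a failure after 'set' is an ordinary, backtrackable mismatch.
stmt_backtracking = Keyword('set') + name + '=' + number + ';'
# With '-', anything that fails after 'set' is reported as a hard error.
stmt_committed = Keyword('set') - name + '=' + number + ';'


def first_error(stmt, text):
    """Hypothetical helper: return the exception raised while parsing text."""
    try:
        ZeroOrMore(stmt).parseString(text, parseAll=True)
    except ParseBaseException as err:
        return err
    return None


bad_input = 'set x = oops;'
err_plus = first_error(stmt_backtracking, bad_input)
err_minus = first_error(stmt_committed, bad_input)

# Print the exception type and the character offset each variant reports.
print(type(err_plus).__name__, err_plus.loc)
print(type(err_minus).__name__, err_minus.loc)
```

In the '+' variant, ZeroOrMore() simply matches zero statements and the failure surfaces later as unconsumed input at the start of the string. In the '-' variant, the ParseSyntaxException is not swallowed by ZeroOrMore() and carries the offset of the offending token.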

So I was going to post that the following would improve your error messages:

clauses_mandatory_complex = (
        Keyword('options')
        - Literal('{')
        + Word('[a-zA-Z0-9]')
        + Literal(';')
        + Literal('}')
        + Literal(';')
)

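But when I tried it, I did not really get better results. The confounding issue in this case is your use of '&' to get an unordered Each match, which, while perfectly legitimate in your parser, muddles the exception handling (and may have uncovered a bug in pyparsing). If you replace '&' with '+' in your clauses_all_and expression, you will see the '-' operator at work here: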

options { example1; };
          ^(FATAL)
FAIL: Expected W:([a-z...), found 'e'  (at char 10), (line:1, col:11)

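In fact, this points to a general debugging strategy with pyparsing: if a complex expression does not give a helpful exception message, try its sub-expressions on their own.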
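Applied to the question, that strategy might look like the following sketch. It assumes pyparsing is installed and reuses the options clause from the question, faulty Word() included; the names options_clause, sample and error are chosen for this illustration. The clause is parsed on its own, outside the '&' (Each) wrapper, and the caret is printed from the column stored on the exception.

```python
from pyparsing import Keyword, Literal, ParseException, Word

# The 'options' clause from the question, still containing the faulty Word().
options_clause = (
    Keyword('options')
    + Literal('{')
    + Word('[a-zA-Z0-9]')
    + Literal(';')
    + Literal('}')
    + Literal(';')
)

sample = 'options { example1; };'
error = None
try:
    # Parse with the sub-expression alone, outside of the '&' (Each) wrapper.
    options_clause.parseString(sample, parseAll=True)
except ParseException as err:
    error = err

if error is not None:
    print(sample)
    print(' ' * (error.col - 1) + '^')   # caret under the reported column
    print(error)
```

Run this way, the exception location falls on example1 instead of on the start of the line, which narrows the problem down to the Word() expression.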

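Pyparsing does a lot of backtracking and retrying when working with grammars that contain MatchFirst or Or expressions (the '|' and '^' operators), and even more so with Each (the '&' operator). In your case, when I used the '-' operator, a non-backtracking exception was raised, but Each demoted it to a backtracking exception so that it could keep trying other combinations. I will look into this further to see if there is a good way to avoid this demotion.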

Source: Stack Overflow, https://stackoverflow.com/questions/57342937/
