java - Logging ANTLR v3 parser activity

Tags: java antlr

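I'd like to be able to log the parsing process of a grammar generated by ANTLR v3 (3.0.1, to be precise). I tried using DebugTreeParser, but it did nothing; its methods appear never to be called.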

Ideally, I'd like to output something like the following, i.e. a trace of the rules that were attempted or executed:

 parsing: program (token: Foo)
 parsing: statements (token: Foo)
 parsing: statement (token: Foo)
 parsing: block (token: Foo)
 parsed: block -> false (at 0)
 parsing: method call (token: Foo)
 parsing: variable (token: Foo)
 parsed: variable -> true (at 1)
 ...

Here is my parsing code:

        CharStream cs = new ANTLRReaderStream(script);
        MyLexer lex = new MyLexer(cs);
        CommonTokenStream tokens = new CommonTokenStream(lex);
        MyParser parser = new MyParser(tokens);
        return new Program(makeProgram((Tree) parser.program().getTree()));

I tried a solution I found on the ANTLR wiki:

 ...
 ParseTreeBuilder builder = new ParseTreeBuilder("prog");

 MyParser parser = new MyParser(tokens);
 parser.setTreeAdaptor(new DebugTreeAdaptor(builder, parser.getTreeAdaptor()));

But the builder doesn't output anything interesting.

Perhaps there is an option that can be activated in the source grammar to generate a debug-compatible parser?

Best answer

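First, generate your grammar with the -debug command-line option. Once you do this, your token parser will have additional, debug-centric constructors that let you use either a custom DebugEventListener or the built-in one. Since you're doing custom logging, here is an example solution using a custom DebugEventListener to get you started.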

Here is the grammar I'll use for testing. It may contain problems.

DebugMe.g

grammar DebugMe;


compilationUnit : statements EOF;
statements      : statement+;
statement       : block | call | decl;
block           : LCUR statements RCUR;    
call            : ID LPAR arglist? RPAR SEMI;
arglist         : ID (COMMA ID)*;    
decl            : VAR ID EQ expr SEMI;
expr            : add_expr;     
    
add_expr        : primary_expr ((PLUS|MINUS) primary_expr)*;    
primary_expr    : STRING | ID | INT | LPAR expr RPAR;    
    
VAR: 'var';   
ID: ('a'..'z'|'A'..'Z')+;
INT: ('0'..'9')+;
STRING: '"' ~('\r'|'\n'|'"')* '"';
SEMI: ';';
LPAR: '(';
RPAR: ')';
LCUR: '{';
RCUR: '}';
PLUS: '+';
MINUS: '-';    
COMMA: ',';
EQ: '=';
WS: (' '|'\t'|'\f'|'\r'|'\n') {skip();};

Here is the test program I'll use. Note that I've omitted the implementation of newEventListener here.

TestDebugMeGrammar.java

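import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.debug.DebugEventListener;

public class TestDebugMeGrammar {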

    public static void main(String[] args) throws Exception {

        CharStream input = new ANTLRStringStream("var x = 3; print(x);");

        DebugMeLexer lexer = new DebugMeLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        DebugMeParser parser = new DebugMeParser(tokens, newEventListener());

        parser.compilationUnit();
    }
    //...
}

I'm not very familiar with how the parser calls DebugEventListener, so I'll start with a simple Proxy implementation that dumps every call with minimal fuss:

//TestDebugMeGrammar.java
//also needs: import java.lang.reflect.InvocationHandler, Method, Proxy

    private static DebugEventListener newEventListener() {
        return (DebugEventListener) Proxy.newProxyInstance(TestDebugMeGrammar.class.getClassLoader(),
                new Class[] { DebugEventListener.class },
                new DebugListenerHandler());
    }

    public static class DebugListenerHandler implements InvocationHandler {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args)
                throws Throwable {

            // simply print out the method call.
            System.out.print(method.getName());

            if (args != null && args.length > 0) {
                System.out.print(": ");
                for (int i = 0, count = args.length; i < count; ++i) {
                    Object arg = args[i];
                    if (arg == null) {
                        System.out.printf("<(null)> ");
                    } else {
                        System.out.printf("<%s> ", arg.toString());
                    }
                }
            }
            
            System.out.println();
            return null;
        }
    }
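
To see the proxy technique in isolation, without ANTLR on the classpath, here is a minimal sketch. The nested `RuleEvents` interface is a hypothetical stand-in for DebugEventListener with only three of its methods, and `ProxyDumpDemo` and `recordingListener` are names made up for this illustration. It collects the formatted calls in a list instead of printing them directly, which makes the result easier to inspect.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class ProxyDumpDemo {

    // Hypothetical stand-in for DebugEventListener (assumption: void methods only).
    interface RuleEvents {
        void enterRule(String grammarFileName, String ruleName);
        void enterAlt(int alt);
        void commence();
    }

    // Formats one call like DebugListenerHandler does: method name, then <arg> items.
    static String format(Method method, Object[] args) {
        StringBuilder sb = new StringBuilder(method.getName());
        if (args != null && args.length > 0) { // args is null for no-arg methods
            sb.append(": ");
            for (Object arg : args) {
                sb.append('<').append(arg == null ? "(null)" : arg.toString()).append("> ");
            }
        }
        return sb.toString();
    }

    // Builds a proxy that records every call made on it into the given list.
    static RuleEvents recordingListener(final List<String> log) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                log.add(format(method, args));
                return null; // fine only because every interface method returns void
            }
        };
        return (RuleEvents) Proxy.newProxyInstance(
                RuleEvents.class.getClassLoader(),
                new Class<?>[] { RuleEvents.class },
                handler);
    }

    public static void main(String[] argv) {
        List<String> log = new ArrayList<String>();
        RuleEvents listener = recordingListener(log);
        listener.enterRule("DebugMe.g", "compilationUnit");
        listener.commence();
        listener.enterAlt(1);
        for (String line : log) {
            System.out.println(line);
        }
    }
}
```

One caveat with this kind of handler: returning null is only safe for void methods. Calls such as hashCode() on the proxy are routed through the same handler and would fail when the null result is unboxed, so the proxy should not be put into hash-based collections.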

The output is verbose, but it gives a good idea of what the listener hears.

enterRule: <DebugMe.g> <compilationUnit> 
commence
location: <4> <1> 
enterAlt: <1> 
location: <5> <7> 
enterRule: <DebugMe.g> <statements> 
location: <7> <1> 
enterAlt: <1> 
location: <8> <7> 
enterSubRule: <1> 
enterDecision: <1> <false> 
LT: <1> <[@0,0:2='var',<11>,1:0]> 
exitDecision: <1> 
enterAlt: <1> 
location: <8> <7> 
enterRule: <DebugMe.g> <statement> 
location: <10> <1> 
enterDecision: <2> <false> 
LT: <1> <[@0,0:2='var',<11>,1:0]> 
exitDecision: <2> 
enterAlt: <3> 
location: <13> <7> 
enterRule: <DebugMe.g> <decl> 
...

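Based on what I gathered from the above, here is a small, focused listener. Its output is closer to what you want and may serve as a useful starting point.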

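//TestDebugMeGrammar.java
    //also needs: import org.antlr.runtime.Token and org.antlr.runtime.debug.BlankDebugEventListener
    //redefinition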
    private static DebugEventListener newEventListener() {
        return new SimpleDebugEventListener();
    }

    private static class SimpleDebugEventListener extends
            BlankDebugEventListener {
        
        private Token lastToken;
        @Override
        public void LT(int i, Object t) {
            System.out.println("Read object \"" + t + "\"");
        }

        @Override
        public void LT(int i, Token t) {
            if (!t.equals(lastToken)){
                System.out.println("Read input \"" + t.getText() + "\"");
                lastToken = t;
            }
        }

        @Override
        //public void enterRule(String ruleName) { // <-- ANTLR 3.0.1
        public void enterRule(String grammarFileName, String ruleName) { //<-- ANTLR 3.4
            System.out.println("Entered rule " + ruleName);
        }

        @Override
        //public void exitRule(String ruleName) { // <-- ANTLR 3.0.1
        public void exitRule(String grammarFileName, String ruleName) { //<-- ANTLR 3.4
            System.out.println("Exited rule " + ruleName);
        }

        @Override
        public void consumeToken(Token token) {
            System.out.println("Consumed \"" + token.getText() + "\"");
        }
    }

Here is the output:

Entered rule compilationUnit
Entered rule statements
Read input "var"
Entered rule statement
Entered rule decl
Consumed "var"
Read input "x"
Consumed "x"
Read input "="
Consumed "="
Entered rule expr
Entered rule add_expr
Entered rule primary_expr
Read input "3"
Consumed "3"
Exited rule primary_expr
Read input ";"
Exited rule add_expr
Exited rule expr
Consumed ";"
Exited rule decl
Exited rule statement
Read input "print"
Entered rule statement
Entered rule call
Consumed "print"
Read input "("
Consumed "("
Read input "x"
Entered rule arglist
Consumed "x"
Read input ")"
Exited rule arglist
Consumed ")"
Read input ";"
Consumed ";"
Exited rule call
Exited rule statement
Read input "<EOF>"
Exited rule statements
Consumed "<EOF>"
Exited rule compilationUnit

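I originally tested and ran the code above with ANTLR 3.4. Per your specification, I retested it with ANTLR 3.0.1, and the only change you need to make is in the SimpleDebugEventListener class. I've updated the code to indicate where the change is needed and what it is.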


Just for fun, here is a modified SimpleDebugEventListener that prints output I think is closer to your logging goal.

    private static class SimpleDebugEventListener extends
            BlankDebugEventListener {

        // needs: import java.util.LinkedList
        private LinkedList<String> activeRules = new LinkedList<String>();

        @Override
        public void enterRule(String grammar, String ruleName) {  //ANTLR 3.4
            activeRules.add(ruleName);
        }

        @Override
        public void exitRule(String grammar, String ruleName) { //ANTLR 3.4
            activeRules.removeLast();
        }

        @Override
        public void consumeToken(Token token) {
            System.out.printf("%s consumed \"%s\"%n", formatRules(),
                    token.getText());
        }

        private String formatRules() {
            if (activeRules.size() == 1) {
                return activeRules.getLast();
            } else { 
                StringBuilder builder = new StringBuilder();
                boolean first = true;
                for (String rule : activeRules){
                    if (!first){
                        builder.append(" -> ");
                    } else { 
                        first = false;
                    }
                    builder.append(rule);
                }
                
                return builder.toString();
            }
        }
    }

Output:

compilationUnit -> statements -> statement -> decl consumed "var"
compilationUnit -> statements -> statement -> decl consumed "x"
compilationUnit -> statements -> statement -> decl consumed "="
compilationUnit -> statements -> statement -> decl -> expr -> add_expr -> primary_expr consumed "3"
compilationUnit -> statements -> statement -> decl consumed ";"
compilationUnit -> statements -> statement -> call consumed "print"
compilationUnit -> statements -> statement -> call consumed "("
compilationUnit -> statements -> statement -> call -> arglist consumed "x"
compilationUnit -> statements -> statement -> call consumed ")"
compilationUnit -> statements -> statement -> call consumed ";"
compilationUnit consumed "<EOF>"
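
For a trace shaped more like the "parsing: ... / parsed: ..." lines in the question, the bookkeeping could look roughly like the sketch below. `RuleTrace` is a hypothetical helper, not part of ANTLR; it takes plain strings so it runs on its own. The enterRule and exitRule callbacks used above carry no success flag, so this sketch leaves out the "-> true/false" part, and it prints "?" when no lookahead token has been seen yet. The indentation by nesting depth is an addition of mine.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an ANTLR class): feed it from the matching
// DebugEventListener callbacks and read the formatted trace back from lines().
class RuleTrace {
    private final List<String> lines = new ArrayList<String>();
    private String lookahead = "?"; // text of the most recent lookahead token, if any
    private int consumed = 0;       // number of tokens consumed so far
    private int depth = 0;          // current rule nesting depth

    // Call from LT(int, Token) with the token text.
    void lookahead(String tokenText) {
        lookahead = tokenText;
    }

    // Call from consumeToken(Token); only the running count is used here.
    void consumeToken(String tokenText) {
        consumed++;
    }

    void enterRule(String ruleName) {
        lines.add(indent() + "parsing: " + ruleName + " (token: " + lookahead + ")");
        depth++;
    }

    void exitRule(String ruleName) {
        depth--;
        lines.add(indent() + "parsed: " + ruleName + " (at " + consumed + ")");
    }

    List<String> lines() {
        return lines;
    }

    private String indent() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        return sb.toString();
    }

    public static void main(String[] argv) {
        // Replays a shortened, hand-written event sequence for illustration.
        RuleTrace trace = new RuleTrace();
        trace.enterRule("compilationUnit");
        trace.lookahead("var");
        trace.enterRule("decl");
        trace.consumeToken("var");
        trace.exitRule("decl");
        trace.exitRule("compilationUnit");
        for (String line : trace.lines()) {
            System.out.println(line);
        }
    }
}
```

To wire this into SimpleDebugEventListener, each overridden callback would forward to the helper, for example enterRule passing ruleName and LT passing t.getText(). Whether the lookahead text is current at rule entry depends on when the parser happens to call LT, as the verbose dump earlier shows, so treat the "token:" field as approximate.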

Regarding "java - Logging ANTLR v3 parser activity", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13740697/
