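c - How do I output the AST built with ANTLR?

I'm making a static analyzer for C. I have completed the lexer and the parser using ANTLR, which generates Java code.

Does ANTLR build the AST for us automatically via options {output=AST;}? Or do I have to build the tree myself? If it does, how do I spit out the nodes of that AST?

My current thinking is that the nodes of that AST will be used to build SSA form, followed by data-flow analysis, to make the static analyzer. Am I on the right track?

Best answer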

Raphael wrote:

Does antlr build the AST for us automatically by option{output=AST;}? Or do I have to make the tree myself? If it does, then how to spit out the nodes on that AST?

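No, the parser does not know what you want as the root and as the leaves for each parser rule, so you'll have to do a bit more than just putting options { output=AST; } in your grammar.

For example, when parsing the source "true && (false || true && (true || false))" with a parser generated from the grammar: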

grammar ASTDemo;

options { 
  output=AST; 
}

parse
  :  orExp
  ;

orExp
  :  andExp ('||' andExp)*
  ;

andExp
  :  atom ('&&' atom)*
  ;

atom
  :  'true'
  |  'false'
  |  '(' orExp ')'
  ;

// ignore white space characters
Space
  :  (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;}
  ;

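the following parse tree is generated:

[image not preserved]

(i.e. just a flat, one-dimensional list of tokens)

You'll want to tell ANTLR which tokens in your grammar become the root, which become leaves, and which are simply left out of the tree.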

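Creating ASTs can be done in two ways:

1. Using rewrite rules, which look like this: foo : A B C D -> ^(D A B);, where foo is a parser rule that matches the tokens A B C D. Everything after the -> is the actual rewrite rule. As you can see, the token C is not used in the rewrite rule, which means it is omitted from the AST. The token placed directly after ^( becomes the root of the tree;
2. Using the tree operators ^ and ! after a token inside your parser rules, where ^ makes a token the root and ! removes a token from the tree. The equivalent of foo : A B C D -> ^(D A B); would be foo : A B C! D^;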

Both foo : A B C D -> ^(D A B); and foo : A B C! D^; will produce the following AST:

[image not preserved: D as the root, with A and B as its children]

Now, you could rewrite the grammar as follows:
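
To make the effect of the rewrite rule concrete, here is a minimal plain-Java sketch of what ^(D A B) amounts to. Node and rewriteFoo are hypothetical names made up for illustration; they are not part of ANTLR, whose own tree class is CommonTree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AST node; this is NOT ANTLR's CommonTree.
final class RewriteSketch {
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) children.add(k);
        }

        // LISP-style rendering: a leaf prints as "A", an inner node as "(D A B)".
        String toStringTree() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) sb.append(' ').append(c.toStringTree());
            return sb.append(')').toString();
        }
    }

    // Mirrors `foo : A B C D -> ^(D A B);`:
    // D becomes the root, A and B become its children, C is dropped.
    static Node rewriteFoo(String a, String b, String c, String d) {
        return new Node(d, new Node(a), new Node(b));
    }

    public static void main(String[] args) {
        // prints (D A B)
        System.out.println(rewriteFoo("A", "B", "C", "D").toStringTree());
    }
}
```

The token C is still matched by the rule; it simply never makes it into the tree.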

grammar ASTDemo;

options { 
  output=AST; 
}

parse
  :  orExp
  ;

orExp
  :  andExp ('||'^ andExp)* // Make `||` root
  ;

andExp
  :  atom ('&&'^ atom)* // Make `&&` root
  ;

atom
  :  'true'
  |  'false'
  |  '(' orExp ')' -> orExp // Just a single element, no need to do `^(...)`,
                            // we're removing the parentheses. Note that
                            // `'('! orExp ')'!` will do exactly the same.
  ;

// ignore white space characters
Space
  :  (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;}
  ;

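which will create the following AST from the source "true && (false || true && (true || false))":

[image not preserved; written in LISP-style notation the tree is (&& true (|| false (&& true (|| true false))))]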
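
If it helps to see what the ^ operator does without running ANTLR, the following is a rough hand-written recursive-descent sketch of the same grammar in plain Java. It is not the code ANTLR generates; HandRolledAstDemo and Node are hypothetical names, and the tokenizer is a simplification that silently skips anything that is not a known token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written sketch of the ASTDemo grammar; NOT what ANTLR generates.
final class HandRolledAstDemo {
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) children.add(k);
        }

        String toStringTree() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) sb.append(' ').append(c.toStringTree());
            return sb.append(')').toString();
        }
    }

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    HandRolledAstDemo(String src) {
        // Simplified lexer: characters that match no token (e.g. white space) are skipped.
        Matcher m = Pattern.compile("true|false|&&|\\|\\||[()]").matcher(src);
        while (m.find()) tokens.add(m.group());
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : null; }

    private String next() {
        if (pos >= tokens.size()) throw new IllegalStateException("unexpected end of input");
        return tokens.get(pos++);
    }

    // orExp : andExp ('||'^ andExp)* ;  each '||' becomes the root of the tree built so far
    Node orExp() {
        Node left = andExp();
        while ("||".equals(peek())) left = new Node(next(), left, andExp());
        return left;
    }

    // andExp : atom ('&&'^ atom)* ;
    Node andExp() {
        Node left = atom();
        while ("&&".equals(peek())) left = new Node(next(), left, atom());
        return left;
    }

    // atom : 'true' | 'false' | '(' orExp ')' -> orExp ;  parentheses are dropped
    Node atom() {
        String t = next();
        if (t.equals("true") || t.equals("false")) return new Node(t);
        if (t.equals("(")) {
            Node inner = orExp();
            if (!")".equals(next())) throw new IllegalStateException("expected ')'");
            return inner;
        }
        throw new IllegalStateException("unexpected token: " + t);
    }

    static Node parse(String src) {
        HandRolledAstDemo p = new HandRolledAstDemo(src);
        Node root = p.orExp();
        if (p.peek() != null) throw new IllegalStateException("trailing input: " + p.peek());
        return root;
    }

    public static void main(String[] args) {
        // prints (&& true (|| false (&& true (|| true false))))
        System.out.println(parse("true && (false || true && (true || false))").toStringTree());
    }
}
```

Because every operator token becomes the root of whatever was built before it, a chain such as true || false || true nests to the left.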

EDIT

Here's how to use the generated lexer and parser:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = "true && (false || true && (true || false))";
    // Lex and parse the source with the classes generated from the ASTDemo grammar
    ASTDemoLexer lexer = new ASTDemoLexer(new ANTLRStringStream(src));
    ASTDemoParser parser = new ASTDemoParser(new CommonTokenStream(lexer));
    // parse() is the entry rule; getTree() returns the root of the AST it built
    CommonTree tree = (CommonTree)parser.parse().getTree();
    // Render the AST in Graphviz DOT format and print it
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}
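
As for spitting out the individual nodes: ANTLR 3's Tree interface exposes getText(), getChildCount() and getChild(int), so a simple recursive walk is enough (and toStringTree() on the CommonTree gives a one-line LISP-style dump). The sketch below shows the shape of such a walk. To keep it runnable without the ANTLR jar it uses a hypothetical Node class with the same three accessors; with the real runtime you would pass the CommonTree from the code above instead.

```java
import java.util.ArrayList;
import java.util.List;

final class TreeWalkSketch {
    // Hypothetical stand-in that mimics three accessors of
    // org.antlr.runtime.tree.Tree: getText(), getChildCount(), getChild(int).
    static final class Node {
        private final String text;
        private final List<Node> children = new ArrayList<>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) children.add(k);
        }

        String getText() { return text; }
        int getChildCount() { return children.size(); }
        Node getChild(int i) { return children.get(i); }
    }

    // Pre-order walk: emit the node, then its children, indented two spaces per level.
    static void dump(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(node.getText()).append('\n');
        for (int i = 0; i < node.getChildCount(); i++) {
            dump(node.getChild(i), depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Hand-built AST for "true && (false || true)"
        Node tree = new Node("&&",
            new Node("true"),
            new Node("||", new Node("false"), new Node("true")));
        StringBuilder out = new StringBuilder();
        dump(tree, 0, out);
        System.out.print(out);
    }
}
```

The same pre-order traversal is a natural starting point for later passes over the AST, since each visit sees a node before its operands.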

Regarding "c - How do I output the AST built with ANTLR?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/4931346/
