c++ - GLL Parser Combinator or Generator in/for C or C++

标签 c++ c parsing parser-combinators parse-forest

是否有任何现有的 GLL 的实现?算法,无论是解析器组合器的形式(首选)还是作为 C 或 C++ 的解析器生成器?

我的要求是输出是一个共享的打包解析林 (SPPF),我以后可以使用语义和/或上下文规则来消除歧义。还有其他解析算法,例如 GLR,它们能够处理一般的上下文无关语法,但是,我能找到的所有 GLR 解析器生成器要么返回第一个成功的解析树,要么在最后仍然存在歧义时失败。

最佳答案

如果您尝试 GLL Combinators 会怎样? ?虽然它使用 Scala,但您可以通过 JNI 为它编写“瘦”包装器。 .

GLL Combinators is a framework designed to implement the GLL parsing algorithm (Scott and Johnstone, LDTA 2009) in a functional manner. More specifically, the framework makes use of atomic parser combinators to compose grammars which are then evaluated using the GLL algorithm. The framework provides a syntax for this task which is almost identical to that of the parser combinator framework built into Scala. For example, we can render the classic "parentheses grammar" using GLL combinators:

lazy val expr: Parser[Any] = (
    "(" ~ expr ~ ")"
  | ""
)

As the type annotation implies, expr will reference an instance of type Parser[Any]. That is, an atomic parser which consumes some input and returns a value of type Any. We can invoke this parser against a String input in the following way:

expr("((()))")

This will return a value of type Stream[Result[Any]]. The Result[A] ADT is defined as one of the following (for some type A):

  • Success[A] -- Represents a successful parse and contains the resulting value.
  • Failure -- Represents a failed parse and contains the relevant error message as well as the remainder of the parse stream (the characters which were not consumed).

If any result is successful (i.e. an instance of Success[A]), then no failures will be returned. Thus, the Stream returned will be completely homogeneous, containing either Success or Failure, but not both. A Stream is returned rather than a single value to allow for ambiguity in the grammar (see below).

It's worth mentioning that GLL is a form of recursive-descent parsing. It has all of the advantages of conventional recursive-descent, including an intuitive control flow, arbitrary positioning of semantic actions and superior error messages. In fact, it is the fact that GLL is recursive-descent which allows it to be implemented using atomic combinators. Other algorithms which share the same capabilities as GLL (such as GLR, Earley Parsing, etc) are fundamentally incompatible with the combinator model due to their highly-unintuitive control flow. In GLL parsing, the control flow follows that of the grammar, as it does in traditional parser combinators or any other form of recursive-descent.

关于c++ - GLL Parser Combinator or Generator in/for C or C++,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/21196790/

相关文章:

c - 如何在 C 中打印 double 数组?

java - 自动从json创建java对象?

c - 解析参数后缀乘数和单位,SCPI

c++ - DeviceIoControl GetLastError 87 (ERROR_INVALID_PARAMETER)

c++ - 为什么 unique_ptr 有两个函数 reset 和 operator= 做类似的事情但不重载?

c - 仅 Malloc 低 32 位地址

c - 垃圾值检索

parsing - OCaml + Menhir 编译/写作

c++ - 包含来自基于 C 的代码的 C++ header (fstream)?

c++ - 如何在不同的定义时间为每个触发器设置数万个任务?