解析 Prolog 中的表达式并返回抽象语法

我必须编写 parse(Tkns, T)，它接受标记列表形式的数学表达式并找到 T，并返回表示抽象语法的语句，尊重操作顺序和关联性。

例如，

?- parse( [ num(3), plus, num(2), star, num(1) ], T ).

T = add(integer(3), multiply(integer(2), integer(1))) ;
No

我尝试按如下方式实现 + 和 *

parse([num(X)], integer(X)).
parse(Tkns, T) :-
  (  append(E1, [plus|E2], Tkns),
     parse(E1, T1),
     parse(E2, T2),
     T = add(T1,T2)
  ;  append(E1, [star|E2], Tkns),
     parse(E1, T1),
     parse(E2, T2),
     T = multiply(T1,T2)
  ).

它找到正确的答案，但也会返回不遵循关联性或运算顺序的答案。

例如)

parse( [ num(3), plus, num(2), star, num(1) ], T ).

也返回

mult(add(integer(3), integer(2)), integer(1))

和

parse([num(1), plus, num(2), plus, num(3)], T)

返回 1+2+3 和 1+(2+3) 的等价物，而它应该只返回前者。

有什么办法可以让它发挥作用吗？

编辑:更多信息:我只需要实现 +、-、*、/、否定(-1、-2 等)，并且所有数字都是整数。给出了一个提示，代码的结构将与语法类似

<expression> ::= <expression> + <term>
              |  <expression> - <term>
              |  <term>

      <term> ::= <term> * <factor>
              |  <term> / <factor>
              |  <factor>

    <factor> ::= num
              |  ( <expression> )

仅也实现了否定。

Edit2:我发现了一个用 Prolog 编写的语法解析器( http://www.cs.sunysb.edu/~warren/xsbbook/node10.html )。有没有一种方法可以修改它以打印语法的左手推导(“打印”是指 Prolog 解释器将输出“T=[正确答案]”)

最佳答案

正在删除 left recursion将驱使您走向基于 DCG 的语法。

但是有一个有趣的替代方法:实现自下而上解析。

这在 Prolog 中有多难？好吧，正如佩雷拉和希伯在他们精彩的书中所展示的那样 “Prolog 和自然语言分析”非常简单:来自第 6.5 章

Prolog supplies by default a top-down, left-to-right, backtrack parsing algorithm for DCGs.

It is well known that top-down parsing algorithms of this kind will loop on left-recursive rules (cf. the example of Program 2.3).

Although techniques are avail- able to remove left recursion from context-free grammars, these techniques are not readily generalizable to DCGs, and furthermore they can increase grammar size by large factors.

As an alternative, we may consider implementing a bottom-up parsing method directly in Prolog. Of the various possibilities, we will consider here the left-corner method in one of its adaptations to DCGs.

For programming convenience, the input grammar for the left-corner DCG interpreter is represented in a slight variation of the DCG notation. The right-hand sides of rules are given as lists rather than conjunctions of literals. Thus rules are unit clauses of the form, e.g.,

s ---> [np, vp].

或

optrel ---> [].

Terminals are introduced by dictionary unit clauses of the form word(w,PT).

考虑在继续之前完成讲座(在 info page 中按标题查找免费图书条目)。

现在让我们尝试编写一个自下而上的处理器:

:- op(150, xfx, ---> ).

parse(Phrase) -->
    leaf(SubPhrase),
    lc(SubPhrase, Phrase).

leaf(Cat) --> [Word], {word(Word,Cat)}.
leaf(Phrase) --> {Phrase ---> []}.

lc(Phrase, Phrase) --> [].

lc(SubPhrase, SuperPhrase) -->
    {Phrase ---> [SubPhrase|Rest]},
    parse_rest(Rest),
    lc(Phrase, SuperPhrase).

parse_rest([]) --> [].
parse_rest([Phrase|Phrases]) -->
    parse(Phrase),
    parse_rest(Phrases).

% that's all! fairly easy, isn't it ?

% here start the grammar: replace with your one, don't worry about Left Recursion
e(sum(L,R)) ---> [e(L),sum,e(R)].
e(num(N)) ---> [num(N)].

word(N, num(N)) :- integer(N).
word(+, sum).

例如，产生

phrase(parse(P), [1,+,3,+,1]).
P = e(sum(sum(num(1), num(3)), num(1)))

注意使用的左递归语法是 e::= e + e |数

关于解析 Prolog 中的表达式并返回抽象语法，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/20511060/

解析 Prolog 中的表达式并返回抽象语法

上一篇：eclipse - 如何在 Zend Studio 8/9 或 Windows 7 上的 Eclipse 中使用 LF 作为行结尾

下一篇：Jenkins hudson.TcpSlaveAgentListener$ConnectionHandler 运行连接失败