parsing - 规范语法中的结构差异

请考虑以下语法:

S → A | B
A → xy
B → xyz

在给定xyz输入的情况下，这就是我认为LR(0)解析器可以完成的工作:

    | xyz   → shift
  x | yz    → shift
 xy | z     → reduce
  A | z     → shift
 Az |       → fail

如果我的假设是正确的，我们将规则B更改为:

B → Az

现在该语法突然被LR(0)解析器接受。我认为这个新语法描述的字符串集与本问题中的第一个语法完全相同。

第一语法和第二语法有什么区别？

我们如何将语法中的结构差异与它们描述的语言脱钩？

通过归一化吗？

什么样的归一化？

要进一步说明:

我想向解析器描述一种语言，而语法的结构不起作用。我想获得一组字符串的最小/基本描述。对于LR(k)语法，我想最小化k。

最佳答案

我认为您的LR(0)解析器不是标准的解析器:

Source

An LR(0) parser is a shift/reduce parser that uses zero tokens of lookahead to determine what action to take (hence the 0). This means that in any configuration of the parser, the parser must have an unambiguous action to choose - either it shifts a specific symbol or applies a specific reduction. If there are ever two or more choices to make, the parser fails and we say that the grammar is not LR(0).

因此，当您拥有:

S->A|B
A->xy
B->Az

或者

S->A|B
A->xy
B->xyz

LR(0)永远不会检查B规则，并且对于它们两者而言都将失败。

State0 - Clousure(S->°A):
   S->°A
   A->°xy

 Arcs:
  0 --> x --> 2
  0 --> A --> 1
-------------------------

State1 - Goto(State0,A):
   S->A°

 Arcs:
  1 --> $ --> Accept
-------------------------

State2 - Goto(State0,x):
   A->x°y

 Arcs:
  2 --> y --> 3
-------------------------

State3 - Goto(State2,y):
   A->xy°

 Arcs:
-------------------------

但是如果你有

I->S
S->A|B
A->xy
B->xyz                           or B->Az

他们两个都将接受xyz，但是处于不同的状态:

State0 - Clousure(I->°S):
   I->°S
   S->°A
   S->°B
   A->°xy                           A->°xy, $z
   B->°xyz                          B->°Az, $

 Arcs:
  0 --> x --> 4
  0 --> S --> 1
  0 --> A --> 2
  0 --> B --> 3
-------------------------

State1 - Goto(State0,S):
   I->S°

 Arcs:
  1 --> $ --> Accept
-------------------------

State2 - Goto(State0,A):
   S->A°                            S->A°, $ 
                                    B->A°z, $
 Arcs:                              2 --> z --> 5
-------------------------

State3 - Goto(State0,B):
   S->B°

 Arcs:
-------------------------

State4 - Goto(State0,x):
   A->x°y                            A->x°y, $z
   B->x°yz

 Arcs:
  4 --> y --> 5                     4 --> y --> 6
-------------------------

State5 - Goto(State4,y):            - Goto(State2,z):
   A->xy°                           B->Az°, $

 Arcs:
  5 --> z --> 6                     -<None>-
-------------------------

State6 - Goto(State5,z):            - Goto(State4,y)
   B->xyz°                          A->xy°, $z

 Arcs:
-------------------------

您可以看到Goto Table和Action Table不同。

[B->xyz]                                  [B->Az]
  | Stack   | Input  | Action               | Stack   | Input  | Action
--+---------+--------+----------          --+---------+--------+----------
1 | 0       | xyz$   | Shift              1 | 0       | xyz$   | Shift 
2 | 0 4     | yz$    | Shift              2 | 0 4     | xy$    | Shift
3 | 0 4 5   | z$     | Shift              3 | 0 4 6   | z$     | Reduce A->xy
4 | 0 4 5 6 | $      | Reduce B->xyz      4 | 0 2     | z$     | Shift 
5 | 0 3     | $      | Reduce S->B        5 | 0 2 5   | $      | Reduce B->Az
6 | 0 1     | $      | Accept             6 | 0 3     | $      | Reduce S->B
                                          7 | 0 1     | $      | Accept

简单地，当您将B->xyz更改为B->Az时，您就在Action中添加了LR Table以查找差异，您可以检查Action Table和Goto Table(Constructing LR(0) parsing tables)

当您有A->xy和B->xyz时，您有两个底部句柄[xy或xyz]，但是当您有B->Az时，您只有一个底部句柄[xy]，可以接受其他z。

我认为与局部优化有关-c = a + b; d = a + b-> c = a + b; d = c-当您使用B->Az时，您对B->xyz进行了优化。

关于parsing - 规范语法中的结构差异，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/26069283/

parsing - 规范语法中的结构差异

上一篇：networking - 你如何在 Vagrant 中为 NFS 创建一个仅限主机的网络？

下一篇：javascript - 创建图像并将其附加到 div。怎么了？