python-3.x - Simple grammar gives ValueError in Python

Tags: python-3.x nlp nltk

I am new to Python, NLTK, and NLP. I have written a simple grammar, but running the program produces the following error. Please help me resolve it.

Grammar:

S -> NP
NP -> PN|PRO|D[NUM=?n] N[NUM=?n]|D[NUM=?n] A N[NUM=?n]|D[NUM=?n] N[NUM=?n] PP|QP N[NUM=?n]|A N[NUM=?n]|D[NUM=?n] NOM PP|D[NUM=?n] NOM
PP -> P NP
D[NUM=sg] -> 'a'
D -> 'the'
N[NUM=sg] -> 'boy'|'girl'|'room'|'garden'|'hair'
N[NUM=pl] -> 'dogs'|'cats'
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
NOM -> A NOM|N[NUM=?n]

Code:
import nltk

grammar = nltk.data.load('file:english_grammer.cfg')
rdparser = nltk.RecursiveDescentParser(grammar)
sent = "a dogs".split()
trees = rdparser.parse(sent)

for tree in trees: print (tree)

Error:

ValueError: Expected a nonterminal, found: [NUM=?n] N[NUM=?n]|D[NUM=?n] A N[NUM=?n]|D[NUM=?n] N[NUM=?n] PP|QP N[NUM=?n]|A N[NUM=?n]|D[NUM=?n] NOM PP|D[NUM=?n] NOM

Best answer

I don't think the NLTK CFG grammar reader can read a CFG written with square brackets.

First, let's try a CFG grammar without square brackets:

from nltk.grammar import CFG

grammar_string = '''
S -> NP
PP -> P NP
D -> 'the'
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
'''

grammar = CFG.fromstring(grammar_string)
print(grammar)

[out]:
Grammar with 18 productions (start state = S)
    S -> NP
    PP -> P NP
    D -> 'the'
    PN -> 'saumya'
    PN -> 'dinesh'
    PRO -> 'she'
    PRO -> 'he'
    PRO -> 'we'
    A -> 'tall'
    A -> 'naughty'
    A -> 'long'
    A -> 'three'
    A -> 'black'
    P -> 'with'
    P -> 'in'
    P -> 'from'
    P -> 'at'
    QP -> 'some'

Now let's put the square brackets in:
from nltk.grammar import CFG

grammar_string = '''
S -> NP
PP -> P NP
D -> 'the'
PN -> 'saumya'|'dinesh'
PRO -> 'she'|'he'|'we'
A -> 'tall'|'naughty'|'long'|'three'|'black'
P -> 'with'|'in'|'from'|'at'
QP -> 'some'
N[NUM=sg] -> 'boy'|'girl'|'room'|'garden'|'hair'
N[NUM=pl] -> 'dogs'|'cats'
'''

grammar = CFG.fromstring(grammar_string)
print(grammar)

[out]:
Traceback (most recent call last):
  File "test.py", line 33, in <module>
    grammar = CFG.fromstring(grammar_string)
  File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 519, in fromstring
    encoding=encoding)
  File "/usr/local/lib/python2.7/dist-packages/nltk/grammar.py", line 1273, in read_grammar
    (linenum+1, line, e))
ValueError: Unable to parse line 10: N[NUM=sg] -> 'boy'|'girl'|'room'|'garden'|'hair'
Expected an arrow

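Going back to your grammar, it seems you are using square brackets to mark whether a nonterminal is constrained, so the solution is to:
  • use underscores to name the constrained nonterminals, and
  • add rules for the unconstrained nonterminals.

So your CFG rules would look like this: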
    from nltk.parse import RecursiveDescentParser
    from nltk.grammar import CFG
    
    grammar_string = '''
    S -> NP
    NP -> PN | PRO | D N | D A N | D N PP | QP N | A N | D NOM PP | D NOM
    
    PP -> P NP
    PN -> 'saumya'|'dinesh'
    PRO -> 'she'|'he'|'we'
    A -> 'tall'|'naughty'|'long'|'three'|'black'
    P -> 'with'|'in'|'from'|'at'
    QP -> 'some'
    
    D -> D_def | D_sg
    D_def -> 'the'
    D_sg -> 'a'
    
    N -> N_sg | N_pl
    N_sg -> 'boy'|'girl'|'room'|'garden'|'hair'
    N_pl -> 'dogs'|'cats'
    '''
    
    grammar = CFG.fromstring(grammar_string)
    
    rdparser = RecursiveDescentParser(grammar)
    sent = "a dogs".split()
    trees = rdparser.parse(sent)
    
    for tree in trees:
        print (tree)
    

    [out]:
    (S (NP (D (D_sg a)) (N (N_pl dogs))))
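
The renaming step described above (turning a bracketed category such as N[NUM=sg] into a plain symbol such as N_sg) can also be done mechanically. The sketch below is only an illustration: `flatten_rule`, `FEATURE`, and `NUM_VALUES` are hypothetical names, not part of NLTK, and it assumes the only feature is NUM, written as [NUM=value] or [NUM=?n], with sg and pl as the possible values. A rule that contains a variable is emitted once per value, which is one way to keep number agreement in a plain CFG.

```python
import re

# Hypothetical helper, not part of NLTK.
# Assumption: the only feature is NUM, written [NUM=sg], [NUM=pl] or [NUM=?n].
FEATURE = re.compile(r"\[NUM=(\??\w+)\]")
NUM_VALUES = ("sg", "pl")  # assumed inventory of NUM values


def flatten_rule(rule, values=NUM_VALUES):
    """Rewrite one feature-annotated rule as one or more plain-CFG rules."""

    def rename(text, binding):
        # Replace [NUM=x] with _x; a variable such as ?n takes the bound value.
        def sub(match):
            value = match.group(1)
            return "_" + (binding if value.startswith("?") else value)

        return FEATURE.sub(sub, text)

    # A rule with a variable is copied once per possible value;
    # a rule without one is rewritten exactly once.
    has_variable = any(v.startswith("?") for v in FEATURE.findall(rule))
    bindings = values if has_variable else (None,)
    return [rename(rule, binding) for binding in bindings]


if __name__ == "__main__":
    for rule in ["NP -> D[NUM=?n] N[NUM=?n]", "N[NUM=pl] -> 'dogs'|'cats'"]:
        for flat in flatten_rule(rule):
            print(flat)
```

With this expansion the NP rule becomes `NP -> D_sg N_sg` and `NP -> D_pl N_pl`, so "a dogs" would no longer be accepted, whereas the hand-written grammar above parses it because its NP rule uses the unconstrained D and N. Unmarked entries such as D -> 'the' are not handled by the sketch and would have to be listed under each value by hand.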
    

Regarding "python-3.x - Simple grammar gives ValueError in Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/26505638/