c++ - Unhandled exception when parsing a grammar with Boost Spirit

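I am trying to use Boost Spirit to parse the following grammar:

sentence: noun verb | sentence conjunction sentence

conjunction: "and"

noun: "birds" | "cats"

verb: "fly" | "meow"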

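Parsing succeeds when the grammar contains only the noun >> verb rule. When the grammar is modified to include the sentence >> conjunction >> sentence rule and I supply invalid input (for example "birds fly" instead of "birdsfly"), I get an unhandled exception when the program runs.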

Here is the code, modified from the examples in the Boost documentation:

#define BOOST_VARIANT_MINIMIZE_SIZE
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_statement.hpp>
#include <boost/spirit/include/phoenix_container.hpp>
#include <iostream>
#include <string>

using namespace boost::spirit;
using namespace boost::spirit::ascii;

template <typename Lexer>
struct token_list : lex::lexer<Lexer>
{
    token_list()
    {
        noun = "birds|cats";    
        verb =  "fly|meow";
        conjunction = "and";

        this->self.add
            (noun)         
            (verb) 
            (conjunction)
        ;
    }
    lex::token_def<std::string> noun, verb, conjunction;
};

template <typename Iterator>
struct Grammar : qi::grammar<Iterator>
{
    template <typename TokenDef>
    Grammar(TokenDef const& tok)
      : Grammar::base_type(sentence)
    {
        sentence = (tok.noun>>tok.verb)
        |
        (sentence>>tok.conjunction>>sentence)>>eoi
    ;
    }
    qi::rule<Iterator> sentence;
};

int main()
{
    typedef lex::lexertl::token<char const*, boost::mpl::vector<std::string>> token_type;
    typedef lex::lexertl::lexer<token_type> lexer_type;
    typedef token_list<lexer_type>::iterator_type iterator_type;

    token_list<lexer_type> word_count;
    Grammar<iterator_type> g(word_count);

    std::string str = "birdsfly";
    //std::string str = "birds fly"; this input caused unhandled exception

    char const* first = str.c_str();
    char const* last = &first[str.size()];

    bool r = lex::tokenize_and_parse(first, last, word_count, g);

    if (r) {
        std::cout << "Parsing passed"<< "\n";
    }
    else {
        std::string rest(first, last);
        std::cerr << "Parsing failed\n" << "stopped at: \""
                  << rest << "\"\n";
    }
    system("PAUSE");
    return 0;
}

Best answer

You have left recursion in the second branch of the sentence rule.

sentence = sentence >> ....

will always recurse into sentence without consuming any input, so you are seeing a stack overflow.
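To see why only the invalid input triggers the crash, here is a hand-written sketch in plain C++ (not Spirit) that models the original rule. Spirit tries alternatives in order, so valid input matches noun >> verb and never reaches the recursive branch; when the first branch fails, the second branch calls sentence again at the same position. The names Tok, LeftRecursiveParser and max_depth are hypothetical and exist only for this illustration; the real parser has no depth guard and simply recurses until the stack is exhausted.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical token kinds; Invalid stands for input the lexer could not match.
enum class Tok { Noun, Verb, Conjunction, Invalid };

struct LeftRecursiveParser {
    std::vector<Tok> tokens;
    std::size_t max_depth;        // artificial guard, added only for this sketch
    bool depth_exceeded = false;  // set when the guard stops the recursion

    LeftRecursiveParser(std::vector<Tok> t, std::size_t d)
        : tokens(std::move(t)), max_depth(d) {}

    // Consume one token of the given kind, if present.
    bool match(std::size_t& pos, Tok kind) const {
        if (pos < tokens.size() && tokens[pos] == kind) { ++pos; return true; }
        return false;
    }

    // Models: sentence = (noun verb) | (sentence conjunction sentence)
    bool sentence(std::size_t& pos, std::size_t depth) {
        if (depth > max_depth) { depth_exceeded = true; return false; }

        // First branch: noun verb
        std::size_t p = pos;
        if (match(p, Tok::Noun) && match(p, Tok::Verb)) { pos = p; return true; }

        // Second branch: recurses at the SAME position, nothing was consumed.
        p = pos;
        if (sentence(p, depth + 1) && match(p, Tok::Conjunction)
            && sentence(p, depth + 1)) {
            pos = p;
            return true;
        }
        return false;
    }
};
```

With the tokens {Noun, Verb} the first branch succeeds immediately. With {Noun, Invalid}, which models "birds fly" where the space is not a valid token, the first branch fails and the second branch recurses until the guard trips.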

I suggest writing the rule like this:

sentence = 
      (tok.noun >> tok.verb) 
  >> *(tok.conjunction >> sentence) 
  >> qi::eoi
  ;
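The same shape can be sketched by hand in plain C++ to show why this form terminates. This is only an illustration of the rewritten rule, not Spirit code; Tok, match and sentence are hypothetical names. The recursive call now happens only after a conjunction token has been consumed, so the recursion depth is bounded by the input length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token kinds; Invalid stands for input the lexer could not match.
enum class Tok { Noun, Verb, Conjunction, Invalid };

// Consume one token of the given kind, if present.
bool match(const std::vector<Tok>& t, std::size_t& pos, Tok kind) {
    if (pos < t.size() && t[pos] == kind) { ++pos; return true; }
    return false;
}

// Models: sentence = (noun verb) *(conjunction sentence) eoi
bool sentence(const std::vector<Tok>& t, std::size_t& pos) {
    std::size_t p = pos;

    // noun verb: consumes two tokens before anything recursive happens.
    if (!match(t, p, Tok::Noun) || !match(t, p, Tok::Verb)) return false;

    // *(conjunction sentence): repeat while the whole group matches,
    // backtracking to the last good position when it does not.
    for (;;) {
        std::size_t q = p;
        if (match(t, q, Tok::Conjunction) && sentence(t, q)) p = q;
        else break;
    }

    // eoi: succeed only if every token was consumed.
    if (p != t.size()) return false;
    pos = p;
    return true;
}
```

An invalid token sequence such as {Noun, Invalid} now fails cleanly and returns false instead of recursing without bound.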

Now the result reads:

g++ -Wall -pedantic -std=c++0x -g -O0 test.cpp -o test
Parsing failed
stopped at: " fly"

(and, of course, the inevitable "sh: PAUSE: command not found"...)
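The "stopped at" position in the output above is most likely explained by the lexer: token_list defines no token for whitespace, so tokenization cannot get past the space. The following plain C++ sketch imitates that behaviour with a naive word matcher. It is an assumption-laden stand-in, not how Spirit.Lex is implemented, and unmatched_rest is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the part of `input` left over after matching as many known
// words as possible from the front. There is deliberately no entry for
// whitespace, mirroring the token definitions in token_list.
std::string unmatched_rest(const std::string& input) {
    static const std::vector<std::string> words = {
        "birds", "cats", "fly", "meow", "and"};

    std::size_t pos = 0;
    bool advanced = true;
    while (pos < input.size() && advanced) {
        advanced = false;
        for (const std::string& w : words) {
            // Does the word occur exactly at the current position?
            if (input.compare(pos, w.size(), w) == 0) {
                pos += w.size();
                advanced = true;
                break;
            }
        }
    }
    return input.substr(pos);
}
```

For "birdsfly" the remainder is empty, while for "birds fly" the matcher stops in front of the space. If whitespace should be accepted, the lexer needs a token definition for it that the grammar then skips or ignores.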

PS. Please don't use using namespace directives. Instead, use namespace aliases:

namespace qi  = boost::spirit::qi;
namespace lex = boost::spirit::lex;

Here is a cleaned-up version with some other things removed or fixed: http://coliru.stacked-crooked.com/view?id=1fb26ca3e8c207979eaaf4592c319316-e223fd4a885a77b520bbfe69dda8fb91

#define BOOST_VARIANT_MINIMIZE_SIZE
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
// #include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>

namespace qi  = boost::spirit::qi;
namespace lex = boost::spirit::lex;

template <typename Lexer>
struct token_list : lex::lexer<Lexer>
{
    token_list()
    {
        noun        = "birds|cats";    
        verb        = "fly|meow";
        conjunction = "and";

        this->self.add
            (noun)         
            (verb) 
            (conjunction)
        ;
    }

    lex::token_def<std::string> noun, verb, conjunction;
};

template <typename Iterator>
struct Grammar : qi::grammar<Iterator>
{
    template <typename TokenDef>
    Grammar(TokenDef const& tok) : Grammar::base_type(sentence)
    {
        sentence = 
              (tok.noun >> tok.verb) 
          >> *(tok.conjunction >> sentence) 
          >> qi::eoi
          ;
    }
    qi::rule<Iterator> sentence;
};

int main()
{
    typedef std::string::const_iterator It;
    typedef lex::lexertl::token<It, boost::mpl::vector<std::string>> token_type;
    typedef lex::lexertl::lexer<token_type> lexer_type;
    typedef token_list<lexer_type>::iterator_type iterator_type;

    token_list<lexer_type> word_count;         
    Grammar<iterator_type> g(word_count); 

    //std::string str = "birdsfly"; 
    const std::string str = "birds fly";

    It first = str.begin();
    It last  = str.end();

    bool r = lex::tokenize_and_parse(first, last, word_count, g);

    if (r) {
        std::cout << "Parsing passed"<< "\n";
    }
    else {
        std::string rest(first, last);
        std::cerr << "Parsing failed\n" << "stopped at: \"" << rest << "\"\n";
    }
}

This Q&A is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/17941355/
