c++ - 递归 Boost.Spirit 解析的问题

标签 c++ boost boost-spirit

我正在尝试为学校项目的 C 语言子集建模解析器。然而,我似乎陷入了为 Boost.Spirit 生成递归解析规则的过程中,因为我的规则要么溢出堆栈,要么根本不拾取任何东西。

例如,我想对以下语法进行建模:

一个::= ... |一个 [一个] | a1 操作 a2 | ...

此表达式规则还有一些其他语法子集,但这些都没有问题。例如,如果我要解析 A[3*4],它应该被读作递归解析,其中 A[...](语法中的 A[a])是数组访问器,3*4(a1 op语法中的 a2) 是索引。

我尝试在语法结构中定义以下规则对象:

qi::rule<Iterator, Type(), Skipper> expr_arr;
qi::rule<Iterator, Type(), Skipper> expr_binary_arith;
qi::rule<Iterator, Type(), Skipper> expr_a;

并给他们以下语法:

expr_arr %= qi::lexeme[identifier >> qi::omit['[']] >> expr_a >> qi::lexeme[qi::omit[']']];
expr_binary_arith %= expr_a >> op_binary_arith >> expr_a;
expr_a %= (expr_binary_arith | expr_arr);

其中“op_binary_arith”是具有允许的运算符符号的 qi::symbol<> 对象。

这可以正常编译,但在执行时会进入一个所谓的无限循环,并且堆栈会溢出。我试过在以下问题中查看 Sehe 的答案:How to set max recursion in boost spirit .

但是,我一直未能成功设置最大递归深度。首先,我几乎所有的尝试都未能使它无错误地编译,但在最后一次尝试中它成功构建,尽管结果非常出乎意料。

有人可以指导我正确的方向,关于我应该如何正确实现这个语法吗?

最佳答案

PEG 语法不能很好地处理左递归。通常,您必须拆分辅助规则才能在没有左递归的情况下编写。

在您的特定情况下,目标生产

a ::= ... | A[a] | a1 op a2 | ...

似乎有点不对劲。这将允许 foo[bar]foo + bar 但不允许 foo + bar[qux]

通常,数组元素引用或普通标识符之间的选择具有较低的优先级(通常是“简单表达式”)。

这里有一个小细节:

literal           = number_literal | string_literal; // TODO exapnd?

expr_arr          = identifier >> '[' >> (expr_a % ',') >> ']';
simple_expression = literal | expr_arr | identifier;
expr_binary_arith = simple_expression >> op_binary_arith >> expr_a;
expr_a            = expr_binary_arith | simple_expression;

现在你可以解析例如:

for (std::string const& input : {
        "A[3*4]",
        "A[F[3]]",
        "A[8 + F[0x31]]",
        "3 * \"foo\"",
    })
{
    std::cout << std::quoted(input) << " -> ";

    It f=begin(input), l=end(input);
    AST::Expr e;

    if (parse(f,l,g,e)) {
        std::cout << "Parsed: " << e << "\n";
    } else {
        std::cout << "Failed\n";
    }

    if (f!=l) {
        std::cout << "Remaining: " << std::quoted(std::string(f,l)) << "\b";
    }
}

打印 Live On Coliru

"A[3*4]" -> Parsed: A[3*4]
"A[F[3]]" -> Parsed: A[F[3]]
"A[8 + F[0x31]]" -> Parsed: A[8+F[49]]
"3 * \"foo\"" -> Parsed: 3*"foo"

NOTE I deliberately left efficiency and operator precedence out of the picture for now.

These are talked about in detail in other answers:

And many more

完整的演示列表

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
#include <experimental/iterator>
namespace qi = boost::spirit::qi;

namespace AST {
    using Var = std::string;

    struct String : std::string {
        using std::string::string;
    };

    using Literal = boost::variant<String, intmax_t, double>;

    enum class ArithOp {
        addition, subtraction, division, multplication
    };

    struct IndexExpr;
    struct BinOpExpr;

    using Expr = boost::variant<
        Literal,
        Var,
        boost::recursive_wrapper<IndexExpr>,
        boost::recursive_wrapper<BinOpExpr>
    >;

    struct IndexExpr {
        Expr expr;
        std::vector<Expr> indices;
    };

    struct BinOpExpr {
        Expr lhs, rhs;
        ArithOp op;
    };

    std::ostream& operator<<(std::ostream& os, Literal const& lit) {
        struct {
            std::ostream& os;
            void operator()(String const& s) const { os << std::quoted(s); }
            void operator()(double d) const { os << d; }
            void operator()(intmax_t i) const { os << i; }
        } vis {os};
        boost::apply_visitor(vis, lit);
        return os;
    }
    std::ostream& operator<<(std::ostream& os, ArithOp const& op) {
        switch(op) {
            case ArithOp::addition: return os << '+';
            case ArithOp::subtraction: return os << '-';
            case ArithOp::division: return os << '/';
            case ArithOp::multplication: return os << '*';
        }
        return os << '?';
    }
    std::ostream& operator<<(std::ostream& os, BinOpExpr const& e) {
        return os << e.lhs << e.op << e.rhs;
    }
    std::ostream& operator<<(std::ostream& os, IndexExpr const& e) {
        std::copy(
            begin(e.indices),
            end(e.indices),
            std::experimental::make_ostream_joiner(os << e.expr << '[', ","));

        return os << ']';
    }
}

BOOST_FUSION_ADAPT_STRUCT(AST::IndexExpr, expr, indices)
BOOST_FUSION_ADAPT_STRUCT(AST::BinOpExpr, lhs, op, rhs)

template <typename Iterator, typename Skipper = qi::space_type>
struct G : qi::grammar<Iterator, AST::Expr()> {
    G() : G::base_type(start) {
        using namespace qi;

        identifier        = alpha >> *alnum;

        number_literal    =
            qi::real_parser<double, qi::strict_real_policies<double> >{}
          | "0x" >> qi::uint_parser<intmax_t, 16> {}
          |         qi::int_parser<intmax_t, 10> {}
          ;

        string_literal    = '"' >> *('\\' >> char_escape | ~char_('"')) >> '"';

        literal           = number_literal | string_literal; // TODO exapnd?

        expr_arr          = identifier >> '[' >> (expr_a % ',') >> ']';
        simple_expression = literal | expr_arr | identifier;
        expr_binary_arith = simple_expression >> op_binary_arith >> expr_a;
        expr_a            = expr_binary_arith | simple_expression;

        start = skip(space) [expr_a];

        BOOST_SPIRIT_DEBUG_NODES(
                (start)
                (expr_a)(expr_binary_arith)(simple_expression)(expr_a)
                (literal)(number_literal)(string_literal)
                (identifier))
    }

  private:
    struct escape_sym : qi::symbols<char, char> {
        escape_sym() {
            this->add
                ("b", '\b')
                ("f", '\f')
                ("r", '\r')
                ("n", '\n')
                ("t", '\t')
                ("\\", '\\')
                ;
        }
    } char_escape;

    struct op_binary_arith_sym : qi::symbols<char, AST::ArithOp> {
        op_binary_arith_sym() {
            this->add
                ("+", AST::ArithOp::addition)
                ("-", AST::ArithOp::subtraction)
                ("/", AST::ArithOp::division)
                ("*", AST::ArithOp::multplication)
                ;
        }
    } op_binary_arith;

    qi::rule<Iterator, AST::Expr()> start;
    qi::rule<Iterator, AST::IndexExpr(), Skipper> expr_arr;
    qi::rule<Iterator, AST::BinOpExpr(), Skipper> expr_binary_arith;
    qi::rule<Iterator, AST::Expr(), Skipper> simple_expression, expr_a;
    // implicit lexemes
    qi::rule<Iterator, AST::Literal()> literal, string_literal, number_literal;
    qi::rule<Iterator, AST::Var()> identifier;
};

int main() {
    using It = std::string::const_iterator;
    G<It> const g;
    for (std::string const& input : {
            "A[3*4]",
            "A[F[3]]",
            "A[8 + F[0x31]]",
            "3 * \"foo\"",
        })
    {
        std::cout << std::quoted(input) << " -> ";

        It f=begin(input), l=end(input);
        AST::Expr e;

        if (parse(f,l,g,e)) {
            std::cout << "Parsed: " << e << "\n";
        } else {
            std::cout << "Failed\n";
        }

        if (f!=l) {
            std::cout << "Remaining: " << std::quoted(std::string(f,l)) << "\b";
        }
    }
}

关于c++ - 递归 Boost.Spirit 解析的问题,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57886865/

相关文章:

c++ - 编译器错误的单元测试

c++:如何从UTF-8代码点创建unsigned char

c++ - 创建一个用于 spirit 规则的凤凰函数

C++ 让 boost 线程等待 1 秒

c++ - `boost::local_time::date_time` 来自 `std::string`

c++ - 解析字符串(带空格)但忽略(Spirit)末尾的空格

c++ - 精神X3 : Basic example for compound components does not compile

c++ - 没有运算符 ">> "匹配这些操作数

c++ - 如何将 std::vector<myClass*> 转换为 std::vector<const myClass* const>?

c++ - Boost Thread 的 boost::unique_lock 是作用域锁吗?