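c++ - Compiling a simple parser with Boost.Spirit

Tags: c++ boost-spirit

As part of a simple skeleton utility I'm hacking on, I have a grammar for triggering substitutions in text. I thought it would be a wonderful way to get comfortable with Boost.Spirit, but the template errors are a joy of a unique kind.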


#include <iostream>
#include <iterator>
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

namespace bsq = boost::spirit::qi;

namespace {
template<typename Iterator>
struct skel_grammar : public bsq::grammar<Iterator> {
    skel_grammar();

    bsq::rule<Iterator> macro_b;
    bsq::rule<Iterator> macro_e;
    bsq::rule<Iterator, bsq::ascii::space_type> id;
    bsq::rule<Iterator> macro;
    bsq::rule<Iterator> text;
    bsq::rule<Iterator> start;
};

template<typename Iterator>
skel_grammar<Iterator>::skel_grammar() : skel_grammar::base_type(start)
{
    text = bsq::no_skip[+(bsq::char_ - macro_b)[bsq::_val += bsq::_1]];
    macro_b = bsq::lit("<<");
    macro_e = bsq::lit(">>");
    macro %= macro_b >> id >> macro_e;
    id %= -(bsq::ascii::alpha | bsq::char_('_'))
        >> +(bsq::ascii::alnum | bsq::char_('_'));
    start = *(text | macro);
}
}  // namespace

int main(int argc, char* argv[])
{
    // Read all of standard input into a string.
    std::string input((std::istreambuf_iterator<char>(std::cin)),
                      std::istreambuf_iterator<char>());
    skel_grammar<std::string::iterator> grammar;
    bool r = bsq::parse(input.begin(), input.end(), grammar);
    std::cout << std::boolalpha << r << '\n';
    return 0;
}




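Let me regale you with my "toy" implementation, complete with test cases: a grammar that recognizes <<macros>> like this, including nested expansion of the same.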


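  1. Expansion is done through a callback ( process() ), which gives you maximum flexibility (you could use a lookup table, make the parse fail depending on the macro content, or even have side effects independent of the output).
  2. The parser is optimized to favour streaming mode. Look at spirit::istream_iterator for how to parse input in streaming mode ( Stream-based Parsing Made Easy ). The benefit is obvious if your input stream is 10 GB and contains only 4 macros: it is the difference between crawling performance (or running out of memory) and simply scaling.
    • Note that the demo still writes to a string buffer (via oss ). However, you could easily hook the output directly to std::cout or, say, a std::ofstream instance.
  3. Expansion is done eagerly, so you can get nifty effects with indirect macros. See the test cases.
  4. I even demonstrate a simplistic way to support escaping the << or >> delimiters ( #define SUPPORT_ESCAPES ).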
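Before the Spirit version, the first three points can be illustrated without Spirit at all. The following is a minimal hand-written sketch, not the answer's grammar: it copies a std::istream to a std::ostream, buffers only the macro currently being read, expands nested macros first, and hands each macro body to a callback. The names Callback, read_macro, expand_stream, demo_process and demo_expand are hypothetical and exist only for this illustration; escapes are not handled.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical callback type: rewrites the macro body in place and
// returns false to abort the whole expansion.
using Callback = std::function<bool(std::string&)>;

// Reads the body of a macro whose opening "<<" was already consumed.
// Nested macros are expanded first (eagerly) and spliced into `body`.
bool read_macro(std::istream& in, std::string& body, const Callback& cb) {
    char c;
    while (in.get(c)) {
        if (c == '<' && in.peek() == '<') {         // a nested macro starts
            in.get();
            std::string inner;
            if (!read_macro(in, inner, cb)) return false;
            body += inner;
        } else if (c == '>' && in.peek() == '>') {  // this macro ends
            in.get();
            return cb(body);
        } else {
            body += c;
        }
    }
    return false;  // input ended before the closing ">>"
}

// Copies `in` to `out`, expanding macros on the fly. Text outside macros
// is forwarded character by character and never accumulated.
bool expand_stream(std::istream& in, std::ostream& out, const Callback& cb) {
    char c;
    while (in.get(c)) {
        if (c == '<' && in.peek() == '<') {
            in.get();
            std::string body;
            if (!read_macro(in, body, cb)) return false;
            out << body;
        } else {
            out.put(c);
        }
    }
    return true;
}

// Hypothetical demo callback mirroring a few cases of process().
bool demo_process(std::string& m) {
    if (m == "error") return false;            // fail the expansion
    if (m == "hello") m = "bye";
    else if (m == "bye") m = "We meet again";
    else m = "<<" + m + ">>";                  // unknown macros stay unchanged
    return true;
}

// Convenience wrapper for trying the sketch on an in-memory string.
std::string demo_expand(const std::string& text, bool* ok = nullptr) {
    std::istringstream in(text);
    std::ostringstream out;
    const bool r = expand_stream(in, out, demo_process);
    if (ok) *ok = r;
    return out.str();
}
```

With this sketch, demo_expand("'<<<<hello>>>>'") gives 'We meet again', because the inner macro becomes bye before the outer one is looked up. Passing std::cin and std::cout to expand_stream would process input of any size with memory bounded by the longest macro.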



Note: out of laziness, I require -std=c++0x, but only when SUPPORT_ESCAPES is defined.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

namespace qi = boost::spirit::qi;
namespace phx= boost::phoenix;
namespace fsn= boost::fusion;

#define SUPPORT_ESCAPES // remove to disable backslash escapes

namespace
{
    // Expansion callback: rewrites the macro text in place.
    static bool process(std::string& macro)
    {
        if (macro == "error") {
            return false; // fail the parse
        }

        if (macro == "hello") {
            macro = "bye";
        } else if (macro == "bye") {
            macro = "We meet again";
        } else if (macro == "sideeffect") {
            std::cerr << "this is a side effect while parsing\n";
            macro = "(done)";
        } else if (std::string::npos != macro.find('~')) {
            std::reverse(macro.begin(), macro.end());
            // erase-remove: the end iterator is needed to drop every '~'
            macro.erase(std::remove(macro.begin(), macro.end(), '~'), macro.end());
        } else {
            macro = std::string("<<") + macro + ">>"; // this makes the unsupported macros appear unchanged
        }

        return true;
    }

    template<typename Iterator, typename OutIt>
        struct skel_grammar : public qi::grammar<Iterator>
    {
        // Copies a matched range to the output iterator.
        struct fastfwd {
            template<typename,typename> struct result { typedef bool type; };

            template<typename R, typename O>
                bool operator()(const R&r,O& o) const
            {
#ifndef SUPPORT_ESCAPES
                o = std::copy(r.begin(),r.end(),o);
#else
                // Drop each escaping backslash, keep the escaped character.
                auto f = std::begin(r), l = std::end(r);
                while (f!=l)
                {
                    if (('\\'==*f) && (l == ++f))
                        break;
                    *o++ = *f++;
                }
#endif
                return true; // false to fail the parse
            }
        } copy;

        skel_grammar(OutIt& out) : skel_grammar::base_type(start)
        {
            using namespace qi;

#ifdef SUPPORT_ESCAPES
            rawch = ('\\' >> char_) | char_;
#else
#           define rawch qi::char_
#endif

            macro = ("<<" >> (
                           (*(rawch - ">>" - "<<") [ _val += _1 ])
                         % macro                   [ _val += _1 ] // allow nests
                      ) >>
                      ">>")
                [ _pass = phx::bind(process, _val) ];

            start =
                raw [ +(rawch - "<<") ] [ _pass = phx::bind(copy, _1, phx::ref(out)) ]
              % macro                   [ _pass = phx::bind(copy, _1, phx::ref(out)) ]
              ;

#           undef rawch
        }

#ifdef SUPPORT_ESCAPES
        qi::rule<Iterator, char()> rawch;
#endif
        qi::rule<Iterator, std::string()> macro;
        qi::rule<Iterator> start;
    };
}

int main(int argc, char* argv[])
{
    std::string input =
        "Greeting is <<hello>> world!\n"
        "Side effects are <<sideeffect>> and <<other>> vars are untouched\n"
        "Empty <<>> macros are ok, as are stray '>>' pairs.\n"
        "<<nested <<macros>> (<<hello>>?) work>>\n"
        "The order of expansion (evaluation) is _eager_: '<<<<hello>>>>' will expand to the same as '<<bye>>'\n"
        "Lastly you can do algorithmic stuff too: <<!esrever ~ni <<hello>>>>\n"
#ifdef SUPPORT_ESCAPES // bonus: escapes
        "You can escape \\<<hello>> (not expanded to '<<hello>>')\n"
        "Demonstrate how it <<avoids <\\<nesting\\>> macros>>.\n"
#endif
        ;

    std::ostringstream oss;
    std::ostream_iterator<char> out(oss);

    skel_grammar<std::string::iterator, std::ostream_iterator<char> > grammar(out);

    std::string::iterator f(input.begin()), l(input.end());
    bool r = qi::parse(f, l, grammar);

    std::cout << "parse result: " << (r?"success":"failure") << "\n";
    if (f!=l)
        std::cout << "unparsed remaining: '" << std::string(f,l) << "'\n";

    std::cout << "Streamed output:\n\n" << oss.str() << '\n';

    return 0;
}

this is a side effect while parsing
parse result: success
Streamed output:

Greeting is bye world!
Side effects are (done) and <<other>> vars are untouched
Empty <<>> macros are ok, as are stray '>>' pairs.
<<nested <<macros>> (bye?) work>>
The order of expansion (evaluation) is _eager_: 'We meet again' will expand to the same as 'We meet again'
Lastly you can do algorithmic stuff too: eyb in reverse!
You can escape <<hello>> (not expanded to 'bye')
Demonstrate how it <<avoids <<nesting>> macros>>.

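There is quite a lot of functionality hidden in there to grok. I suggest you look at the test cases and the process() callback side by side to see what is going on.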
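One detail of process() worth isolating is the `~` branch, which reverses the macro text and strips the tildes. The stripping step is the erase-remove idiom, and it needs the two-iterator overload of erase to stay correct when the text holds more than one `~`. A minimal sketch, where reverse_and_strip is a hypothetical helper that is not part of the code above:

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: returns `s` reversed, with every '~' removed.
std::string reverse_and_strip(std::string s) {
    std::reverse(s.begin(), s.end());
    // std::remove shifts the kept characters forward and returns the new
    // logical end; the two-iterator erase then drops everything after it.
    s.erase(std::remove(s.begin(), s.end(), '~'), s.end());
    return s;
}
```

Applied to the text the last test case produces after its inner macro is expanded, reverse_and_strip("!esrever ~ni bye") returns "eyb in reverse!".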

Cheers & HTH :)

Regarding "c++ - Compiling a simple parser with Boost.Spirit", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/9404558/

