c++ - Boost Spirit: implementing a small one-line DSL in a server application

Tags: c++ boost dsl boost-spirit boost-spirit-qi

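Apologies if this question has been answered before.

I want to plug a small DSL into a server application I work with. The syntax is very simple, yet I'm stumped even at this early stage. I just can't get my head around how to build the grammar in Spirit.

Here is an example of the syntax I want to test: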

WHERE [not] <condition> [ and | or <condition> ] <command> [parameters]

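The WHERE clause selects a number of objects from an internal store by testing named properties. A vector of the selected objects is then passed as input to the command object.

There are two possible tests I want to perform on each object: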

<property> = "value"

<property> like <regexp>

And there are two commands:

print <propertyName> [, <propertyName> [...]]

set <propertyName> = "value" [, <propertyName> = "value" [...] ]

So examples of the syntax would be:

where currency like "GBP|USD" set logging = 1, logfile = "myfile"

where not status = "ok" print ident, errorMessage

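I know this is a big ask, but I wonder whether any Spirit experts could knock out this grammar in a few seconds? I got as far as parsing LIKE and =, but got stuck when trying to mix in AND, OR and NOT. For me, the problem is not knowing where to start when thinking about how Spirit would tackle this.

Best answer

See http://liveworkspace.org/code/3HUzjS for a proof of concept.

The first thing I usually do is picture how I want to store the parsed data.

Data types

I like to stick to standard containers, boost::variant (and sometimes boost::optional). Read from the bottom up to see how simple it is top-down: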

struct regex {
    std::string _pattern;
    explicit regex(std::string const& pattern) : _pattern(pattern) {}
};

typedef boost::variant<double, int, std::string, regex> value;

enum logicOp { logicOr, logicAnd, logicPositive };

struct condition {
    bool          _negated;
    std::string   _propertyname;
    value         _operand;      // value or regex
};

struct filter {
    logicOp   _op;
    condition _cond;
};

struct setcommand {
    typedef std::list<std::pair<std::string, value> > pairs;
    pairs _propvals;
};

struct printcommand {
    std::vector<std::string> _propnames;
};

typedef boost::variant<printcommand, setcommand> command;

struct statement {
    std::vector<filter> _filters;
    command             _command;
};

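Notes:

  • In this case I made an ADT for regex, so the processing code does not need to switch on the operator type (= or like).
  • I assume your values can be int, double or string (or a regex, with "like").
  • I assume you want the filter conditions evaluated from left to right (and/or have no precedence).
  • No assumption is made that the arguments of the set/print commands are unique.
  • I simplified the container of filters by giving the first item a "no-op" logical combinator.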
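
To make the left-to-right assumption and the "no-op" first combinator concrete, here is a minimal sketch of how processing code might fold the filter list over one object. It is not part of the proof of concept: `Object`, `matches` and `selected` are hypothetical names, the operand is simplified to a plain string plus an `_isRegex` flag instead of the `boost::variant`, and `like` is assumed to mean a full match with `std::regex`.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-ins for the AST types, simplified so that the sketch
// needs only the standard library (no boost::variant).
enum logicOp { logicOr, logicAnd, logicPositive };

struct condition {
    bool        _negated;
    std::string _propertyname;
    std::string _operand;   // literal value, or the pattern when _isRegex is set
    bool        _isRegex;
};

struct filter {
    logicOp   _op;
    condition _cond;
};

// Assumption: an object is just a bag of string properties.
typedef std::map<std::string, std::string> Object;

// Test one condition against one object. A missing property counts as
// "no match" before the optional negation is applied.
bool matches(condition const& c, Object const& obj) {
    Object::const_iterator it = obj.find(c._propertyname);
    bool hit = false;
    if (it != obj.end()) {
        hit = c._isRegex
            ? std::regex_match(it->second, std::regex(c._operand))
            : it->second == c._operand;
    }
    return c._negated ? !hit : hit;
}

// Fold the filters strictly left to right: AND and OR have no precedence.
// logicPositive (the WHERE entry) simply takes over the condition's result.
bool selected(std::vector<filter> const& filters, Object const& obj) {
    bool result = false;
    for (std::vector<filter>::const_iterator f = filters.begin(); f != filters.end(); ++f) {
        bool current = matches(f->_cond, obj);
        switch (f->_op) {
            case logicPositive: result = current;           break;
            case logicAnd:      result = result && current; break;
            case logicOr:       result = result || current; break;
        }
    }
    return result;
}
```

Because the first entry carries `logicPositive`, the loop needs no special case for "the first condition"; every entry is handled the same way.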

Grammar

With this target structure in place, writing the grammar becomes relatively straightforward:

using namespace qi;

// no-skipper rules
property_  = alpha >> *alnum;
strlit_    = '"' >> *(  (lit('\\') >> char_) | ~char_('"') ) > '"';

// with-skipper rules
regex_     = strlit_ [ _val = phx::construct<regex>(_1) ];
value_     = double_ | int_ | strlit_;
condition_ = (no_case["NOT"] >> attr(true) | attr(false)) 
    >> property_ 
    >> (
            no_case["LIKE"] >> regex_ | '=' >> value_
       );

print_   = no_case["PRINT"] >> property_ % ',';
set_     = no_case["SET"] >> (property_ >> '=' >> value_) % ',';
command_ = print_ | set_;

filters_ %= +(
        (
           no_case["WHERE"] [ _pass = (phx::size(_val) == 0) ] >> attr(logicPositive)
         | no_case["AND"]   [ _pass = (phx::size(_val) >  0) ] >> attr(logicAnd)
         | no_case["OR"]    [ _pass = (phx::size(_val) >  0) ] >> attr(logicOr)
        ) 
        >> condition_);

statement_ = filters_ >> command_;

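Notes:

  • I decided that strings should be able to contain quotes, so I made \ the escape character.
  • The only "tricky" thing is making sure that the filters (conditions) start with "WHERE" and that every subsequent condition starts with "AND"/"OR". This is done with the semantic action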

    [ _pass = (phx::size(_val) == 0) ]
    

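    which checks, during parsing, whether the resulting list (vector) of filters is empty at that point.

  • The attr(...) idiom is used to get a default value for the optional keyword (NOT). The keyword is optional only in the grammar, not in the AST: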

     no_case["NOT"] >> attr(true) | attr(false)
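
For readers who find the `strlit_` rule hard to read, the following hand-written function sketches the same escaping behaviour with the standard library only. It is an illustration, not Spirit code: `parseStringLiteral` is a hypothetical name, and where the Spirit rule uses the expectation operator `>` (which reports a missing closing quote via an exception), this sketch simply returns false.

```cpp
#include <cstddef>
#include <string>

// Hand-written counterpart of the strlit_ rule (not Spirit code).
// Reads a double-quoted literal starting at `pos`, treating a backslash as
// "take the next character verbatim". On success it stores the unescaped
// text in `out`, moves `pos` past the closing quote and returns true.
// On failure `pos` and `out` are left untouched.
bool parseStringLiteral(std::string const& input, std::size_t& pos, std::string& out) {
    std::size_t i = pos;
    if (i >= input.size() || input[i] != '"') return false;  // no opening quote
    ++i;
    std::string text;
    while (i < input.size() && input[i] != '"') {
        if (input[i] == '\\' && i + 1 < input.size())
            ++i;                 // drop the backslash, keep what follows
        text += input[i++];
    }
    if (i >= input.size()) return false;   // unterminated literal
    out = text;
    pos = i + 1;                 // skip the closing quote
    return true;
}
```

Given the input `"C:\\path" tail`, this yields `C:\path` and leaves `pos` just after the closing quote.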
    

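I put together a demo that uses Spirit Karma to print the AST back. Note that I didn't put much effort into making the syntax round-trip:

  1. the like operator is printed as equality with a regular expression (m/.../)
  2. special characters in string literals are not escaped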
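
If round-tripping of string literals matters, the second point could be addressed by escaping on output. The helper below is a minimal sketch under that assumption. `quoteStringLiteral` is a hypothetical name, and how to hook it into the Karma `strlit_` rule is deliberately left open.

```cpp
#include <string>

// Hypothetical helper: quote a string so that a parser which reads
// "backslash + any character" as that character gets the original back.
// Only the double quote and the backslash itself need a backslash prefix.
std::string quoteStringLiteral(std::string const& raw) {
    std::string quoted = "\"";
    for (std::string::size_type i = 0; i < raw.size(); ++i) {
        if (raw[i] == '"' || raw[i] == '\\')
            quoted += '\\';
        quoted += raw[i];
    }
    quoted += '"';
    return quoted;
}
```

With this, a value such as `C:\path` would be written as `"C:\\path"`, which the parser's string-literal rule accepts again.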

Output of the test program:

parse success: 'where currency like "GBP|USD" set logging = 1, logfile = "myfile"'
parsed: WHERE  currency = m/GBP|USD/ SET logging=1.0, logfile="myfile" 
parse success: 'where not status = "ok" print ident, errorMessage'
parsed: WHERE NOT status = "ok" PRINT ident, errorMessage 
parse success: 'where status = "ok" or not currency like "GBP|USD" print ident, errorMessage'
parsed: WHERE  status = "ok" OR NOT currency = m/GBP|USD/ PRINT ident, errorMessage 
parse success: 'where status = "\"special\"" set logfile = "C:\\path\\to\\logfile.txt"'
parsed: WHERE  status = ""special"" SET logfile="C:\path\to\logfile.txt" 
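
One more detail is visible in the first output line: `logging = 1` comes back as `logging=1.0`. In `value_ = double_ | int_ | strlit_` the `double_` alternative is tried first and also accepts plain integers, so `int_` is never reached. In Spirit itself the usual route is, as far as I know, a `qi::real_parser` with `qi::strict_real_policies<double>` placed before `int_`. The sketch below only illustrates the intended rule with the standard library. `NumberKind` and `classifyNumber` are hypothetical names, and exponents are ignored for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum NumberKind { notANumber, integerNumber, realNumber };

// Skip a run of decimal digits starting at i; returns how many were consumed.
std::size_t skipDigits(std::string const& s, std::size_t& i) {
    std::size_t start = i;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
    return i - start;
}

// Assumed rule (exponents deliberately left out):
//   [sign] digits             -> integer
//   [sign] digits '.' digits  -> real (the '.' is required)
NumberKind classifyNumber(std::string const& token) {
    std::size_t i = 0;
    if (i < token.size() && (token[i] == '+' || token[i] == '-')) ++i;
    if (skipDigits(token, i) == 0) return notANumber;
    if (i == token.size()) return integerNumber;
    if (token[i] != '.') return notANumber;
    ++i;
    if (skipDigits(token, i) == 0 || i != token.size()) return notANumber;
    return realNumber;
}
```

Under this rule `1` stays an integer and only `1.0` becomes a double, which would let the printed statement match the input.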

Full test program

Note: besides the parser, it also contains a generator that prints the parsed AST data types back.

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <iostream>
#include <list>
#include <map>
#include <string>
#include <vector>
#include <boost/fusion/adapted.hpp>
#include <boost/variant.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi    = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phx   = boost::phoenix;

struct regex
{
    std::string _pattern;
    explicit regex(std::string const& pattern) : _pattern(pattern) {}
};

typedef boost::variant<double, int, std::string, regex> value;

enum logicOp { logicOr, logicAnd, logicPositive };

struct condition
{
    bool          _negated;
    std::string   _propertyname;
    value         _operand;      // value or regex
};

struct filter
{
    logicOp   _op;
    condition _cond;
};

struct setcommand
{
    typedef std::list<std::pair<std::string, value> > pairs;
    pairs _propvals;
};

struct printcommand
{
    std::vector<std::string> _propnames;
};

typedef boost::variant<printcommand, setcommand> command;

struct statement
{
    std::vector<filter> _filters;
    command             _command;
};

BOOST_FUSION_ADAPT_STRUCT(regex, (std::string, _pattern))
BOOST_FUSION_ADAPT_STRUCT(printcommand, (std::vector<std::string>, _propnames))
BOOST_FUSION_ADAPT_STRUCT(setcommand, (setcommand::pairs, _propvals))
BOOST_FUSION_ADAPT_STRUCT(condition, (bool, _negated)(std::string, _propertyname)(value, _operand))
BOOST_FUSION_ADAPT_STRUCT(filter, (logicOp, _op)(condition, _cond))
BOOST_FUSION_ADAPT_STRUCT(statement, (std::vector<filter>, _filters)(command, _command))

// see http://stackoverflow.com/a/14206443/85371
namespace boost { namespace phoenix { namespace stl {
    template <typename This, typename Key, typename Value, typename Compare, typename Allocator, typename Index>
        struct at_impl::result<This(std::map<Key,Value,Compare,Allocator>&, Index)>
        { typedef Value & type; };
    template <typename This, typename Key, typename Value, typename Compare, typename Allocator, typename Index>
        struct at_impl::result<This(std::map<Key,Value,Compare,Allocator> const&, Index)>
        { typedef Value const& type; };
}}}

template <typename It, typename Delim>
    struct generator : karma::grammar<It, statement(), Delim>
{
    generator() : generator::base_type(start)
    {
        using namespace karma;

        property_  = karma::string;
        strlit_    = '"'  << karma::string << '"';
        regex_     = "m/" << karma::string << "/";

        value_     = (double_ | int_ | strlit_ | regex_);
        negate_    = eps [ _pass = !_val ] | lit("NOT");

        condition_ = negate_  << property_  << '=' << value_;
        print_     = "PRINT " << property_ % ", ";
        set_       = "SET "   << (property_ << '=' << value_) % ", ";
        command_   = print_ | set_;

        static const auto logicOpNames = std::map<logicOp, std::string> { 
            { logicPositive, "WHERE" },
            { logicAnd, "AND" },
            { logicOr, "OR" } };

        logic_ = string [ _1 = phx::at(phx::cref(logicOpNames), _val) ];

        filters_ = +(logic_ << condition_);

        statement_ = filters_ << command_;

        start = statement_;
    }

  private:
    karma::rule<It, logicOp()            , Delim> logic_;
    karma::rule<It, statement()          , Delim> statement_;
    karma::rule<It, std::vector<filter>(), Delim> filters_;
    karma::rule<It, command()            , Delim> command_;
    karma::rule<It, condition()          , Delim> condition_;
    karma::rule<It, statement()          , Delim> start;
    karma::rule<It, bool()        > negate_;
    karma::rule<It, printcommand()> print_;
    karma::rule<It, setcommand()  > set_;
    karma::rule<It, std::string() > strlit_, property_;
    karma::rule<It, value()       > value_;
    karma::rule<It, regex()       > regex_;
};

template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, statement(), Skipper>
{
    parser() : parser::base_type(start)
    {
        using namespace qi;

        // no-skipper rules
        property_  = alpha >> *alnum;
        strlit_    = '"' >> *(  (lit('\\') >> char_) | ~char_('"') ) > '"';

        // with-skipper rules
        regex_     = strlit_ [ _val = phx::construct<regex>(_1) ];
        value_     = double_ | int_ | strlit_;
        condition_ = (no_case["NOT"] >> attr(true) | attr(false)) 
            >> property_ 
            >> (
                    no_case["LIKE"] >> regex_ | '=' >> value_
               );

        print_   = no_case["PRINT"] >> property_ % ',';
        set_     = no_case["SET"] >> (property_ >> '=' >> value_) % ',';
        command_ = print_ | set_;

        filters_ %= +(
                (
                   no_case["WHERE"] [ _pass = (phx::size(_val) == 0) ] >> attr(logicPositive)
                 | no_case["AND"]   [ _pass = (phx::size(_val) >  0) ] >> attr(logicAnd)
                 | no_case["OR"]    [ _pass = (phx::size(_val) >  0) ] >> attr(logicOr)
                ) 
                >> condition_);

        statement_ = filters_ >> command_;

        start = statement_;
        BOOST_SPIRIT_DEBUG_NODES((start)(condition_)(value_)(strlit_)(regex_)(property_)(statement_)(filters_)(print_)(set_)(command_));
    }

  private:
    qi::rule<It, statement()          , Skipper> statement_;
    qi::rule<It, std::vector<filter>(), Skipper> filters_;
    qi::rule<It, printcommand()       , Skipper> print_;
    qi::rule<It, setcommand()         , Skipper> set_;
    qi::rule<It, command()            , Skipper> command_;
    qi::rule<It, value()              , Skipper> value_, regex_;
    qi::rule<It, condition()          , Skipper> condition_;
    qi::rule<It, statement()          , Skipper> start;
    // lexemes
    qi::rule<It, std::string()> strlit_, property_; // no skipper
};

bool doParse(std::string const& input)
{
    auto f(begin(input)), l(end(input));

    parser<decltype(f), qi::space_type> p;
    statement parsed;

    bool ok = qi::phrase_parse(f,l,p,qi::space,parsed);
    if (ok)   
    {
        std::cout << "parse success: '" << input << "'\n";
        generator<boost::spirit::ostream_iterator, karma::space_type> gen;
        std::cout << "parsed: " << karma::format_delimited(gen, karma::space, parsed) << "\n";
    }
    else      
        std::cerr << "parse failed: '" << std::string(f,l) << "'\n";

    if (f!=l) 
        std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";

    return ok;
}

int main()
{
    doParse("where currency like \"GBP|USD\" set logging = 1, logfile = \"myfile\"");
    doParse("where not status = \"ok\" print ident, errorMessage");
    doParse("where status = \"ok\" or not currency like \"GBP|USD\" print ident, errorMessage");
    // All the extra levels of escaping get a bit ugly here. Of course, you'd be reading from a file/database/etc...
    doParse("where status = \"\\\"special\\\"\" set logfile = \"C:\\\\path\\\\to\\\\logfile.txt\"");
}

Regarding "c++ - Boost Spirit: implementing a small one-line DSL in a server application", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/14548592/