c++ - Boost.Spirit.Qi alternative (|) parser issue

Tags: c++ parsing boost c++14 boost-spirit

I'm writing a Qi parser to parse IRC messages, transcribing RFC 2812. The grammar contains a completely mundane alternative:

auto const hostname = shortname >> *('.' >> shortname);
auto const nickUserHost = nickname >> -(-('!' >> user) >> '@' >> host);

auto const prefix = hostname | nickUserHost;

(Full code on Coliru.)

I was baffled to find that my test string ("D-z!D-z@mib-A3A026FF.rev.sfr.net") matches nickUserHost, but not prefix.

The only remarkable thing I can see is that the host part of nickUserHost is itself defined in terms of hostname, but I'm not sure how that would affect the parsing.

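Best answer

Qi's alternative operator (|) is an ordered choice: it tries its operands left to right and commits to the first one that succeeds, without revisiting that decision if input is left over afterwards. Here hostname matches the leading "D-z" and succeeds, so nickUserHost is never tried, and prefix reports success with most of the input unconsumed.

By appending >> eoi, you explicitly make the parse fail if it did not reach the end of the input. The eoi has to go inside each branch, as in (hostname >> qi::eoi) | (nickUserHost >> qi::eoi), for the second branch to get a chance; (hostname | nickUserHost) >> qi::eoi simply fails on this input. The test program below leaves eoi out and prints whatever input each parser left unconsumed.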

Live On Coliru

#include <string>
#include <iostream>
#include <iomanip>

#include <boost/spirit/include/qi.hpp>

namespace qi = boost::spirit::qi;

template <typename Expr>
void test(std::string name, Expr const& expr) {
    std::string const test = "D-z!D-z@mib-A3A026FF.rev.sfr.net";

    auto f = begin(test);
    bool ok = qi::parse(f, end(test), expr);
    std::cout << name << ": " << ok << "\n";
    if (f != end(test))
        std::cout << " -- remaining input: '" << std::string(f, end(test)) << "'\n";
}

int main() {
    auto const hexdigit = qi::char_("0123456789ABCDEF");
    auto const special = qi::char_("\x5b-\x60\x7b-\x7d");

    auto const oneToThreeDigits = qi::repeat(1, 3)[qi::digit];
    auto const ip4addr = oneToThreeDigits >> '.' >> oneToThreeDigits >> '.' >> oneToThreeDigits >> '.' >> oneToThreeDigits;
    auto const ip6addr = +(hexdigit >> qi::repeat(7)[':' >> +hexdigit]) | ("0:0:0:0:0:" >> (qi::lit('0') | "FFFF") >> ':' >> ip4addr);
    auto const hostaddr = ip4addr | ip6addr;

    auto const nickname = (qi::alpha | special) >> qi::repeat(0, 8)[qi::alnum | special | '-'];
    auto const user = +(~qi::char_("\x0d\x0a\x20\x40"));

    auto const shortname = qi::alnum >> *(qi::alnum | '-');
    auto const hostname = shortname >> *('.' >> shortname);
    auto const host = hostname | hostaddr;

    auto const nickUserHost = nickname >> -(-('!' >> user) >> '@' >> host);

    auto const prefix = hostname | nickUserHost; // The problematic alternative

    std::cout << std::boolalpha;
    test("hostname",     hostname);
    test("nickUserHost", nickUserHost);
    test("prefix",       prefix);
}

Prints

hostname: true
 -- remaining input: '!D-z@mib-A3A026FF.rev.sfr.net'
nickUserHost: true
prefix: true
 -- remaining input: '!D-z@mib-A3A026FF.rev.sfr.net'
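
The ordered-choice behavior behind this output can be reproduced without Spirit. The following is a minimal sketch, not the answer's code: all function names (match_hostname, match_nick_user_host, first_of, to_end, prefix_plain, prefix_anchored) are hypothetical hand-written stand-ins, the grammar is simplified (host is restricted to the hostname form, with no IP addresses), and to_end plays the role that qi::eoi plays in Qi.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Every matcher takes the input and a start position and returns the position
// just past what it consumed, or `fail` when it does not match.
constexpr std::size_t fail = std::string::npos;

bool is_alnum(char c)   { return std::isalnum(static_cast<unsigned char>(c)) != 0; }
bool is_alpha(char c)   { return std::isalpha(static_cast<unsigned char>(c)) != 0; }
bool is_special(char c) { return (c >= 0x5b && c <= 0x60) || (c >= 0x7b && c <= 0x7d); }

// shortname = alnum *(alnum | '-')
std::size_t match_shortname(std::string const& s, std::size_t pos) {
    if (pos >= s.size() || !is_alnum(s[pos])) return fail;
    ++pos;
    while (pos < s.size() && (is_alnum(s[pos]) || s[pos] == '-')) ++pos;
    return pos;
}

// hostname = shortname *('.' shortname)
std::size_t match_hostname(std::string const& s, std::size_t pos) {
    pos = match_shortname(s, pos);
    if (pos == fail) return fail;
    while (pos < s.size() && s[pos] == '.') {
        std::size_t next = match_shortname(s, pos + 1);
        if (next == fail) break;  // leave a dangling '.' unconsumed
        pos = next;
    }
    return pos;
}

// nickUserHost = nickname [ [ '!' user ] '@' hostname ]   (simplified host)
std::size_t match_nick_user_host(std::string const& s, std::size_t pos) {
    if (pos >= s.size() || !(is_alpha(s[pos]) || is_special(s[pos]))) return fail;
    ++pos;
    for (int k = 0; k < 8 && pos < s.size(); ++k) {  // up to 8 further nickname characters
        char c = s[pos];
        if (!(is_alnum(c) || is_special(c) || c == '-')) break;
        ++pos;
    }
    std::size_t nick_end = pos;
    if (pos < s.size() && s[pos] == '!') {  // optional '!' user
        std::size_t u = pos + 1;
        while (u < s.size() && s[u] != '\r' && s[u] != '\n' && s[u] != ' ' && s[u] != '@') ++u;
        if (u > pos + 1) pos = u;  // user needs at least one character
    }
    if (pos < s.size() && s[pos] == '@') {
        std::size_t host_end = match_hostname(s, pos + 1);
        if (host_end != fail) return host_end;
    }
    return nick_end;  // the optional tail did not match; only the nickname is consumed
}

// Ordered choice: take the first alternative that succeeds and never revisit it.
template <typename A, typename B>
std::size_t first_of(std::string const& s, A a, B b) {
    std::size_t end = a(s, 0);
    return end != fail ? end : b(s, 0);
}

// Wrap a matcher so that it only succeeds when it consumes the whole input.
template <typename M>
auto to_end(M m) {
    return [m](std::string const& s, std::size_t pos) {
        std::size_t end = m(s, pos);
        return end == s.size() ? end : fail;
    };
}

// Mirrors `hostname | nickUserHost`.
std::size_t prefix_plain(std::string const& s) {
    return first_of(s, match_hostname, match_nick_user_host);
}

// Mirrors `(hostname >> eoi) | (nickUserHost >> eoi)`.
std::size_t prefix_anchored(std::string const& s) {
    return first_of(s, to_end(match_hostname), to_end(match_nick_user_host));
}
```

With the test string from the question, prefix_plain stops at position 3, right after "D-z", because the hostname branch wins. prefix_anchored rejects that partial hostname match, falls through to the second branch, and consumes the whole string.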

This question, "c++ - Boost.Spirit.Qi alternative (|) parser issue", comes from Stack Overflow: https://stackoverflow.com/questions/34839306/
