c++ - PEGTL - How to skip whitespace for the entire grammar

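I'm trying to parse a very simple language with PEGTL. I think I've found the problem, but I don't understand why: whitespace is not being ignored. I understand that whitespace can't simply be ignored by default, so that indentation-sensitive languages can be parsed as well. But I can't find a mechanism to "eat" whitespace by default. Given: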

struct kw_enum : tao::pegtl::string<'e', 'n', 'u', 'm'> { };
struct enum_decl : tao::pegtl::seq<kw_enum, tao::pegtl::identifier, tao::pegtl::one<';'>> { };

the following fails to parse:

enum thing;

If I explicitly add pegtl::space between every token, it works. But doing that across the whole grammar would be a big burden.

How can I ignore/eat/skip whitespace without specifying it explicitly, as in C?

Best answer

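I don't think there is a shortcut; you have to specify the grammar unambiguously, so that it is explicit where whitespace is allowed and how much of it.

I think the best approach is to add a convenience rule template that matches a list of rules (a tao::pegtl::seq) separated by whatever separator is allowed (typically whitespace plus comments).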

struct comment : tao::pegtl::disable< /* whatever your comment syntax is */ > {};
struct separator : tao::pegtl::sor< tao::pegtl::ascii::space, comment > {}; // either/or
struct seps : tao::pegtl::star< separator > {}; // Any separators, whitespace or comments

// Template to generate rule
// tao::pegtl::seq<Rule0, Separator, Rule1, Separator, Rule2, ... , Separator, RuleN>
template <typename Separator, typename... Rules>
struct interleaved;

template <typename Separator, typename Rule0, typename... RulesRest>
struct interleaved<Separator, Rule0, RulesRest...>
  : tao::pegtl::seq<Rule0, Separator, interleaved<Separator, RulesRest...>> {};

template <typename Separator, typename Rule0>
struct interleaved<Separator, Rule0>
  : Rule0 {};

// Note: interleaved<Separator /*, no Rule! */> intentionally not defined.

struct enum_decl : interleaved<seps, kw_enum, tao::pegtl::identifier, tao::pegtl::one<';'>> {};
// Expands to (each line matches the same input as the one before it):
// seq<kw_enum, seps, interleaved<seps, identifier, one<';'>>> ==
// seq<kw_enum, seps, seq<identifier, seps, interleaved<seps, one<';'>>>> ==
// seq<kw_enum, seps, seq<identifier, seps, one<';'>>> ==
// seq<kw_enum, seps, identifier, seps, one<';'>>
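
Since I haven't compiled the templates against PEGTL, here is a minimal sketch that checks only the first steps of that expansion at compile time. It assumes empty stand-in types (seq, kw_enum, identifier, semicolon, seps) in place of the real PEGTL rules, so it says nothing about PEGTL's own behavior, only about how the two specializations of interleaved are selected.

```cpp
#include <type_traits>

// Stand-ins so the check compiles without PEGTL. These are NOT the PEGTL types.
template <typename... Rules> struct seq {};
struct kw_enum {};
struct identifier {};
struct semicolon {};  // plays the role of one<';'>
struct seps {};

// Same shape as the interleaved template in the answer.
template <typename Separator, typename... Rules>
struct interleaved;

template <typename Separator, typename Rule0, typename... RulesRest>
struct interleaved<Separator, Rule0, RulesRest...>
  : seq<Rule0, Separator, interleaved<Separator, RulesRest...>> {};

// The single-rule case is more specialized, so it ends the recursion.
template <typename Separator, typename Rule0>
struct interleaved<Separator, Rule0> : Rule0 {};

using decl = interleaved<seps, kw_enum, identifier, semicolon>;
using tail = interleaved<seps, identifier, semicolon>;
using last = interleaved<seps, semicolon>;

// Each level derives from exactly the seq shown in the expansion.
static_assert(std::is_base_of<seq<kw_enum, seps, tail>, decl>::value,
              "outer level is seq<kw_enum, seps, ...>");
static_assert(std::is_base_of<seq<identifier, seps, last>, tail>::value,
              "next level is seq<identifier, seps, ...>");
static_assert(std::is_base_of<semicolon, last>::value,
              "a single rule is used as-is, with no trailing separator");
```

Note that the result is a nested seq rather than a flat one; for matching purposes that should make no difference, because a seq inside a seq just continues the sequence.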

Basically, with something like the above in place, you only need to replace tao::pegtl::seq<R...> with interleaved<seps, R...>, and you can even create a separate alias for that:

template<typename... Rules>
using sseq = interleaved<seps, Rules...>;

// Now you only have to replace tao::pegtl::seq with sseq
struct enum_decl : sseq<kw_enum, tao::pegtl::identifier, tao::pegtl::one<';'>> {};

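This strategy is needed not because of whitespace-sensitive languages, but because your parse has no earlier tokenizer step to separate enum from thing. The alternative strategy is to implement that step first and then parse a stream of tokens rather than a stream of characters, but that is a bigger rewrite and has drawbacks of its own.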
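
To make the tokenizer alternative a bit more concrete, here is a rough sketch of what such a first step could look like in plain C++. The tokenize function and its token rules (words made of letters, digits and underscores; every other non-space character as its own token) are assumptions for illustration only and are not part of PEGTL.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split the input into word tokens and
// single-character punctuation tokens, dropping all whitespace.
std::vector<std::string> tokenize(const std::string& input) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace only separates tokens, it is never emitted
        } else if (std::isalpha(c) || c == '_') {
            const std::size_t start = i;
            while (i < input.size() &&
                   (std::isalnum(static_cast<unsigned char>(input[i])) ||
                    input[i] == '_')) {
                ++i;
            }
            tokens.push_back(input.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, input[i]));  // e.g. ";"
            ++i;
        }
    }
    return tokens;
}

// Usage: the parser would then see three tokens and no whitespace at all.
void tokenize_example() {
    const std::vector<std::string> expected{"enum", "thing", ";"};
    assert(tokenize("enum thing;") == expected);
}
```

The grammar would then have to be written over tokens instead of characters, which is the larger rewrite mentioned above.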

Note: I haven't compiled the code, but the point here should be the strategy. Please leave a comment if there is a compile error (I hope I haven't messed up the variadic templates) or if something is unclear.

Note 2: Also you probably don't want to replace every seq with sseq. For example, if you have the logical and (&&) defined as seq<one<'&'>, one<'&'>>, that's something you probably don't want to change.
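
To illustrate Note 2, here is a small self-contained sketch. It uses tiny stand-in rules (one, space, star, seq, each with a static match function) that only imitate the PEGTL rules of the same name; they are assumptions made so the example compiles without the library, and PEGTL's real interface is different. The parses helper is also hypothetical.

```cpp
// Stand-in rules, NOT the PEGTL API. Each rule's match(p) returns the
// position after the match, or nullptr if the rule does not match at p.
template <char C>
struct one {
    static constexpr const char* match(const char* p) {
        return *p == C ? p + 1 : nullptr;
    }
};

struct space {
    static constexpr const char* match(const char* p) {
        return (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
                   ? p + 1 : nullptr;
    }
};

template <typename Rule>
struct star {  // zero or more repetitions; never fails
    static constexpr const char* match(const char* p) {
        for (const char* next = Rule::match(p); next != nullptr;
             next = Rule::match(p)) {
            p = next;
        }
        return p;
    }
};

template <typename... Rules>
struct seq;

template <>
struct seq<> {
    static constexpr const char* match(const char* p) { return p; }
};

template <typename Rule0, typename... Rest>
struct seq<Rule0, Rest...> {
    static constexpr const char* match(const char* p) {
        const char* next = Rule0::match(p);
        return next ? seq<Rest...>::match(next) : nullptr;
    }
};

// interleaved and sseq exactly as in the answer, built on the stand-ins.
template <typename Separator, typename... Rules>
struct interleaved;

template <typename Separator, typename Rule0, typename... RulesRest>
struct interleaved<Separator, Rule0, RulesRest...>
  : seq<Rule0, Separator, interleaved<Separator, RulesRest...>> {};

template <typename Separator, typename Rule0>
struct interleaved<Separator, Rule0> : Rule0 {};

struct seps : star<space> {};

template <typename... Rules>
using sseq = interleaved<seps, Rules...>;

// Hypothetical helper: true if Rule consumes the whole input.
template <typename Rule>
constexpr bool parses(const char* input) {
    const char* end = Rule::match(input);
    return end != nullptr && *end == '\0';
}

using and_strict = seq<one<'&'>, one<'&'>>;   // what you want for &&
using and_loose  = sseq<one<'&'>, one<'&'>>;  // what a blind replace gives

static_assert(parses<and_strict>("&&"), "plain seq accepts &&");
static_assert(!parses<and_strict>("& &"), "plain seq rejects a split operator");
static_assert(parses<and_loose>("& &"), "sseq would accept a split operator");
```

So a mechanical replacement of every seq with sseq would silently start accepting `& &` as the logical and operator, which is the case Note 2 warns about.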

This question and answer, "c++ - PEGTL - How to skip whitespace for the entire grammar", come from a similar question on Stack Overflow: https://stackoverflow.com/questions/53427551/
