c++ - What does 'context string for the given token' mean?

Tags: c++ tokenize

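I'm writing a tokenizer in C++ for an assignment based on the nand2tetris course, and part of the assignment requires a context string. I'm not sure what that means, and I'm looking for a breakdown or some pseudocode or sample code that shows what it means. (It feels like staring at a bookshelf for a book that is right in front of you, but you can't see it because you've been looking for too long!)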

The instructions are:

Generate a context string for the given token. It shows the line before the token, the line containing the token, and a line with a ^ marking the token's position. Tab stops are every 8 characters in the context string, tabs are replaced by spaces (1 to 8) so that the next character starts on an 8 character boundary.

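I know this may be a case of plain English rather than code, but I'm a bit lost, and any help would be legendary, as I'm still very much a beginner at programming.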

I was thinking of something like:

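// Placeholder sketch: the literals stand in for the real line text.
// std::string(...) is needed because two string literals cannot be added with +.
std::string token_context(Token token)
{
    return std::string("previous line\n")
         + "line containing the token\n"
         + "spaces up to the token's column, then the ^ symbol\n";
}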

Best answer

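Think of the context string as what you see in a compiler error message. Its purpose is to show what surrounds the token, that is, its context. The instructions ask for three lines: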

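The line of text immediately before the line that contains the token.
The line of text that contains the token itself.
A line containing a ^. The ^ should sit directly below the start of the actual token.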
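
The three lines above can be sketched as follows. This is only a minimal sketch, not the assignment's actual interface: the name make_context and its parameters (the previous line, the current line, and a zero-based column for the token) are assumptions made for illustration, and both lines are assumed to contain no tab characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the assignment's API): builds the
// three-line context from text that contains no tab characters.
// `column` is the zero-based offset of the token within `current`.
std::string make_context(const std::string& previous,
                         const std::string& current,
                         std::size_t column)
{
    assert(column <= current.size());     // the caret must fall within the line
    std::string caret_line(column, ' ');  // pad with spaces up to the token
    caret_line += '^';
    return previous + "\n" + current + "\n" + caret_line + "\n";
}
```

For example, make_context("int x;", "x = y +;", 7) produces the two source lines followed by a line of seven spaces and a ^, which puts the caret under the final semicolon.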

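The part about tabs is there to help you put the ^ in the right place. Basically, it says that a tab acts like a variable number of spaces: it expands to however many spaces (1 to 8) are needed so that the next character starts at a column that is a multiple of 8. For example, "ab\tc" should be treated the same as "ab" followed by six spaces and then "c". The tab (\t) is the third character, so it acts like 6 spaces, and c ends up at column 8 counting from zero (the ninth character of the expanded string).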
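
The tab rule can be sketched in code as well. The helper names expand_tabs and expanded_column below are hypothetical, chosen for illustration rather than taken from the assignment, and columns are assumed to be counted from zero.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replaces each tab with 1 to `tab_width` spaces so
// that the next character starts on a multiple of `tab_width`.
std::string expand_tabs(const std::string& line, std::size_t tab_width = 8)
{
    assert(tab_width > 0);
    std::string out;
    for (char ch : line) {
        if (ch == '\t') {
            // Spaces needed to reach the next tab stop (always at least 1).
            std::size_t pad = tab_width - (out.size() % tab_width);
            out.append(pad, ' ');
        } else {
            out += ch;
        }
    }
    return out;
}

// Hypothetical helper: maps a zero-based index in the raw line to the
// zero-based column it occupies once the tabs before it are expanded.
std::size_t expanded_column(const std::string& line, std::size_t index,
                            std::size_t tab_width = 8)
{
    assert(index <= line.size());
    return expand_tabs(line.substr(0, index), tab_width).size();
}
```

With these, expand_tabs("ab\tc") gives "ab", six spaces, then "c", and expanded_column("ab\tc", 3) is 8, which is the number of spaces to print before the ^ when the token starts at c.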

Source: a similar question on Stack Overflow about "c++ - What does 'context string for the given token' mean?": https://stackoverflow.com/questions/55755345/
