strtol
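The specification conceptually divides the input string into "initial white space", a "subject sequence", and a "final string", and defines the "subject sequence" as: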
the longest initial subsequence of the input string, starting with the first non-white-space character that is of the expected form. The subject sequence shall contain no characters if the input string is empty or consists entirely of white-space characters, or if the first non-white-space character is other than a sign or a permissible letter or digit.
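At one time I thought the "longest initial subsequence" business worked the way scanf does, where "0x@" would scan as "0x", a matching failure, followed by "@" as the next unread character. However, after some discussion, I am mostly convinced that strtol processes the longest initial subsequence that is of the expected form, not the longest initial string that is an initial subsequence of some possible string of the expected form.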
What still confuses me is this language in the specification:
If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of str is stored in the object pointed to by endptr, provided that endptr is not a null pointer.
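If we accept what seems to be the correct definition of "subject sequence", there is no such thing as a non-empty subject sequence that does not have the expected form, and (to avoid redundancy and confusion) the text should instead simply read: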
If the subject sequence is empty, no conversion is performed; the value of str is stored in the object pointed to by endptr, provided that endptr is not a null pointer.
Can anyone clarify these issues for me? Links to past discussions or to any relevant defect reports would be useful.
Best answer
I think the C99 wording is quite clear:
The subject sequence is defined as the longest initial subsequence of the input string, starting with the first non-white-space character, that is of the expected form.
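Given "0x@": "0x@" is not of the expected form; "0x" is not of the expected form; therefore "0" is the longest initial subsequence that is of the expected form.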
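I agree that this implies you cannot have a non-empty subject sequence that is not of the expected form, unless you interpret the following: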
In other than the "C" locale, additional locale-specific subject sequence forms may be accepted.
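...as allowing a locale to define additional forms that the subject sequence may take, but which are nevertheless not the "expected form".
The wording in that final paragraph seems to be just "belt and braces".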
A similar question about the confusing language in the specification of strtol and related functions can be found on Stack Overflow: https://stackoverflow.com/questions/6701089/