According to N4140 (a C++14 working draft):
The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined. (§6.6.1-1; p.48)
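I would think that only 8 bits are needed to contain all members of "the eight-bit code units of the Unicode UTF-8 encoding form". Are more bits needed to contain all members of the "basic execution character set"? Why can CHAR_BIT be 8 in so many implementations?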
Best answer
The basic execution character set is defined as follows (emphasis mine):
The basic execution character set and the basic execution wide-character set shall each contain all the members of the basic source character set, plus control characters representing alert, backspace, and carriage return, plus a null character (respectively, null wide character), whose value is 0. For each basic execution character set, the values of the members shall be non-negative and distinct from one another. In both the source and execution basic character sets, the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous. The execution character set and the execution wide-character set are implementation-defined supersets of the basic execution character set and the basic execution wide-character set, respectively. The values of the members of the execution character sets and the sets of additional members are locale-specific.
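Two of the guarantees in this quote can be checked at compile time: the null character has value 0, and the decimal digits have consecutive values. The following is only a minimal sketch; the names kDigits, digits_are_contiguous and digit_value are made up for illustration and do not come from the standard.

```cpp
// Hypothetical helper names; only the guarantees themselves come from the quote.
constexpr char kDigits[] = "0123456789";

// True if each digit after '0' is one greater than the previous digit.
constexpr bool digits_are_contiguous(int i = 1) {
    return i == 10 ? true
                   : (kDigits[i] - kDigits[i - 1] == 1) &&
                         digits_are_contiguous(i + 1);
}

static_assert('\0' == 0, "the null character has value 0");
static_assert(digits_are_contiguous(), "digits have consecutive values");

// Because of the guarantee above, this conversion works in any encoding.
constexpr int digit_value(char c) {
    return c - '0';
}
```

If either static_assert failed, the implementation would not conform to the quoted paragraph, so on a conforming compiler this should simply compile.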
And the basic source character set is this:
The basic source character set consists of 96 characters: the space character, the control characters representing horizontal tab, vertical tab, form feed, and new-line, plus the following 91 graphical characters:
a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
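Note the distinction between the basic execution character set, which the standard defines, and the execution character set, which is implementation-defined. The former contains only 100 characters (the 96 basic source characters plus alert, backspace, carriage return, and the null character), and whatever encoding is used for them fits comfortably in 8 bits.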
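The count can be reproduced with a short sketch. The function names below (basic_execution_chars, all_distinct_and_non_negative, bits_needed) are hypothetical and chosen only for this illustration. The result, 7 bits, is the minimum needed to give 100 characters distinct codes; the actual code values depend on the implementation's encoding and may use the eighth bit.

```cpp
#include <cstddef>
#include <set>
#include <string>

// The 96 basic source characters plus the 4 additional members of the
// basic execution character set, as listed in the quoted paragraphs.
std::string basic_execution_chars() {
    std::string s =
        " \t\v\f\n"                        // space and 4 control characters
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789"
        "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";  // 29 punctuation characters
    s += "\a\b\r";                         // alert, backspace, carriage return
    s += '\0';                             // null character
    return s;
}

// Checks the requirement that all values are non-negative and distinct.
bool all_distinct_and_non_negative(const std::string& s) {
    std::set<char> seen;
    for (char c : s) {
        if (c < 0) return false;
        if (!seen.insert(c).second) return false;
    }
    return true;
}

// Smallest number of bits that can represent `count` distinct values.
unsigned bits_needed(std::size_t count) {
    unsigned bits = 0;
    while ((std::size_t{1} << bits) < count) ++bits;
    return bits;
}
```

Calling bits_needed(basic_execution_chars().size()) gives the lower bound on the byte width that the basic execution character set alone would impose.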
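The paragraph quoted in the question must also be read with care. A byte needs to be large enough to hold either the encoding of a character from the basic execution character set or a UTF-8 code unit. The former encoding may be (and usually is) a subset of the latter, but even when it is not, 8 bits are enough.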
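The resulting lower bound can be stated as compile-time checks. This is a sketch under the assumption that a hosted implementation provides <climits>; fits_in_byte is a hypothetical helper name used only here.

```cpp
#include <climits>
#include <limits>

// A byte must be able to hold any 8-bit UTF-8 code unit, so it has at
// least 8 bits and unsigned char can represent at least 0..255.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(std::numeric_limits<unsigned char>::max() >= 0xFF,
              "unsigned char can hold every 8-bit code unit");

// True if `value` is representable in one byte on this implementation.
constexpr bool fits_in_byte(unsigned long value) {
    return value <= std::numeric_limits<unsigned char>::max();
}
```

The standard sets only this minimum. An implementation is free to choose a larger CHAR_BIT, but since nothing in the quoted requirements needs more than 8 bits, 8 is a natural choice.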
For "c++ - Why is CHAR_BIT usually 8?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49766777/