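As of the C++0x working draft, the new character types for handling Unicode (char16_t and char32_t) will be unsigned (uint_least16_t and uint_least32_t will be their underlying types).
But as far as I can see (which is perhaps not very far), no char8_t type (based on uint_least8_t) is defined. Why?
It gets even more confusing when you see that a new u8 encoding prefix is introduced for UTF-8 string literals, yet it is based on good old (signed/unsigned) char. Why?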
Update: There is now a proposal to add a new type, char8_t:
char8_t: A type for UTF-8 characters and strings (Revision 1) http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0482r1.html
Best answer
char will be the type used for UTF-8, because its definition has been changed to ensure that it can hold UTF-8 code units:
For the purpose of enhancing support for Unicode in C++ compilers, the definition of the type char has been modified to be both at least the size necessary to store an eight-bit coding of UTF-8 and large enough to contain any member of the compiler's basic execution character set. It was previously defined as only the latter. There are three Unicode encodings that C++0x will support: UTF-8, UTF-16, and UTF-32. In addition to the previously noted changes to the definition of char, C++0x will add two new character types: char16_t and char32_t. These are designed to store UTF-16 and UTF-32 respectively.
Source: http://en.wikipedia.org/wiki/C%2B%2B0x
Most UTF-8 applications on PC/Mac already use char.
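To make the signedness concern from the question concrete, here is a minimal sketch, assuming a C++11 or later compiler. The helper names (code_unit_value, is_continuation_byte, count_code_points) are hypothetical and exist only for this illustration. Because plain char may be signed, the sketch converts each code unit to unsigned char before inspecting its bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// char16_t and char32_t are unsigned, and they are distinct types,
// not typedefs of their underlying uint_least16_t / uint_least32_t.
static_assert(std::is_unsigned<char16_t>::value, "char16_t is unsigned");
static_assert(std::is_unsigned<char32_t>::value, "char32_t is unsigned");
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is a distinct type");

// Element type of a u8 literal: char in C++11/14/17, char8_t from C++20 on.
// Either way a UTF-8 code unit occupies exactly one byte.
using U8Unit = std::remove_cv<
    std::remove_reference<decltype(u8"x"[0])>::type>::type;
static_assert(sizeof(U8Unit) == 1, "a UTF-8 code unit is one byte");

// Hypothetical helper: value of a code unit in the range 0..255,
// regardless of whether plain char is signed on this platform.
inline unsigned code_unit_value(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: true for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool is_continuation_byte(char c) {
    return (static_cast<unsigned char>(c) & 0xC0u) == 0x80u;
}

// Hypothetical helper: counts code points in a well-formed UTF-8 string
// by counting every byte that is not a continuation byte.
inline std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (char c : s) {
        if (!is_continuation_byte(c)) {
            ++n;
        }
    }
    return n;
}
```

For example, count_code_points("\xC3\xA9") treats the two bytes of "é" as a single code point, while comparing the raw char against 0x80 without the cast would give different results on platforms where char is signed.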
This question, "Signedness of char and Unicode in C++0x", originally appeared on Stack Overflow: https://stackoverflow.com/questions/2391268/