I am building an API that lets me fetch strings in various encodings, including utf8, utf16, utf32 and wchar_t (which may be utf32 or utf16 depending on the OS).
The new C++ standard introduces the types char16_t and char32_t, which do not have this size ambiguity and should be used in the future, so I would like to support them as well. The question is: would they interfere with the ordinary uint16_t, uint32_t and wchar_t types and prevent overloading, because they may refer to the same type?

    #include <cstdint>
    #include <string>

    class some_class {
    public:
        void set(std::string);                  // utf8 string
        void set(std::wstring);                 // wchar string: utf16 or utf32,
                                                // according to sizeof(wchar_t)
        void set(std::basic_string<uint16_t>);  // wchar-independent utf16 string
        void set(std::basic_string<uint32_t>);  // wchar-independent utf32 string
    #ifdef HAVE_NEW_UNICODE_CHARRECTERS
        void set(std::basic_string<char16_t>);  // new standard utf16 string
        void set(std::basic_string<char32_t>);  // new standard utf32 string
    #endif
    };
So that I can just write:

    foo.set(U"Some utf32 String");
    foo.set(u"Some utf16 string");
What are the typedefs for std::basic_string<char16_t> and std::basic_string<char32_t>, analogous to today's typedef basic_string<wchar_t> wstring?
I cannot find any reference.
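Edit: according to the headers of gcc-4.4, which introduced these new types:

    typedef basic_string<char16_t> u16string;
    typedef basic_string<char32_t> u32string;

I just want to make sure that this is an actual standard requirement and not a gcc-ism.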
Best answer
1) char16_t and char32_t will be distinct new types, so overloading on them will be possible.
Quoting ISO/IEC JTC1 SC22 WG21 N2018:

"Define char16_t to be a typedef to a distinct new type, with the name _Char16_t that has the same size and representation as uint_least16_t. Likewise, define char32_t to be a typedef to a distinct new type, with the name _Char32_t that has the same size and representation as uint_least32_t."
Further explanation (from the devx.com article "Prepare Yourself for the Unicode Revolution"):

"You're probably wondering why the _Char16_t and _Char32_t types and keywords are needed in the first place when the typedefs uint_least16_t and uint_least32_t are already available. The main problem that the new types solve is overloading. It's now possible to overload functions that take _Char16_t and _Char32_t arguments, and create specializations such as std::basic_string<_Char16_t> that are distinct from std::basic_string<wchar_t>."
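
The distinctness described in both quotes can be checked at compile time. The following is a minimal sketch, assuming a C++11 or later compiler; the overloaded function kind is a hypothetical name introduced only for illustration and does not come from the question or the quoted papers.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// char16_t and char32_t are distinct types, not aliases for the <cstdint>
// typedefs or for wchar_t, even when they share size and representation.
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is not uint_least16_t");
static_assert(!std::is_same<char32_t, std::uint_least32_t>::value,
              "char32_t is not uint_least32_t");
static_assert(!std::is_same<char16_t, wchar_t>::value,
              "char16_t is not wchar_t");
static_assert(!std::is_same<char32_t, wchar_t>::value,
              "char32_t is not wchar_t");
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t),
              "same size as uint_least16_t");
static_assert(sizeof(char32_t) == sizeof(std::uint_least32_t),
              "same size as uint_least32_t");

// Hypothetical overload set: because the character types are distinct,
// all of these declarations can coexist without redefinition errors.
inline const char* kind(const char*)                { return "char"; }
inline const char* kind(const wchar_t*)             { return "wchar_t"; }
inline const char* kind(const char16_t*)            { return "char16_t"; }
inline const char* kind(const char32_t*)            { return "char32_t"; }
inline const char* kind(const std::uint_least16_t*) { return "uint_least16_t"; }
```

With these overloads, a call such as kind(u"x") selects the char16_t version and kind(U"x") selects the char32_t version, while a pointer to a std::uint_least16_t buffer still reaches its own overload.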
2) u16string and u32string are indeed part of C++0x and not just a GCC-ism, as they are mentioned in various standard draft papers. They will be included in the new <string> header. Quoting the same article:
"The Standard Library will also provide _Char16_t and _Char32_t typedefs, in analogy to the typedefs wstring, wcout, etc., for the following standard classes: filebuf, streambuf, streampos, streamoff, ios, istream, ostream, fstream, ifstream, ofstream, stringstream, istringstream, ostringstream, string"
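
As an illustration of the string typedefs, here is a small sketch assuming a C++11 or later standard library. It checks that std::u16string and std::u32string from <string> name the basic_string specializations asked about, and then uses them in a cut-down version of the question's some_class. The last() accessor and the last_ member are hypothetical additions so that the selected overload can be observed.

```cpp
#include <string>
#include <type_traits>

// The <string> typedefs name exactly these basic_string specializations.
static_assert(std::is_same<std::u16string, std::basic_string<char16_t>>::value,
              "u16string is basic_string<char16_t>");
static_assert(std::is_same<std::u32string, std::basic_string<char32_t>>::value,
              "u32string is basic_string<char32_t>");

// Cut-down version of the question's class. last() and last_ are
// hypothetical additions that record which overload was called.
class some_class {
public:
    void set(const std::string&)    { last_ = "utf8"; }
    void set(const std::wstring&)   { last_ = "wchar"; }
    void set(const std::u16string&) { last_ = "utf16"; }
    void set(const std::u32string&) { last_ = "utf32"; }
    const char* last() const { return last_; }

private:
    const char* last_ = "none";
};
```

Given an instance foo, the calls from the question, foo.set(U"Some utf32 String") and foo.set(u"Some utf16 string"), each resolve to a single overload, because a u"" or U"" literal converts only to the basic_string with the matching character type.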
Regarding "c++ - New unicode characters in C++0x", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/872491/