c - Unicode vs. multibyte

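I'm really confused by this Unicode vs. multibyte business.

Say I'm compiling my program with Unicode (though ultimately I'd like a solution that is independent of the character set in use).

1) Will all 'char' be interpreted as wide characters?

2) If I have a simple printf statement, i.e. printf("Hello World\n"); with no character strings, can I just leave it as is without using _tprintf and _T("...")? If the printf statement includes a character string, should I then use _tprintf and _T("..."), i.e. _tprintf("Hello %s\n", name); ?

3) If I have a text file (saved in the default format, i.e. without changing the default character set) that I want to read into a buffer, can I still use char instead of TCHAR? Especially if I'm reading it character by character, i.e. by incrementing a character pointer?

Thanks.

Regards, Rayne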

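Best answer

First off, if you are compiling with UNICODE/_UNICODE defined and don't intend to target other platforms, you can skip the TCHAR business entirely and use WCHAR (or wchar_t) and the W functions everywhere.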

1) Will all 'char' be interpreted as wide characters?

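A char in C is, by definition, 1 byte. (Technically this doesn't rule out char being a "wide character" on a platform where wchar_t is also 1 byte, but since you are using MSVC and targeting Windows, that won't be the case.)

So, for practical purposes, the answer is: no.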
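As a rough illustration of the point above, here is a small sketch comparing the storage of narrow and wide literals. The helper names are made up for this example. The only thing the C standard guarantees is that sizeof(char) is 1; the size of wchar_t is platform-dependent (2 bytes with MSVC, commonly 4 elsewhere).

```c
#include <stddef.h>  /* size_t, wchar_t */

/* Hypothetical helpers, named only for this sketch. */

/* Always 1: the C standard defines sizeof(char) as 1. */
size_t narrow_char_size(void) { return sizeof(char); }

/* Platform-dependent: 2 with MSVC on Windows, often 4 elsewhere. */
size_t wide_char_size(void) { return sizeof(wchar_t); }

/* "Hi" is an array of 3 char: 'H', 'i', '\0'. */
size_t narrow_literal_bytes(void) { return sizeof("Hi"); }

/* L"Hi" is an array of 3 wchar_t, so it occupies 3 * sizeof(wchar_t) bytes. */
size_t wide_literal_bytes(void) { return sizeof(L"Hi"); }
```

Compiling with UNICODE defined does not change any of these values. A plain "..." literal stays an array of char; only TCHAR and _T("...") switch meaning.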

2) If I have a simple printf statement, i.e. printf("Hello World\n"); with no character strings, can I just leave it be without using _tprintf and _T("...")? If the printf statement includes a character string, then I should use _tprintf and _T("..."), i.e. _tprintf("Hello %s\n", name); ?

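If you are printing ASCII string literals, you can keep using printf.

If you are printing arbitrary strings that may contain characters outside the ASCII range, you should use _tprintf (or wprintf).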
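To make the distinction concrete, here is a minimal sketch using the standard buffer-formatting counterparts of printf and wprintf (snprintf and swprintf), so the result can be inspected without writing to the console. The function names are invented for this example, and it uses standard C rather than the <tchar.h> macros.

```c
#include <stdio.h>   /* snprintf */
#include <wchar.h>   /* swprintf, wchar_t */

/* Hypothetical helper: narrow path, fine for plain ASCII text. */
int format_greeting_narrow(char *out, size_t cap, const char *name)
{
    return snprintf(out, cap, "Hello %s\n", name);
}

/* Hypothetical helper: wide path. The format string is a wide literal,
 * and %ls is the standard C conversion for a wchar_t string argument. */
int format_greeting_wide(wchar_t *out, size_t cap, const wchar_t *name)
{
    return swprintf(out, cap, L"Hello %ls\n", name);
}
```

One thing to watch: in MSVC's wide printf family, %s is treated as a wide string by default, whereas standard C treats %s as a narrow string even in wprintf. %ls means a wide string in both, so it is the safer spelling in code that may be built elsewhere.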

3) If I have a text file (saved in the default format, i.e. without changing the default character set used) that I want to read into a buffer, can I still use char instead of TCHAR? Especially if I'm reading it character by character, i.e. by incrementing the character pointer?

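What is the "default format"?

When you read in an external file, you should first read the first few bytes to check for a UTF-16 or UTF-8 BOM, and then decide how to handle the data based on that.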
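The BOM check could be sketched roughly as follows. The enum and function names are invented for this example; the byte sequences are the standard ones (EF BB BF for UTF-8, FF FE for UTF-16 little-endian, FE FF for UTF-16 big-endian). The sketch assumes the caller has already read the first bytes of the file into a buffer.

```c
#include <stddef.h>  /* size_t */

/* Hypothetical result type for this sketch. */
typedef enum { BOM_NONE, BOM_UTF8, BOM_UTF16_LE, BOM_UTF16_BE } bom_kind;

/* Inspect the first bytes of a buffer and report which BOM, if any, it starts with.
 * Only UTF-8 and UTF-16 are checked; a UTF-32 LE file (FF FE 00 00) would be
 * reported as UTF-16 LE by this simplified version. */
bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;
    return BOM_NONE;  /* no BOM: the encoding must be determined some other way */
}
```

A BOM_NONE result does not by itself mean the file is ASCII or in the system code page. Many UTF-8 files are written without a BOM, so the caller still needs a fallback policy. For a file in a single-byte or UTF-8 encoding, reading into a char buffer is appropriate; for UTF-16, the data has to be read as 16-bit units instead.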

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/2226528/
