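python - Passing a Unicode string to printf via ctypes

I am testing the built-in ctypes module in Python 3.x before spending time writing a wrapper for my existing C library.

I know that the standard library functions in C expect ASCII input for anything marked as char * in the manual. However, my library is UTF-8 compatible, and I have tested it in C programs. I also tested that the following code compiles as C11 and works as expected: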

printf("Hello, %s!\n", u8"world");

However, if I try the same thing in Python, only the first character of the string is printed.

from ctypes import *

libc = CDLL("libc.so.6")

libc.printf(b"Hello, %s!\n", "world") # will print: Hello, w!

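The Python 3 manual on Unicode implies that Python 3 uses UTF-8 as its character encoding, which should avoid the embedded NUL bytes that printf would see and stop reading at. If I change %s to %ls in the Python test, it prints as expected.

Is Python actually using UTF-16?

Best answer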

Python 3 (before 3.3) uses UCS-2 or UCS-4 internally, per the docs:

Strings are stored internally as sequences of codepoints (to be precise as Py_UNICODE arrays). Depending on the way Python is compiled (either via --without-wide-unicode or --with-wide-unicode, with the former being the default) Py_UNICODE is either a 16-bit or 32-bit data type.

Py_UNICODE

This type represents the storage type which is used by Python internally as basis for holding Unicode ordinals. Python’s default builds use a 16-bit type for Py_UNICODE and store Unicode values internally as UCS2. It is also possible to build a UCS4 version of Python (most recent Linux distributions come with UCS4 builds of Python). These builds then use a 32-bit type for Py_UNICODE and store Unicode data internally as UCS4.
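To connect this to the symptom in the question: ctypes passes a Python str to a foreign function as a wchar_t * and a bytes object as a char *. The sketch below is only an illustration under a few assumptions (a Unix-like system, a little-endian CPU, and a libc that find_library can locate or that is already loaded into the process). It first inspects the memory of a wide-character buffer to show why %s stops after "w", then passes UTF-8 encoded bytes so that %s receives a real char * string.

```python
import ctypes
import ctypes.util

# Assumption: Unix-like system. If find_library returns None, CDLL(None)
# opens the running process, which already has libc loaded.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# 1. A str reaches C as an array of wchar_t (2 or 4 bytes per character).
wide = ctypes.create_unicode_buffer("world")
raw = ctypes.string_at(ctypes.addressof(wide), ctypes.sizeof(wide))

# Reading the same memory as a NUL-terminated char * stops at the first
# zero byte. On a little-endian machine that byte directly follows "w".
as_char_p = ctypes.string_at(ctypes.addressof(wide))

print("wchar_t size:", ctypes.sizeof(ctypes.c_wchar))
print("raw bytes:   ", raw)
print("seen by %s:  ", as_char_p)

# 2. Encode explicitly so that %s receives a UTF-8 char * string.
utf8_arg = "world".encode("utf-8")
written = libc.printf(b"Hello, %s!\n", utf8_arg)

# C stdio buffers separately from Python's sys.stdout, so flush it.
libc.fflush(None)
```

Keeping %ls and passing the str directly, as the question already found, is the other option. Encoding to bytes is usually the better fit for a library that expects UTF-8 in its char * parameters.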

This question and its answer come from Stack Overflow: https://stackoverflow.com/questions/27470027/
