python - Passing a string from C to Python for multiprocessing without making an extra copy

Tags: python multiprocessing python-c-api

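I have a C application that embeds the Python 2.7 interpreter. At some point in my program, a potentially large string (char*) is generated that needs to be processed by some Python code. I call the Python function with PyObject_CallFunction and pass the string as an argument. That Python function then uses the multiprocessing library to analyze the data in a separate process.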
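For context, the Python side of such a setup might look roughly like the sketch below. The names `analyze` and `count_lines` are hypothetical stand-ins (the question does not show its Python code), and counting lines is only a placeholder for the real analysis. One thing the sketch makes visible: `Pool.apply` pickles its arguments to hand them to the worker process, so whatever object the C code passes in has to survive a pickle round trip.

```python
import multiprocessing

# Assumption: prefer the "fork" start method where it can be selected.
# Python 2.7 (the version in the question) has no get_context(), and on
# Unix it always forks, so fall back to the plain module there.
try:
    _mp = multiprocessing.get_context("fork")
except (AttributeError, ValueError):
    _mp = multiprocessing


def count_lines(data):
    """Placeholder analysis step; runs in the worker process."""
    return data.count(b"\n")


def analyze(data):
    """Hypothetical entry point that the C code would call with the string."""
    pool = _mp.Pool(processes=1)
    try:
        # apply() pickles `data`, sends it to the worker, and blocks
        # until the worker returns its result.
        return pool.apply(count_lines, (data,))
    finally:
        pool.close()
        pool.join()
```

Under these assumptions, the C code would pass `analyze` as the callable to PyObject_CallFunction, exactly as described above.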

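Passing the string to the Python function creates a copy of the data in a Python str object. I tried to avoid this extra copy by passing a buffer object to the Python function instead. Unfortunately, this produces an error in the multiprocessing worker process during unpickling: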

TypeError: buffer() takes at least 1 argument (0 given)

It seems that buffer objects can be pickled but not unpickled.

Any suggestions for passing the char* from C to the multiprocessing function without making an extra copy?

Best answer

An approach that worked for me:

Before creating the large C string, let Python allocate the memory for it:

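/* new reference; contents are uninitialized; returns NULL on failure */
PyObject *pystr = PyString_FromStringAndSize(NULL, size);
/* pointer to the str object's internal buffer (no copy is made) */
char *str = PyString_AS_STRING(pystr);
/* now fill <str> with <size> bytes */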

That way, when the time comes to pass it to Python, you do not have to create a copy:

PyObject *result = PyObject_CallFunctionObjArgs(callable, pystr, NULL);
/* or PyObject_CallFunction(callable, "O", pystr) if you prefer */
/* result is a new reference, or NULL if the call raised an exception */

Note that once you have done this, you should not modify the string: Python code treats str objects as immutable.

A similar question about passing a string from C to Python for multiprocessing without making an extra copy can be found on Stack Overflow: https://stackoverflow.com/questions/9931749/
