Python calling Rust FFI via ctypes crashes at exit with "pointer being freed was not allocated"

Tags: python rust ctypes ffi

I'm trying to free memory that was allocated for a CString and passed to Python using ctypes. However, Python crashes with a malloc error:

python(30068,0x7fff73f79000) malloc: *** error for object 0x103be2490: pointer being freed was not allocated 

These are the Rust functions I use to pass the pointer to ctypes:

#[repr(C)]
pub struct Array {
    pub data: *const c_void,
    pub len: libc::size_t,
}

// Build &mut[[f64; 2]] from an Array, so it can be dropped
impl<'a> From<Array> for &'a mut [[f64; 2]] {
    fn from(arr: Array) -> Self {
        unsafe { slice::from_raw_parts_mut(arr.data as *mut [f64; 2], arr.len) }
    }
}

// Build an Array from a Vec, so it can be leaked across the FFI boundary
impl<T> From<Vec<T>> for Array {
    fn from(vec: Vec<T>) -> Self {
        let array = Array {
            data: vec.as_ptr() as *const libc::c_void,
            len: vec.len() as libc::size_t,
        };
        mem::forget(vec);
        array
    }
}

// Build a Vec from an Array, so it can be dropped
impl From<Array> for Vec<[f64; 2]> {
    fn from(arr: Array) -> Self {
        unsafe { Vec::from_raw_parts(arr.data as *mut [f64; 2], arr.len, arr.len) }
    }
}

// Decode an Array into a Polyline
impl From<Array> for String {
    fn from(incoming: Array) -> String {
        let result: String = match encode_coordinates(&incoming.into(), 5) {
            Ok(res) => res,
            // we don't need to adapt the error
            Err(res) => res
        };
        result
    }
}

#[no_mangle]
pub extern "C" fn encode_coordinates_ffi(coords: Array) -> *mut c_char {
    let s: String = coords.into();
    CString::new(s).unwrap().into_raw()
}

And the one I use to free the pointer when Python hands it back:

pub extern "C" fn drop_cstring(p: *mut c_char) {
    unsafe { CString::from_raw(p) };
}

And the Python function I use to convert the pointer to a str:

def char_array_to_string(res, _func, _args):
    """ restype is c_void_p to prevent automatic conversion to str
    which loses pointer access

    """
    converted = cast(res, c_char_p)
    result = converted.value
    drop_cstring(converted)
    return result

The Python class I use to build the Array struct that is passed to Rust:

class _FFIArray(Structure):
    """
    Convert sequence of float lists to a C-compatible void array
    example: [[1.0, 2.0], [3.0, 4.0]]

    """
    _fields_ = [("data", c_void_p),
                ("len", c_size_t)]

    @classmethod
    def from_param(cls, seq):
        """  Allow implicit conversions """
        return seq if isinstance(seq, cls) else cls(seq)

    def __init__(self, seq, data_type = c_double):
        arr = ((c_double * 2) * len(seq))()
        for i, member in enumerate(seq):
            arr[i][0] = member[0]
            arr[i][1] = member[1]
        self.data = cast(arr, c_void_p)
        self.len = len(seq)

The argtypes and restype definitions:

encode_coordinates = lib.encode_coordinates_ffi
encode_coordinates.argtypes = (_FFIArray,)
encode_coordinates.restype = c_void_p
encode_coordinates.errcheck = char_array_to_string

drop_cstring = lib.drop_cstring
drop_cstring.argtypes = (c_char_p,)
drop_cstring.restype = None

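I'm inclined to think the Rust functions are not at fault, because a crash inside the dylib would cause a segfault (and the FFI tests pass on the Rust side). I can also carry on with other work in Python after calling the FFI functions; the malloc error only appears when the process exits.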

Best answer

Thanks to the effort made in J.J. Hakala's answer, I was able to produce an MCVE in pure Rust:

extern crate libc;

use std::ffi::CString;
use libc::c_void;

fn encode_coordinates(coordinates: &Vec<[f64; 2]>) -> String {
    format!("Encoded coordinates {:?}", coordinates)
}

struct Array {
    data: *const c_void,
    len: libc::size_t,
}

impl From<Array> for Vec<[f64; 2]> {
    fn from(arr: Array) -> Self {
        unsafe { Vec::from_raw_parts(arr.data as *mut [f64; 2], arr.len, arr.len) }
    }
}

impl From<Array> for String {
    fn from(incoming: Array) -> String {
        encode_coordinates(&incoming.into())
    }
}

fn encode_coordinates_ffi(coords: Array) -> CString {
    CString::new(String::from(coords)).unwrap()
}

fn main() {
    for _ in 0..10 {
        let i_own_this = vec![[1.0, 2.0], [3.0, 4.0]];

        let array = Array {
            data: i_own_this.as_ptr() as *const _,
            len: i_own_this.len(),
        };

        println!("{:?}", encode_coordinates_ffi(array))
    }
}

This prints:

"Encoded coordinates [[1, 2], [3, 4]]"
"Encoded coordinates [[1, 2], [3, 4]]"
"Encoded coordinates [[0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012169663452665325, 213780573330512200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], [3.0000002417770535, 4]]"
"Encoded coordinates [[0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012169663452665325, 213780573330512200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], [3.0000002417770535, 4]]"
"Encoded coordinates [[0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012169663452665325, 213780573330512200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], [3.0000002417770535, 4]]"
"Encoded coordinates [[0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012169663452665325, 213780573330512200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], [3.0000002417770535, 4]]"
"Encoded coordinates [[0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012169663452665325, 213780573330512200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], [3.0000002417770535, 4]]"
"Encoded coordinates [[0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012169663452665325, 213780573330512200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], [3.0000002417770535, 4]]"
"Encoded coordinates [[0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012169663452665325, 213780573330512200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], [3.0000002417770535, 4]]"
"Encoded coordinates [[0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012169663452665325, 213780573330512200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], [3.0000002417770535, 4]]"

The main problem is here:

impl From<Array> for Vec<[f64; 2]> {
    fn from(arr: Array) -> Self {
        unsafe { Vec::from_raw_parts(arr.data as *mut [f64; 2], arr.len, arr.len) }
    }
}

Let's check the documentation for Vec::from_raw_parts:

This is highly unsafe, due to the number of invariants that aren't checked:

  • ptr needs to have been previously allocated via String/Vec<T> (at least, it's highly likely to be incorrect if it wasn't).
  • length needs to be the length that less than or equal to capacity.
  • capacity needs to be the capacity that the pointer was allocated with.

Violating these may cause problems like corrupting the allocator's internal datastructures.

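However, the original code as shown violates the first point: the pointer was allocated by malloc (on the Python side, by ctypes), not by a Rust Vec or String.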

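Why does this come into play? When you call Vec::from_raw_parts, the new Vec takes ownership of the pointer. When that Vec goes out of scope, the memory it points to is deallocated. The original owner of the allocation (Python in the question, i_own_this in the MCVE) still believes it owns the memory and will free it too, which means you are attempting to deallocate that pointer multiple times.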
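For contrast, here is a minimal sketch of a round trip where Vec::from_raw_parts is sound, because the pointer really did come from a Vec that gave up ownership first. The helper names leak_parts and reclaim are hypothetical and not part of the original code; the sketch keeps the capacity alongside the pointer and length, since from_raw_parts expects the capacity the allocation was made with.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical helper: give up ownership of a Vec and return its raw parts.
/// Nothing is freed here; the caller is now responsible for the allocation.
fn leak_parts(v: Vec<[f64; 2]>) -> (*mut [f64; 2], usize, usize) {
    let mut v = ManuallyDrop::new(v);
    (v.as_mut_ptr(), v.len(), v.capacity())
}

/// Hypothetical helper: rebuild the Vec so the allocation is freed exactly once.
///
/// Safety: the three values must come from a single call to `leak_parts`
/// and must not be used again afterwards.
unsafe fn reclaim(ptr: *mut [f64; 2], len: usize, cap: usize) -> Vec<[f64; 2]> {
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}

fn main() {
    let (ptr, len, cap) = leak_parts(vec![[1.0, 2.0], [3.0, 4.0]]);
    // Ownership returns to exactly one Vec, which frees the memory when dropped.
    let v = unsafe { reclaim(ptr, len, cap) };
    println!("{:?}", v);
}
```

A buffer created by Python never goes through the first step, so the second step is not valid for it.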

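Because the safety of the function depends on what is passed in, the entire function should be marked unsafe. In this case, that would violate the trait's interface, so you would need to move it elsewhere.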

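More sanely, you could convert the Array to a slice. This is still unsafe because it depends on the pointer passed in, but a slice does not own the underlying memory. You can then turn the slice into a Vec, which allocates new memory and copies the contents over.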
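The slice-then-copy approach could look roughly like the sketch below. It is only an illustration under a few assumptions: array_to_vec is a hypothetical name, the conversion is a free unsafe fn rather than a From impl, and usize stands in for libc::size_t so that the example needs no external crate.

```rust
use std::os::raw::c_void;
use std::slice;

// Same layout as the Array in the question; usize stands in for libc::size_t.
#[repr(C)]
pub struct Array {
    pub data: *const c_void,
    pub len: usize,
}

/// Hypothetical helper: copy the caller's data into a freshly allocated Vec.
///
/// Safety: `arr.data` must be non-null, properly aligned, and point to
/// `arr.len` initialized `[f64; 2]` values that stay alive for this call.
unsafe fn array_to_vec(arr: &Array) -> Vec<[f64; 2]> {
    // Borrow the caller's memory; the slice never frees it.
    let borrowed: &[[f64; 2]] =
        unsafe { slice::from_raw_parts(arr.data as *const [f64; 2], arr.len) };
    // Allocate Rust-owned memory and copy the contents over.
    borrowed.to_vec()
}

fn main() {
    let caller_owned = vec![[1.0, 2.0], [3.0, 4.0]];
    let array = Array {
        data: caller_owned.as_ptr() as *const c_void,
        len: caller_owned.len(),
    };
    // Repeating the conversion leaves the caller's buffer intact,
    // unlike the loop in the MCVE.
    for _ in 0..10 {
        let copy = unsafe { array_to_vec(&array) };
        assert_eq!(copy, caller_owned);
    }
    println!("{:?}", caller_owned);
}
```

Only the copy is dropped on the Rust side; the caller's buffer is freed by whoever allocated it.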

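Since you are in control of encode_coordinates, you should also change its signature. &Vec<T> is useless in 99.99% of cases and may actually be less efficient: it requires two pointer dereferences instead of one. Instead, accept a &[T]. This allows a wider range of types to be passed, including arrays and Vecs.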
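As a small illustration of the signature change, the sketch below reuses the stand-in body of encode_coordinates from the MCVE (not the real polyline encoder) and calls it with a Vec, a fixed-size array, and a sub-slice.

```rust
// Takes a slice instead of &Vec; the body mirrors the stand-in from the MCVE.
fn encode_coordinates(coordinates: &[[f64; 2]]) -> String {
    format!("Encoded coordinates {:?}", coordinates)
}

fn main() {
    let from_vec = vec![[1.0, 2.0], [3.0, 4.0]];
    let from_array = [[1.0, 2.0], [3.0, 4.0]];

    // A &Vec<T> coerces to &[T] automatically.
    println!("{}", encode_coordinates(&from_vec));
    // So does a reference to a fixed-size array.
    println!("{}", encode_coordinates(&from_array));
    // A sub-slice works too, without allocating a new Vec.
    println!("{}", encode_coordinates(&from_vec[..1]));
}
```

Existing callers that pass &some_vec keep compiling unchanged after the switch.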

Original question on Stack Overflow: https://stackoverflow.com/questions/38412184/
