python - 如何在大小为 `n` 次 `m` (n≠m) 的非方阵的 numpy 数组上使用 c 函数进行计算

编辑

我创建了一个类似的问题，我发现它在那里更容易理解和实用:How to copy a 2D array (matrix) from python with a C function (and do some computer heavy computation) which return a 2D array (matrix) in python?

原始问题

我想在 python 中使用 C 来计算一个大小为 n 倍 m 的大型非方阵的所有条目。我从那里的优秀教程中复制了代码:https://medium.com/spikelab/calling-c-functions-from-python-104e609f2804 .方阵的代码

我先编译了c_sum.c脚本

$ cc -fPIC -shared -o c_sum.so c_sum.c

然后运行 python 脚本:

$ python main.py

而且运行良好。但是，如果我将 main.py 中的 n 和 m 的值设置为不同的值，则会出现段错误。我想必须分别为 n 和 m 分配内存，但我对 C 的了解还很初级，不知道如何去做。比方说，可以使用 m=3000 和 n=2000 的代码是什么？

这是脚本c_sum.c:

#include <stdlib.h>

double * c_sum(const double * matrix, int n, int m){
    double * results = (double *)malloc(sizeof(double) * n);
    int index = 0;
    for(int i=0; i< n*m; i+=n){
        results[index] = 0;
        for(int j=0; j<m; j++){
            results[index] += matrix[i+j];
        }
        index += 1;
    }
    return results;
}

这是 main.c 脚本:

# https://medium.com/spikelab/calling-c-functions-from-python-104e609f2804
from ctypes import c_void_p, c_double, c_int, cdll
from numpy.ctypeslib import ndpointer
import numpy as np
import time
import pdb

def py_sum(matrix: np.array, n: int, m: int) -> np.array:
    result = np.zeros(n)
    for i in range(0, n):
        for j in range(0, m):
            result[i] += matrix[i][j]
    return result

n = 3000
m = 3000
matrix = np.random.randn(n, m)

time1 = time.time()
py_result = py_sum(matrix, n, m)
time2 = time.time() - time1
print("py running time in seconds:", time2)
py_time = time2

lib = cdll.LoadLibrary("c_sum.so")
c_sum = lib.c_sum
c_sum.restype = ndpointer(dtype=c_double,
                          shape=(n,))

time1 = time.time()
result = c_sum(c_void_p(matrix.ctypes.data),
            c_int(n),
            c_int(m))
time2 = time.time() - time1
print("c  running time in seconds:", time2)

c_time = time2
print("speedup:", py_time/c_time)

最佳答案

我假设您想计算 (n,m) 矩阵沿最后一个轴的总和。当您访问您无权访问的内存时，会发生段错误。问题在于错误的外循环。您需要对两个维度进行迭代，但您对同一维度进行了两次迭代。

double * results = (double *)malloc(sizeof(double) * n); /* you allocate n doubles. 
         Do you free this Outside function? If not, you are having a memory leak. 
        An alternative way is to pass the output array to function, so that you can avoid creating memory in the function*/

for(int i=0; i< n*m; i+=n){ /* i+=n => you are iterating for m times. also you are iterating over last dimension */

        results[index] = 0; /* when index > n ; you are accessing data which 
                           you don't have access leading to segmentation fault */
        for(int j=0; j<m; j++) /* you are iterating again over last axis*/
        {
            results[index] += matrix[i+j];
        }

        index += 1; /* this leads to index > n as you iterate for m times and m>n in this case.
                   For a square matrix, m=n, so you don't have any issue */
    }

TLDR:要修复段错误，您需要替换 for(int i=0; i< n*m; i+=n)与 for(int i=0; i< n*m; i+=m)这样您就可以只在两个维度上迭代 n 次。

关于python - 如何在大小为 `n` 次 `m` (n≠m) 的非方阵的 numpy 数组上使用 c 函数进行计算，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/73715140/

python - 如何在大小为 `n` 次 `m` (n≠m) 的非方阵的 numpy 数组上使用 c 函数进行计算

上一篇：migration - Ordercloud 迁移到新的沙箱

下一篇：visual-studio-2022 - 我可以手动删除旧版本的 Windows 10 SDK 吗？