python - Precision lost/changed once data goes into a Cython module

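I ported some code that uses NumPy to Cython to get a performance boost. The speedup is substantial, but I have run into one problem.

The results from Cython differ from the results from Python. I had no idea why, so I decided to look at what was being pushed into the Cython module.

Before it reaches Cython, the data looks like this: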

azimuth = 0.000349065850399 

rawDistance = [ 2.682  7.234  2.8    7.2    2.912  7.19   3.048  7.174  3.182  7.162
  3.33   7.164  3.506  7.158  3.706  7.154  3.942  7.158  4.192  7.158
  4.476  7.186  4.826  7.19   5.218  7.204  5.704  7.224  6.256  7.248
  6.97   7.284] 

intensity = [19 34 25 28 26 48 21 56 21 60 31 49 24 37 26 37 34 37 23 84 15 59 23 45 
             18  47 20 55 18 36 15 39]

Once it is inside Cython, the same data looks like this:

azimuth = 0.000349065850399 

rawDistance = [2.686, 7.23, 2.7960000000000003, 7.204, 2.91, 7.188, 3.044, 7.174, 3.19, 
               7.16, 3.3280000000000003, 7.16, 3.5, 7.154, 3.704, 7.144, 3.936, 7.158, 
               4.196, 7.156000000000001, 4.478, 7.19, 4.8260000000000005, 7.192, 5.22, 
               7.204, 5.708, 7.22, 6.256, 7.252, 6.97, 7.282] 

intensity = [19, 34, 27, 28, 26, 48, 22, 52, 21, 60, 31, 49, 24, 37, 28, 34, 32, 37, 
             23, 84, 15, 59, 23, 45, 18, 47, 20, 58, 18, 36, 15, 36]

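This explains why the results are not exactly the same as those calculated by the pure Python method.

This is the Cython module the information is passed to: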

from libc.math cimport sin, cos
import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
def calculateXYZ(list frames, double[:] cosVertCorrection, double[:] sinVertCorrection):
    cdef long numberFrames = len(frames)
    cdef long i, j, k, numberBlocks
    cdef list finalResults = []
    cdef list intensities = []
    cdef list frameXYZ = []
    cdef double azimuth, xy, x, y, z, sinRotational, cosRotational
    cdef double[32] rawDistance
    cdef int[32] intensity
    cdef double[:] tempX
    cdef double[:] tempY
    cdef double[:] tempZ
    cdef int positionsFilled = 0

    for i in xrange(numberFrames):
        numberBlocks = len(frames[i])
        tempX = np.zeros(numberBlocks * 32, dtype=np.double)
        tempY = np.zeros(numberBlocks * 32, dtype=np.double)
        tempZ = np.zeros(numberBlocks * 32, dtype=np.double)
        frameXYZ = [[] for i in range(3)]
        positionsFilled = 0

        for j in xrange(numberBlocks):
            # This is where I tested for the data in Cython
            # This is the information that is different. 
            # It is reading from what was passed to it from python.

            azimuth = frames[i][j][0]
            rawDistance = frames[i][j][1]
            intensity = frames[i][j][2]
            sinRotational, cosRotational = sin(azimuth), cos(azimuth)

            for k in xrange(32):
                xy = rawDistance[k] * cosVertCorrection[k]
                x, y = xy * sinRotational, xy * cosRotational
                z = rawDistance[k] * sinVertCorrection[k]

                if x != 0 or y != 0 or z != 0:
                    tempX[positionsFilled] = x
                    tempY[positionsFilled] = y
                    tempZ[positionsFilled] = z
                    intensities.append(intensity[k])
                    positionsFilled = positionsFilled + 1

        frameXYZ[0].append(np.asarray(tempX[0:positionsFilled].copy()).tolist())
        frameXYZ[1].append(np.asarray(tempY[0:positionsFilled].copy()).tolist())
        frameXYZ[2].append(np.asarray(tempZ[0:positionsFilled].copy()).tolist())
        finalResults.append(frameXYZ)

    return finalResults, intensities

This is the pure Python version:

documentXYZ = []
intensities = []

# I tested to see what the original data was in here adding prints

for frame in frames:
    frameXYZ = [[] for i in range(3)]
    frameX, frameY, frameZ = [], [], []
    for block in frame:
        sinRotational, cosRotational = np.math.sin(block[0]), np.math.cos(block[0])
        rawDistance, intensity = np.array(block[1]), np.array(block[2])
        xy = np.multiply(rawDistance, cosVertCorrection)
        x, y, z = np.multiply(xy, sinRotational), np.multiply(xy, cosRotational), np.multiply(rawDistance, sinVertCorrection)
        maskXYZ = np.logical_and(np.logical_and(x, x != 0), np.logical_and(y, y != 0), np.logical_and(z, z != 0))
        frameX += x[maskXYZ].tolist()
        frameY += y[maskXYZ].tolist()
        frameZ += z[maskXYZ].tolist()
        intensities += intensity[maskXYZ].tolist()

    frameXYZ[0].append(frameX), frameXYZ[1].append(frameY), frameXYZ[2].append(frameZ)
    documentXYZ.append(frameXYZ)

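I understand that there could be a difference in precision for the floating-point values (although I don't think there should be, since I am using doubles in all of the structures), but I don't understand why the intensity values, which are integers, are being changed as well. I would like the precision to be the same as in Python.

Any ideas on how to improve this?

Thanks.

Best answer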

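The first two steps in solving the problem are:

1. Determine the specific integer type NumPy is using on your platform (e.g. int32, int64, ...), for example by checking the dtype attribute of the integer array or of one of its values.
2. Determine the bit width of int on your platform with your chosen C implementation. It is usually 32 bits, but not always (check with sizeof, for example).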
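
Both checks can be sketched from plain Python. This is a minimal sketch, assuming the compiler that builds the Cython extension agrees with `ctypes` about the size of `int` (normally true when it is the same toolchain that built the interpreter). The `sample` array is a hypothetical stand-in for one block of intensity values.

```python
import ctypes
import numpy as np

# Hypothetical stand-in for one block of intensity values.
sample = np.array([19, 34, 25, 28])

# Step 1: which integer type did NumPy pick for this array?
numpy_bits = sample.dtype.itemsize * 8
print("NumPy dtype:", sample.dtype, "->", numpy_bits, "bits")

# Step 2: how wide is a plain C int on this platform?
c_int_bits = ctypes.sizeof(ctypes.c_int) * 8
print("C int:", c_int_bits, "bits")

# np.intc is NumPy's own name for the platform's C int.
intc_bits = np.dtype(np.intc).itemsize * 8
print("array width matches C int:", numpy_bits == c_int_bits)
```

If the last line prints `False`, a `cdef int` buffer is not the same type as the array's elements, which is the situation the next paragraph describes.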

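Once you know these two details, you can determine in what way a plain (C) int fails to match the integer precision NumPy has been using. A common guess is that NumPy is using int64 while in C you are using int, which is probably int32 on your platform/implementation. Another common case is that NumPy is using an unsigned integer while in C int is signed, which leads to a different representation even with the same number of bits.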
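
The two mismatches described above can be reproduced inside NumPy itself, without any Cython. The sketch below only imitates what a mismatched declaration would do to the stored bits; the values are made up for illustration and are not taken from the question.

```python
import numpy as np

# Width mismatch: 64-bit values squeezed into 32 bits keep only the low bits.
wide = np.array([19, 2**31 + 5, 2**40 + 7], dtype=np.int64)
narrowed = wide.astype(np.int32)
print(narrowed.tolist())   # [19, -2147483643, 7]

# Signedness mismatch: same 8 bits, read as unsigned and then as signed.
unsigned = np.array([19, 200, 255], dtype=np.uint8)
as_signed = unsigned.view(np.int8)
print(as_signed.tolist())  # [19, -56, -1]
```

Note that a value which fits in both types, such as 19, comes through unchanged in both cases; only values outside the narrower or signed range are altered.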

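In Cython it is easy to refer to fixed-width integers, in at least the following three ways:

1. Since you used cimport numpy as np, you can refer to NumPy's fixed-width integer types, such as np.int64_t or np.uint8_t. The "_t" typedefs are available in NumPy's Cython support.
2. You can try to work out, for your C implementation and platform, the standard type name, such as cython.longlong for a 64-bit integer or cython.uchar for an unsigned 8-bit integer, that happens to have the right number of bits and the right signedness to match whichever type NumPy is using.
3. You can also import from the C standard library, e.g. from libc.stdint cimport int64_t, uint8_t, if you prefer to use C's standard header for fixed-width integers of specified size.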
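
The difference between fixed-width names and plain C type names can be inspected from Python. This is an illustrative sketch using `ctypes` as a proxy for the C compiler; the dictionary names are hypothetical.

```python
import ctypes
import numpy as np

# NumPy's fixed-width dtypes: the width is part of the name.
fixed_width_bits = {
    name: np.dtype(name).itemsize * 8
    for name in ("uint8", "int32", "int64")
}

# Plain C type names: the width is decided by the platform and compiler.
platform_bits = {
    "unsigned char": ctypes.sizeof(ctypes.c_ubyte) * 8,
    "int": ctypes.sizeof(ctypes.c_int) * 8,
    "long": ctypes.sizeof(ctypes.c_long) * 8,
    "long long": ctypes.sizeof(ctypes.c_longlong) * 8,
}

print(fixed_width_bits)
print(platform_bits)
```

The first dictionary is the same everywhere, while the second can vary: `long`, for instance, is 32 bits on 64-bit Windows and 64 bits on most 64-bit Linux and macOS builds. That is the argument for preferring the fixed-width names when the goal is to match a NumPy dtype exactly.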

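Assuming you have chosen the appropriate integer type, you can then declare your intensity array with the correct type, for example any of the following, depending on which way of expressing the correct integer type you chose: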

cdef np.uint8_t[32] intensity   # If using NumPy integer types
cdef uint8_t[32] intensity      # If importing from libc.stdint
cdef cython.uchar[32] intensity # If using Cython integer types

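As a final note, it is good to remember that regular Python integers have unlimited precision, so if you manage to end up with a NumPy array of type int (not C int, but Python int), then when working in Cython you will have to either decide on a different fixed-precision representation or work with arrays or typed memoryviews that contain Python int types (which generally defeats the purpose of using Cython in the first place).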
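
A quick way to tell whether an array holds fixed-width integers or Python int objects is to look at its dtype. The sketch below uses made-up values: one list that fits in a machine integer and one that does not.

```python
import numpy as np

# Values that fit in a machine integer get a fixed-width dtype.
small = np.array([19, 34, 25])
print(small.dtype)

# A value too large for any 64-bit integer forces an array of Python ints.
huge = np.array([19, 34, 2**70])
print(huge.dtype)  # object

# Elements of the object array are still exact, arbitrary-precision ints.
exact = huge[2] + 1 == 2**70 + 1
print(exact)       # True
```

An `object` dtype here means each element is a full Python object, so a typed Cython buffer such as `int[32]` cannot hold it without first choosing a fixed-width type.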

Source: a similar question on Stack Overflow, "python - Precision lost/changed once data goes into a Cython module": https://stackoverflow.com/questions/37625628/
