python - How to interpolate a high-dimensional numpy array using nearest neighbors

I am programming in Python with scipy and numpy, and I have a lookup table (LUT) of data that I access like this:

self.lut_data[n_iter][m_iter][l_iter][k_iter][j_iter][i_iter]

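The *_iter indices correspond to sets of values that I keep in a dictionary. For example, the i_iter index corresponds to the wavelength of light, so I have a dictionary of labels and values that I can access with: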

labels['wavelength']

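which returns the array of wavelengths that the i_iter values correspond to. That is all I need when I use the table as a direct lookup: if I want lut_data at 500 nm, I first find the corresponding index in labels['wavelength'] and then use it to index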

lut_data[][][][][][wavelength_index]

I do the same for the other dimensions, such as the view angle, which correspond to the other *_iter indices.
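The direct lookup described above can be sketched for any number of dimensions by collecting one index per axis into a tuple. This is only an illustration: the axis names, the axis_order list, the tiny table and the helper direct_lookup are all hypothetical, and lut_data is assumed to be a numpy array. Picking the closest label on each axis gives the exact entry when the requested value is tabulated and the nearest neighbor otherwise.

```python
import numpy as np

# Hypothetical labels and LUT; names, axis order and values are assumptions.
labels = {
    'view_angle': np.array([0.0, 30.0, 60.0]),
    'wavelength': np.array([400.0, 500.0, 600.0, 700.0]),
}
axis_order = ['view_angle', 'wavelength']   # one label per LUT dimension
lut_data = np.arange(12.0).reshape(3, 4)

def direct_lookup(lut, labels, axis_order, wanted):
    # For each axis, take the index of the label closest to the wanted value.
    idx = tuple(
        int(np.argmin(np.abs(labels[name] - wanted[name])))
        for name in axis_order
    )
    # Indexing with a tuple works for any number of dimensions.
    return lut[idx]

value = direct_lookup(lut_data, labels, axis_order,
                      {'view_angle': 30.0, 'wavelength': 500.0})
print(value)
```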

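What I need is to find values that lie between the entries of the lookup table, and it has to work when I do not know the number of dimensions of the table in advance. If I knew it, I could solve the problem with one loop per dimension. But if I do not know how many dimensions the LUT has, I do not know how many loops to nest.

I think I should be able to do this with cKDTree, but I cannot figure out how to make it work. I would really appreciate an example that resembles my structure.

Thanks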

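Best answer

If you have the complete array of data to interpolate from, linear interpolation is not that difficult. It is somewhat time-consuming, but if the array fits into RAM it takes only a few seconds.

The trick is that linear interpolation can be done one axis at a time. So, for each axis:

Find the nearest grid points on either side of the requested value.
Find the relative distance between those points (d = 0..1); e.g. if you have data at 540 and 550 nm and you want the value at 548 nm, d = 0.8.
Repeat this for all axes; each round reduces the number of dimensions by one.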
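The relative-distance step can be checked on a single axis with a small sketch. The data values here are made up for illustration; only the 540/550/548 nm numbers come from the text above.

```python
import numpy as np

axis = np.array([540.0, 550.0, 560.0])   # grid points along one axis (nm)
values = np.array([1.0, 2.0, 4.0])       # hypothetical data on that axis
target = 548.0

# Index of the first grid point >= target, and its left neighbor.
right = np.searchsorted(axis, target)
left = right - 1

# Relative distance between the two neighbors, in the range 0..1.
d = (target - axis[left]) / (axis[right] - axis[left])

# Linear interpolation between the two neighboring values.
result = (1 - d) * values[left] + d * values[right]
print(d, result)
```

On one axis this is the same computation that np.interp(target, axis, values) performs.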

Like this:

import numpy as np

def ndim_interp(A, ranges, p):
    # A: array with n dimensions
    # ranges: list of n lists or numpy arrays of values along each dimension
    # p: vector of values to find (n elements)

    # iterate through all dimensions
    for i in range(A.ndim):
        # check if we are overrange; if we are, use the edgemost values
        if p[i] <= ranges[i][0]:
            A = A[0]
            continue
        if p[i] >= ranges[i][-1]:
            A = A[-1]
            continue

        # find the nearest values
        right = np.searchsorted(ranges[i], p[i])
        left = right - 1

        # find the relative distance
        d = (p[i] - ranges[i][left]) / (ranges[i][right] - ranges[i][left])

        # calculate the interpolation
        A = (1 - d) * A[left] + d * A[right]            

    return A

An example:

# data axis points
arng = [1, 2, 3]
brng = [100, 200]
crng = [540, 550, 560]

# some data
A = np.array([
    [[1., 2., 3.], [2., 3., 4.]],
    [[0.5, 1.5, 2.], [1.5, 2.0, 3.0]],
    [[0., 0.5, 1.], [1., 1., 1.]]])

# lookup:
print(ndim_interp(A, (arng, brng, crng), (2.3, 130., 542.)))
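As a cross-check, the same point can be evaluated with scipy.interpolate.RegularGridInterpolator. This is a different routine from the function above, but it also performs multilinear interpolation on a rectilinear grid, so the two should agree. A sketch reusing the example data; because this class raises an error for out-of-range points by default, the query is clipped to the grid first to imitate the edge handling above.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Same axis points and data as in the example above.
arng = [1, 2, 3]
brng = [100, 200]
crng = [540, 550, 560]
A = np.array([
    [[1., 2., 3.], [2., 3., 4.]],
    [[0.5, 1.5, 2.], [1.5, 2.0, 3.0]],
    [[0., 0.5, 1.], [1., 1., 1.]]])

# The default method is 'linear' (multilinear interpolation).
interp = RegularGridInterpolator((arng, brng, crng), A)

# Clip the query point to the grid limits, then evaluate a single point.
p = np.array([[2.3, 130., 542.]])
lo = np.array([arng[0], brng[0], crng[0]])
hi = np.array([arng[-1], brng[-1], crng[-1]])
value = interp(np.clip(p, lo, hi))[0]
print(value)  # about 0.79 for this data
```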

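If you want to do something more complicated (cubic splines etc.), you can use scipy.ndimage.map_coordinates (exposed as scipy.ndimage.interpolation.map_coordinates in older SciPy releases). The recipe then changes as follows: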

import numpy as np
import scipy.ndimage

def ndim_interp(A, ranges, p):
    # A: array with n dimensions
    # ranges: list of n lists or numpy arrays of values along each dimension
    # p: vector of values to find (n elements)

    # calculate the coordinates into array positions in each direction
    p_arr = []
    # iterate through all dimensions
    for i in range(A.ndim):
        # check if we are overrange; if we are, use the edgemost values
        if p[i] <= ranges[i][0]:
            p_arr.append(0)
            continue
        if p[i] >= ranges[i][-1]:
            p_arr.append(A.shape[i] - 1)
            continue

        # find the nearest values to the left
        right = np.searchsorted(ranges[i], p[i])
        left = right - 1

        # find the relative distance
        d = (p[i] - ranges[i][left]) / (ranges[i][right] - ranges[i][left])

        # append the position
        p_arr.append(left + d)

    coords = np.array(p_arr).reshape(A.ndim, -1)
    return scipy.ndimage.map_coordinates(A, coords, order=1, mode='nearest')[0]
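
A more compact variant of the coordinate computation, offered as a sketch rather than as part of the original recipe: np.interp maps a value to a fractional index along its axis and clamps out-of-range values to the edge index, so several query points can be converted at once. The name ndim_interp_many and its order argument are illustrative, and every axis is assumed to be strictly increasing.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def ndim_interp_many(A, ranges, points, order=1):
    # points: sequence of query points, each with A.ndim coordinates
    points = np.atleast_2d(np.asarray(points, dtype=float))
    coords = np.empty((A.ndim, points.shape[0]))
    for i, axis in enumerate(ranges):
        # Fractional array index of each coordinate along axis i;
        # np.interp clamps values outside the axis to the edge index.
        coords[i] = np.interp(points[:, i], axis, np.arange(len(axis)))
    return map_coordinates(A, coords, order=order, mode='nearest')

# Example data from the answer above.
A = np.array([
    [[1., 2., 3.], [2., 3., 4.]],
    [[0.5, 1.5, 2.], [1.5, 2.0, 3.0]],
    [[0., 0.5, 1.], [1., 1., 1.]]])
ranges = ([1, 2, 3], [100, 200], [540, 550, 560])

# One interior point and one point below the grid on every axis.
out = ndim_interp_many(A, ranges, [(2.3, 130., 542.), (0., 50., 500.)])
print(out)
```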

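Of course, there is little point in using it with the simplest settings (order=1 is just linear interpolation), but even for cubic splines, writing your own interpolation algorithm is not that straightforward.

Naturally, if your grid is equally spaced in all directions, the code is simpler, because there is no need to interpolate first to find the right array position (a simple division does it).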
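
For the equally spaced case, a minimal sketch might look like this. The function name uniform_interp and the starts/steps parameters (first grid value and spacing per axis) are assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def uniform_interp(A, starts, steps, p, order=1):
    # Fractional array index along each axis by a simple division.
    idx = (np.asarray(p, dtype=float) - np.asarray(starts, dtype=float)) \
        / np.asarray(steps, dtype=float)
    # Clamp to the array so out-of-range points use the edge values.
    idx = np.clip(idx, 0, np.array(A.shape) - 1)
    return map_coordinates(A, idx.reshape(A.ndim, 1), order=order, mode='nearest')[0]

# Hypothetical 2-D table: axis 0 at 0, 10, 20 and axis 1 at 100, 150, 200, 250.
A = np.arange(12, dtype=float).reshape(3, 4)
value = uniform_interp(A, starts=(0., 100.), steps=(10., 50.), p=(5., 125.))
print(value)  # midway between the four corner values 0, 1, 4 and 5
```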

Regarding python - how to interpolate a high-dimensional numpy array using nearest neighbors, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24616079/
