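python - Speeding up array rewriting in a for loop

I have a 2D dataset with shape = (500, 500). From a given position (x_0, y_0), I want to map the distance of every element/pixel to that position. I do this by determining all unique distances from (x_0, y_0) and mapping them to integers. For a 6 x 6 dataset, such a map looks like this: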

[9 8 7 6 7 8]
[8 5 4 3 4 5]
[7 4 2 1 2 4]
[6 3 1 0 1 3]
[7 4 2 1 2 4]
[8 5 4 3 4 5]

where the integers correspond to the unique distances stored in the following array:

[0.  1.  1.41421356  2.  2.23606798  2.82842712  3.  3.16227766  3.60555128  4.24264069]
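
To make the relation between the two outputs concrete, here is a minimal sketch that copies the 6 x 6 map and the distance array from above and uses the integers as indices into the array. The names `dist_map` and `unique_dists` are illustrative and do not appear in the original post.

```python
import numpy as np

# Integer map copied from the 6 x 6 example above (names are illustrative).
dist_map = np.array([
    [9, 8, 7, 6, 7, 8],
    [8, 5, 4, 3, 4, 5],
    [7, 4, 2, 1, 2, 4],
    [6, 3, 1, 0, 1, 3],
    [7, 4, 2, 1, 2, 4],
    [8, 5, 4, 3, 4, 5],
])

# Sorted unique distances copied from the example above.
unique_dists = np.array([0., 1., 1.41421356, 2., 2.23606798, 2.82842712,
                         3., 3.16227766, 3.60555128, 4.24264069])

# Indexing the distance array with the map turns every integer label
# back into the distance it stands for.
distances = unique_dists[dist_map]

print(distances[3, 3])  # label 0 -> 0.0 (the reference position itself)
print(distances[3, 0])  # label 6 -> 3.0
```

The position of the 0 in the map (row 3, column 3) is the reference position (x_0, y_0) in this example.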

The code that determines these distances is as follows:

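import numpy

def func(data, origin):
  # origin is the reference position (x_0, y_0); tuple parameters in the
  # signature are not valid in Python 3, so it is unpacked here instead
  x_0, y_0 = origin
  y, x = numpy.indices(data.shape)
  r = numpy.sqrt((x - x_0)**2 + (y - y_0)**2)

  float_values = numpy.unique(r.ravel())  # unique already sorts the result
  int_values = numpy.arange(float_values.shape[0])  # integer labels 0..k-1

  # Replace each distance by its integer label, largest distance first
  for idx in range(float_values.shape[0])[::-1]:
    r[r == float_values[idx]] = int_values[idx]

  return float_values, r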

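The for loop is the bottleneck; it takes too long for the application I need it for. Is there a way to speed it up or improve its performance? Or is there a completely different but faster way to get the output I need?

Best answer

Here is a vectorized approach using masking: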

import numpy as np

def func_mask_vectorized(data, origin):
    # origin is (x_0, y_0); both must be integers so that ids can index mask
    x_0, y_0 = origin

    # Leverage broadcasting with open meshes to create the squared distances/ids
    m, n = data.shape
    Y, X = np.ogrid[:m, :n]
    ids = (X - x_0)**2 + (Y - y_0)**2

    # Set up a mask that helps retrieve the unique "compressed" IDs
    # (similar to what return_inverse does).
    # Flag every squared distance that occurs, then number the flagged
    # slots 0, 1, 2, ... so each occurring value gets its compressed ID.
    mask = np.zeros(ids.max() + 1, dtype=bool)
    mask[ids] = True
    id_arr = mask.astype(int)
    id_arr[mask] = np.arange(mask.sum())
    r_out = id_arr[ids]

    # Finally extract the unique squared distances and take their square roots
    float_values_out = np.sqrt(np.flatnonzero(mask))
    return float_values_out, r_out
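
The masking step is easier to follow on a tiny input. The sketch below walks through the same three steps as the function above on a hand-made array of squared distances; the values in `ids` are made up for illustration and are not taken from the post.

```python
import numpy as np

# Toy squared distances (hypothetical values, chosen so that 3 never occurs).
ids = np.array([[0, 1, 4],
                [1, 2, 5]])

# Step 1: flag which squared distances occur at all.
mask = np.zeros(ids.max() + 1, dtype=bool)
mask[ids] = True
# mask is now [True, True, True, False, True, True]

# Step 2: give every occurring value its rank among the occurring values.
id_arr = np.zeros(mask.size, dtype=int)
id_arr[mask] = np.arange(mask.sum())
# id_arr is now [0, 1, 2, 0, 3, 4]; the entry at index 3 is never looked up

# Step 3: look up the rank of every pixel's squared distance.
labels = id_arr[ids]
# labels is now [[0, 1, 3], [1, 2, 4]]

# The occurring squared distances, already in sorted order.
unique_sq = np.flatnonzero(mask)
# unique_sq is now [0, 1, 2, 4, 5]
```

Because the squared distances are used directly as array indices, this only works for non-negative integers, and the mask needs as many entries as the largest squared distance plus one.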

Benchmarks

Timing the proposed setups on data of shape (500,500) with numbers ranging from 0-9, as used in the question's example, for all complete solutions in this section:

In [371]: np.random.seed(0)
     ...: data = np.random.randint(0,10,(500,500))
     ...: x_0 = 2
     ...: y_0 = 3

# Original soln
In [372]: %timeit func(data, (x_0,y_0))
1 loop, best of 3: 6.77 s per loop

# @Daniel's soln
In [373]: %timeit func_return_inverse(data, (x_0,y_0))
10 loops, best of 3: 23.9 ms per loop

# Soln from this post
In [374]: %timeit func_mask_vectorized(data, (x_0,y_0))
100 loops, best of 3: 5.02 ms per loop

Scaling up to cases where the numbers range up to 100 or even 1000 does not change much how these timings stack up:

In [397]: np.random.seed(0)
     ...: data = np.random.randint(0,100,(500,500))
     ...: x_0 = 50
     ...: y_0 = 50

In [398]: %timeit func(data, (x_0,y_0))
     ...: %timeit func_return_inverse(data, (x_0,y_0))
     ...: %timeit func_mask_vectorized(data, (x_0,y_0))
1 loop, best of 3: 5.62 s per loop
10 loops, best of 3: 20.7 ms per loop
100 loops, best of 3: 4.28 ms per loop

In [399]: np.random.seed(0)
     ...: data = np.random.randint(0,1000,(500,500))
     ...: x_0 = 500
     ...: y_0 = 500

In [400]: %timeit func(data, (x_0,y_0))
     ...: %timeit func_return_inverse(data, (x_0,y_0))
     ...: %timeit func_mask_vectorized(data, (x_0,y_0))
1 loop, best of 3: 6.87 s per loop
10 loops, best of 3: 21.9 ms per loop
100 loops, best of 3: 5.05 ms per loop
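
The `func_return_inverse` function timed above comes from another answer (@Daniel's) and is not reproduced in this post. As a rough idea of what such an approach could look like, here is a sketch that assumes it is built on `np.unique(..., return_inverse=True)`. The name `func_return_inverse_sketch` is hypothetical, and this is not the code that produced the timings above.

```python
import numpy as np

def func_return_inverse_sketch(data, origin):
    """Hypothetical stand-in, not the function that was benchmarked."""
    x_0, y_0 = origin
    y, x = np.indices(data.shape)
    sq = (x - x_0) ** 2 + (y - y_0) ** 2

    # np.unique sorts the values; the inverse gives, for every element,
    # the position of its value in that sorted array. Flattening first
    # keeps the inverse 1-D regardless of the NumPy version in use.
    uniq_sq, inverse = np.unique(sq.ravel(), return_inverse=True)

    return np.sqrt(uniq_sq), inverse.reshape(sq.shape)

# Reproduce the 6 x 6 example from the question, with the origin at (3, 3).
float_values, r = func_return_inverse_sketch(np.zeros((6, 6)), (3, 3))
print(r)
print(float_values)
```

Taking the unique values of the squared distances and applying the square root afterwards gives the same ordering as working on the distances themselves, since the square root is monotonic.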

Regarding "python - Speeding up array rewriting in a for loop", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49886504/
