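I have the following code, which computes the average absolute difference between all pairs of elements in an array. Is there a more efficient way to do this than nested loops, for example with a numpy function?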
import numpy as np
a = np.array([0.02625, -0.04125, -0.00875, -0.05625, 0.04375, 0.03625])
delta = []
for i in range(len(a) - 1):
    for j in range(i+1, len(a)):
        delta.append(a[i] - a[j])
delta = np.array(delta)
avg_dist = np.sum(np.abs(delta)) / delta.size
Best answer
Approach #1
Get the pairwise indices with np.triu_indices (or np.tril_indices), index the input array with them, and compute the differences:
I,J = np.triu_indices(len(a),1)
delta = a[I] - a[J]
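To tie this back to the question, the differences from this approach can be reduced to the average absolute difference and checked against the nested-loop version. The following is a minimal sketch using the sample array from the question; the names `delta_loop` and `avg_loop` are introduced here only for illustration.

```python
import numpy as np

a = np.array([0.02625, -0.04125, -0.00875, -0.05625, 0.04375, 0.03625])

# Indices of the strict upper triangle: every pair (i, j) with i < j
I, J = np.triu_indices(len(a), 1)
delta = a[I] - a[J]
avg_dist = np.abs(delta).mean()

# Reference result from the nested loops in the question
delta_loop = np.array([a[i] - a[j]
                       for i in range(len(a) - 1)
                       for j in range(i + 1, len(a))])
avg_loop = np.sum(np.abs(delta_loop)) / delta_loop.size

print(avg_dist, avg_loop)
```

np.triu_indices returns the pairs in row-major order, which is the same order the nested loops visit them, so the two delta arrays should match element by element.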
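Approach #2
We can also use slicing inside a single loop, which should be memory efficient because it avoids generating the index arrays used in the previous approach: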
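def pairwise_diff(a):
    n = len(a)
    N = n*(n-1)//2  # number of pairs (i, j) with i < j
    # Offsets into the output: row i contributes n-1-i differences
    idx = np.concatenate(( [0], np.arange(n-1,0,-1).cumsum() ))
    start, stop = idx[:-1], idx[1:]
    out = np.empty(N,dtype=a.dtype)
    for i in range(n-1):
        # a[i] minus every later element, written as one slice
        out[start[i]:stop[i]] = a[i] - a[i+1:]
    return out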
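If only the average absolute difference is needed, as in the question, the per-row slices do not have to be stored at all. The sketch below is a variation on the slicing idea, not part of the original answer; the function name `mean_abs_pairwise_diff` is hypothetical.

```python
import numpy as np

def mean_abs_pairwise_diff(a):
    """Mean of |a[i] - a[j]| over all pairs i < j, without storing the pairs."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    if n < 2:
        raise ValueError("need at least two elements")
    total = 0.0
    for i in range(n - 1):
        # Accumulate |a[i] - a[j]| for all j > i using one vectorized slice
        total += np.abs(a[i] - a[i + 1:]).sum()
    return total / (n * (n - 1) // 2)

a = np.array([0.02625, -0.04125, -0.00875, -0.05625, 0.04375, 0.03625])
avg_dist = mean_abs_pairwise_diff(a)
print(avg_dist)
```

This keeps peak extra memory at one temporary slice of at most n-1 elements, at the cost of still running a Python-level loop over the rows.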
Timings on a large array with 10000 elements:
In [214]: a = np.random.rand(10000)
# Approach #1
In [215]: %%timeit
...: I,J = np.triu_indices(len(a),1)
...: delta = a[I] - a[J]
1 loop, best of 3: 627 ms per loop
# Approach #2
In [216]: %timeit pairwise_diff(a)
10 loops, best of 3: 69.1 ms per loop
# Original approach
In [217]: %%timeit
...: delta = []
...: for i in range(len(a) - 1):
...:     for j in range(i+1, len(a)):
...:         delta.append(a[i] - a[j])
1 loop, best of 3: 15.7 s per loop
Regarding "python - Efficiently computing a metric over all pairwise combinations of array elements", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49883359/