python - Aggregate elements based on position vector

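I am trying to vectorize a very simple operation, but I can't seem to figure out how.

Given a very large vector of numeric values (over 1M positions) and another array of size n, each element of which is a set of positions, I want to get back a vector of size n whose elements are the means of the values of the first vector at the positions specified by the second: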

a = np.array([1,2,3,4,5,6,7])
b = np.array([[0,1],[2],[3,5],[4,6]], dtype=object)  # ragged lists need dtype=object

c = [1.5,3,5,6]

I need to repeat this operation many times, so performance is a concern.

Plain Python solution:

import numpy as np
import time

a = np.array([1,2,3,4,5,6,7])
# Recent NumPy versions raise an error on ragged input unless dtype=object is given
b = np.array([[0,1],[2],[3,5],[4,6]], dtype=object)

begin = time.time()

for i in range(100000):

    c = []

    for d in b:
        c.append(np.mean(a[d]))

print(time.time() - begin, c)
# 3.7529971599578857 [1.5, 3.0, 5.0, 6.0]

Best answer

I'm not sure this is necessarily faster, but you may as well give it a try:

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])
# Recent NumPy versions raise an error on ragged input unless dtype=object is given
b = np.array([[0, 1], [2], [3, 5], [4, 6]], dtype=object)

# Get the length of each subset of indices
lens = np.fromiter((len(bi) for bi in b), count=len(b), dtype=np.int32)
# Compute reduction indices
reduce_idx = np.roll(np.cumsum(lens), 1)
reduce_idx[0] = 0
# Make flattened array of index lists
idx = np.fromiter((i for bi in b for i in bi), count=lens.sum(), dtype=np.int32)
# Reorder according to indices
a2 = a[idx]
# Sum reordered array at reduction indices and divide by number of indices
# (reduceat assumes every index list is non-empty; an empty one would give a wrong sum)
c = np.add.reduceat(a2, reduce_idx) / lens
print(c)
# [1.5 3.  5.  6. ]
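
Since the question mentions repeating the operation many times, another option that may be worth timing is to flatten the index lists once and reuse them on every call. The following is only a sketch under that assumption, using np.bincount in place of np.add.reduceat; the helper names build_groups and group_means are hypothetical, not part of the original answer, and whether it is actually faster depends on your data.

```python
import numpy as np


def build_groups(index_lists):
    """Hypothetical helper: flatten ragged index lists once for reuse."""
    # Number of positions in each group
    counts = np.array([len(ix) for ix in index_lists], dtype=np.intp)
    # All positions concatenated into one flat index array
    flat_idx = np.concatenate(
        [np.asarray(ix, dtype=np.intp) for ix in index_lists]
    )
    # For each flat position, the group it belongs to
    group_id = np.repeat(np.arange(len(index_lists)), counts)
    return flat_idx, group_id, counts


def group_means(a, flat_idx, group_id, counts):
    """Hypothetical helper: per-group mean of a at the precomputed positions."""
    # Weighted bincount sums the gathered values per group
    sums = np.bincount(group_id, weights=a[flat_idx], minlength=len(counts))
    # Note: an empty group would yield 0 / 0 = nan here
    return sums / counts


a = np.array([1, 2, 3, 4, 5, 6, 7])
b = [[0, 1], [2], [3, 5], [4, 6]]

# Precompute once, then call group_means repeatedly as a changes
flat_idx, group_id, counts = build_groups(b)
c = group_means(a, flat_idx, group_id, counts)
print(c)
```

With this split, only group_means runs inside the repeated loop, so the Python-level iteration over the index lists is paid once rather than on every call.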

Regarding python - Aggregate elements based on position vector, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54670126/
