python - Simple NumPy vectorization

I have two one-dimensional NumPy arrays, start and stop, both containing integers (used to index other arrays). I have the following code.

import numpy as np

index_list = []
for i in range(len(start)):
    # Every index in the half-open range [start[i], stop[i])
    temp = range(start[i], stop[i])
    index_list.extend(temp)
index_list = np.array(index_list)

Is there a simple way to vectorize this?

Best answer

You can vectorize it as follows:

def make_index_list(start, stop):
    lens = stop - start
    cum_lens = np.cumsum(lens)
    # Sequential indices the same length as the expected output
    out = np.arange(cum_lens[-1])
    # Starting index for each section of `out`
    cum_lens = np.concatenate(([0], cum_lens[:-1]))
    # How much each section of `out` is off from the correct value
    deltas = start - out[cum_lens]
    # Apply the correction
    out += np.repeat(deltas, lens)

    return out
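
To make the steps in the function easier to follow, here is a minimal trace on tiny made-up inputs. The values of start and stop below are hypothetical and chosen only for illustration, and the name offsets is an assumption standing in for the re-bound cum_lens in the function above.

```python
import numpy as np

# Hypothetical tiny inputs, chosen only to make each step easy to check by hand
start = np.array([2, 10, 5])
stop = np.array([5, 12, 9])

lens = stop - start                              # [3, 2, 4]
cum_lens = np.cumsum(lens)                       # [3, 5, 9]
out = np.arange(cum_lens[-1])                    # [0, 1, 2, 3, 4, 5, 6, 7, 8]
# Position in `out` where each range begins
offsets = np.concatenate(([0], cum_lens[:-1]))   # [0, 3, 5]
# Difference between the wanted first value and the sequential one
deltas = start - out[offsets]                    # [2, 7, 0]
# Shift every section of `out` by its own delta
result = out + np.repeat(deltas, lens)           # [2, 3, 4, 10, 11, 5, 6, 7, 8]
```

Each section of the sequential array is already increasing by one, so adding a single constant per section is enough to turn it into the corresponding range.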

Some made-up data:

start = np.random.randint(100, size=(100000,))
stop = start + np.random.randint(1, 10, size=start.shape)

We can take the code for a test drive:

In [39]: %%timeit
   ....: index_list = []
   ....: for i in range(len(start)):
   ....:     temp = range(start[i], stop[i])
   ....:     index_list.extend(temp)
   ....: index_list = np.array(index_list)
   ....:
10 loops, best of 3: 137 ms per loop

In [40]: %timeit make_index_list(start, stop)
100 loops, best of 3: 9.27 ms per loop

In [41]: np.array_equal(make_index_list(start, stop), index_list)
Out[41]: True

So it is correct and about 15x faster, not bad at all...
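
One caveat: the made-up data always has stop > start, so every range is non-empty. As written, make_index_list raises an IndexError when the input arrays are empty or when the last range is empty, because it indexes one past the end of out. Below is a minimal sketch of a variant that tolerates those cases. The name make_index_list_safe is hypothetical, and treating stop < start as an empty range is an assumption, not something the original answer specifies.

```python
import numpy as np

def make_index_list_safe(start, stop):
    """Hypothetical variant of make_index_list that tolerates empty ranges."""
    start = np.asarray(start)
    stop = np.asarray(stop)
    # Assumption: stop < start is treated as an empty range
    lens = np.maximum(stop - start, 0)
    total = int(lens.sum())
    if total == 0:
        return np.empty(0, dtype=np.intp)
    # Position in the output where each range begins
    offsets = np.cumsum(lens) - lens
    # For a plain arange, out[offsets] == offsets, so no indexing is needed
    deltas = start - offsets
    return np.arange(total) + np.repeat(deltas, lens)
```

For example, make_index_list_safe([2, 7, 5], [5, 7, 9]) skips the empty middle range and returns the indices 2, 3, 4, 5, 6, 7, 8.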

A similar question about python - Simple NumPy vectorization can be found on Stack Overflow: https://stackoverflow.com/questions/23699189/
