python - NumPy 中的累积加法/乘法

有一个相对简单的代码块，循环遍历两个数组，进行相乘和累加:

import numpy as np

a = np.array([1, 2, 4, 6, 7, 8, 9, 11])
b = np.array([0.01, 0.2, 0.03, 0.1, 0.1, 0.6, 0.5, 0.9])
c = []
d = 0
for i, val in enumerate(a):
    d += val
    c.append(d)
    d *= b[i]

有没有一种不用迭代就可以做到这一点的方法？我想可以使用 cumsum/cumprod，但我不知道如何使用。当你一步一步分解发生的事情时，它看起来像这样:

# 0: 0 + a[0]
# 1: ((0 + a[0]) * b[0]) + a[1]
# 2: ((((0 + a[0]) * b[0]) + a[1]) * b[1]) + a[2]

编辑澄清:我对列表(或数组)c 感兴趣。

最佳答案

在每次迭代中，您有 -

d[n+1] = d[n] + a[n]
d[n+1] = d[n+1] * b[n]

因此，本质上-

d[n+1] = (d[n] + a[n]) * b[n]

即-

d[n+1] = (d[n]* b[n]) + K[n]   #where `K[n] = a[n] * b[n]`

现在，使用这个公式，如果您写下直到 n = 2 个情况的表达式，您将有 -

d[1] = d[0]*b[0] + K[0]

d[2] = d[0]*b[0]*b[1] + K[0]*b[1] + K[1]

d[3] = d[0]*b[0]*b[1]*b[2] + K[0]*b[1]*b[2] + K[1]*b[2] + K[2]

Scalars      :    b[0]*b[1]*b[2]     b[1]*b[2]        b[2]       1 
Coefficients :         d[0]             K[0]          K[1]       K[2]

因此，您需要将 b 的 cumprod 反向，对 K 数组执行逐元素乘法。最后，要获得 c，执行 cumsum 并且由于 c 在按 b 缩小之前存储，所以你会需要通过 b 的反向 cumprod 缩小 cumsum 版本。

最终的实现是这样的——

# Get reversed cumprod of b and pad with `1` at the end
b_rev_cumprod = b[::-1].cumprod()[::-1]
B = np.hstack((b_rev_cumprod,1))

# Get K
K = a*b

# Append with 0 at the start, corresponding starting d
K_ext = np.hstack((0,K))

# Perform elementwsie multiplication and cumsum and scale down for final c
sums = (B*K_ext).cumsum()
c = sums[1:]/b_rev_cumprod

运行时测试和验证输出

函数定义-

def original_approach(a,b):
    c = []
    d = 0
    for i, val in enumerate(a):
        d = d+val
        c.append(d)
        d = d*b[i]
    return c

def vectorized_approach(a,b): 
    b_rev_cumprod = b[::-1].cumprod()[::-1]
    B = np.hstack((b_rev_cumprod,1))

    K = a*b
    K_ext = np.hstack((0,K))
    sums = (B*K_ext).cumsum()
    return sums[1:]/b_rev_cumprod

运行时和验证

案例 #1:OP 示例案例

In [301]: # Inputs
     ...: a = np.array([1, 2, 4, 6, 7, 8, 9, 11])
     ...: b = np.array([0.01, 0.2, 0.03, 0.1, 0.1, 0.6, 0.5, 0.9])
     ...: 

In [302]: original_approach(a,b)
Out[302]: 
[1,
 2.0099999999999998,
 4.4020000000000001,
 6.1320600000000001,
 7.6132059999999999,
 8.7613205999999995,
 14.256792359999999,
 18.128396179999999]

In [303]: vectorized_approach(a,b)
Out[303]: 
array([  1.        ,   2.01      ,   4.402     ,   6.13206   ,
         7.613206  ,   8.7613206 ,  14.25679236,  18.12839618])

案例 #2:大输入案例

In [304]: # Inputs
     ...: N = 1000
     ...: a = np.random.randint(0,100000,N)
     ...: b = np.random.rand(N)+0.1
     ...: 

In [305]: np.allclose(original_approach(a,b),vectorized_approach(a,b))
Out[305]: True

In [306]: %timeit original_approach(a,b)
1000 loops, best of 3: 746 µs per loop

In [307]: %timeit vectorized_approach(a,b)
10000 loops, best of 3: 76.9 µs per loop

请注意，对于非常大的输入数组情况，如果 b 元素是这么小的分数，由于累积操作，b_rev_cumprod 的初始数字可能会出现zeros 在这些初始位置导致 NaN。

关于python - NumPy 中的累积加法/乘法，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/34182755/

python - NumPy 中的累积加法/乘法

上一篇：python - axes.prop_cycle 的 Matplotlib 1.5 用法

下一篇：python - 将文件上传到保管箱时的进度条