python - numpy/scipy/python 中的快速 K 步折扣

x 的形状为 [batch_size, n_time]，其中批处理是独立的

如果 k=3，d=discount_rate。伪代码:

x[:,i] = x[:,i] + x[:,i+1]*(d**1) + x[:,i+2]*(d**2) + x[:,i+3]*(d**3)

这是工作代码，但速度非常慢。我将执行此函数数百万次，所以我希望实现速度更快

import numpy as np

def k_step_discount(x, k, discount_rate):
    n_time = x.shape[1]
    k_include_cur = k + 1 # k excludes current timestep
    for i in range(n_time):
        k_cur = min(n_time - i, k_include_cur) # prevent out of bounds
        for j in range(1, k_cur):
            x[:, i] += x[:, i+j] * (discount_rate ** j)
    return x

x = np.array([
    [0,0,0,1,0,0],
    [0,1,2,3,4,5.]
])

y = k_step_discount(x+0, k=2, discount_rate=.9)
print('x\n{}\ny\n{}'.format(x, y))


>> x
   [[ 0.  0.  0.  1.  0.  0.]
    [ 0.  1.  2.  3.  4.  5.]]
>> y
   [[  0.     0.81   0.9    1.     0.     0.  ]
    [  2.52   5.23   7.94  10.65   8.5    5.  ]]

一个类似的 scipy 函数是:

import scipy.signal
import numpy as np

x = np.array([[0,0,0,1,0,0.]])
discount_rate = .9

y = np.flip(scipy.signal.lfilter([1], [1, -discount_rate], np.flip(x+0, 1), axis=1), 1)
print('x\n{}\ny\n{}'.format(x, y))
>> x
   [[ 0.  0.  0.  1.  0.  0.]]
>> y
   [[ 0.729  0.81   0.9    1.     0.     0.   ]]

但是，它会一直打折到 n_time 结束，而不是只打折 k 步

我也对无批处理的 K 步折扣感兴趣，如果那样更容易/更快的话

import numpy as np

def k_step_discount_no_batch(x, k, discount_rate):
    n_time = x.shape[0]
    k_include_cur = k + 1 # k excludes current timestep
    for i in range(n_time):
        k_cur = min(n_time - i, k_include_cur) # prevent out of bounds
        for j in range(1, k_cur):
            x[i] += x[i+j] * (discount_rate ** j)
    return x

x = np.array([8,0,0,0,1,2.])
y = k_step_discount_no_batch(x+0, k=2, discount_rate=.9)
print('x\n{}\ny\n{}'.format(x, y))

>> x
   [ 8.  0.  0.  0.  1.  2.]
>> y
   [ 8.    0.    0.81  2.52  2.8   2.  ]

类似的 no_batch scipy 函数

import scipy.signal
import numpy as np

x = np.array([8,0,0,0,1,2.])    
discount_rate = .9

y = scipy.signal.lfilter([1], [1, -discount_rate], x[::-1], axis=0)[::-1]
print('x\n{}\ny\n{}'.format(x, y))
>> x
   [ 8.  0.  0.  0.  1.  2.]

>> y
   [ 9.83708  2.0412   2.268    2.52     2.8      2.     ]

最佳答案

你可以使用 2D convolution这里。为了正确完成缩放，我们需要创建适当的 2D 内核，这将是 discount_rate 的幂缩放数字的翻转版本。这符合 卷积 的定义，其中内核按照输入数据的翻转顺序滑动，其元素与那些内核元素一起缩放并相加，正如在这种情况下所做的那样。

因此，实现很简单——

from scipy.signal import convolve2d as conv2d
import numpy as np

def k_step_discount(x, k, discount_rate, is_batch=True):
    if is_batch:
        kernel = discount_rate**np.arange(k+1)[::-1][None]
        return conv2d(x,kernel)[:,k:]
    else:
        kernel = discount_rate**np.arange(k+1)[::-1]
        return np.convolve(x, kernel)[k:]

sample 运行-

In [190]: x
Out[190]: 
array([[ 0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  1.,  2.,  3.,  4.,  5.]])

# Proposed method
In [191]: k_step_discount_conv2d(x, k=2, discount_rate=0.9)
Out[191]: 
array([[  0.  ,   0.81,   0.9 ,   1.  ,   0.  ,   0.  ],
       [  2.52,   5.23,   7.94,  10.65,   8.5 ,   5.  ]])

# Original loopy method
In [192]: k_step_discount(x, k=2, discount_rate=.9)
Out[192]: 
array([[  0.  ,   0.81,   0.9 ,   1.  ,   0.  ,   0.  ],
       [  2.52,   5.23,   7.94,  10.65,   8.5 ,   5.  ]])

运行时测试

In [206]: x = np.random.randint(0,9,(100,1000)).astype(float)

In [207]: %timeit k_step_discount_conv2d(x, k=2, discount_rate=0.9)
1000 loops, best of 3: 1.27 ms per loop

In [208]: %timeit k_step_discount(x, k=2, discount_rate=.9)
100 loops, best of 3: 4.83 ms per loop

使用更大的k:

In [215]: x = np.random.randint(0,9,(100,1000)).astype(float)

In [216]: %timeit k_step_discount_conv2d(x, k=20, discount_rate=0.9)
100 loops, best of 3: 5.44 ms per loop

In [217]: %timeit k_step_discount(x, k=20, discount_rate=.9)
10 loops, best of 3: 44.8 ms per loop

因此，预计 k 越大 会出现巨大的加速!

进一步提升

根据 @Eric 的建议，我们还可以利用 scipy.ndimage.filters's 1D convolution在这里。

对于与 Scipy 的 2D 和 1D 卷积方法的正确比较列表 -

from scipy.ndimage.filters import convolve1d as conv1d

def using_conv2d(x, k, discount_rate):
    kernel = discount_rate**np.arange(k+1)[::-1][None]
    return conv2d(x,kernel)[:,k:]

def using_conv1d(x, k, discount_rate):
    kernel = discount_rate**np.arange(k+1)[::-1]
    return conv1d(x,kernel, mode='constant', origin=k//2)

时间 -

In [100]: x = np.random.randint(0,9,(100,1000)).astype(float)

In [101]: out1 = using_conv2d(x, k=20, discount_rate=0.9)
     ...: out2 = using_conv1d(x, k=20, discount_rate=0.9)
     ...: 

In [102]: np.allclose(out1, out2)
Out[102]: True

In [103]: %timeit using_conv2d(x, k=20, discount_rate=0.9)
100 loops, best of 3: 5.27 ms per loop

In [104]: %timeit using_conv1d(x, k=20, discount_rate=0.9)
1000 loops, best of 3: 1.43 ms per loop

关于python - numpy/scipy/python 中的快速 K 步折扣，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/43745570/

python - numpy/scipy/python 中的快速 K 步折扣

上一篇：python - 将 DataFrame 的列绘制为针对相同 y 列的散点图

下一篇：python - 在 while 循环中更改 tkinter Canvas 中的图像