python - numpy/scipy/python 中的快速 K 步折扣

标签 python numpy scipy

x 的形状为 [batch_size, n_time],其中批处理是独立的

如果 k=3,d=discount_rate。伪代码:

x[:,i] = x[:,i] + x[:,i+1]*(d**1) + x[:,i+2]*(d**2) + x[:,i+3]*(d**3)

这是工作代码,但速度非常慢。我将执行此函数数百万次,所以我希望实现速度更快

import numpy as np

def k_step_discount(x, k, discount_rate):
    n_time = x.shape[1]
    k_include_cur = k + 1 # k excludes current timestep
    for i in range(n_time):
        k_cur = min(n_time - i, k_include_cur) # prevent out of bounds
        for j in range(1, k_cur):
            x[:, i] += x[:, i+j] * (discount_rate ** j)
    return x

x = np.array([
    [0,0,0,1,0,0],
    [0,1,2,3,4,5.]
])

y = k_step_discount(x+0, k=2, discount_rate=.9)
print('x\n{}\ny\n{}'.format(x, y))


>> x
   [[ 0.  0.  0.  1.  0.  0.]
    [ 0.  1.  2.  3.  4.  5.]]
>> y
   [[  0.     0.81   0.9    1.     0.     0.  ]
    [  2.52   5.23   7.94  10.65   8.5    5.  ]]

一个类似的 scipy 函数是:

import scipy.signal
import numpy as np

x = np.array([[0,0,0,1,0,0.]])
discount_rate = .9

y = np.flip(scipy.signal.lfilter([1], [1, -discount_rate], np.flip(x+0, 1), axis=1), 1)
print('x\n{}\ny\n{}'.format(x, y))
>> x
   [[ 0.  0.  0.  1.  0.  0.]]
>> y
   [[ 0.729  0.81   0.9    1.     0.     0.   ]]

但是,它会一直打折到 n_time 结束,而不是只打折 k 步

我也对无批处理的 K 步折扣感兴趣,如果那样更容易/更快的话

import numpy as np

def k_step_discount_no_batch(x, k, discount_rate):
    n_time = x.shape[0]
    k_include_cur = k + 1 # k excludes current timestep
    for i in range(n_time):
        k_cur = min(n_time - i, k_include_cur) # prevent out of bounds
        for j in range(1, k_cur):
            x[i] += x[i+j] * (discount_rate ** j)
    return x

x = np.array([8,0,0,0,1,2.])
y = k_step_discount_no_batch(x+0, k=2, discount_rate=.9)
print('x\n{}\ny\n{}'.format(x, y))

>> x
   [ 8.  0.  0.  0.  1.  2.]
>> y
   [ 8.    0.    0.81  2.52  2.8   2.  ]

类似的 no_batch scipy 函数

import scipy.signal
import numpy as np

x = np.array([8,0,0,0,1,2.])    
discount_rate = .9

y = scipy.signal.lfilter([1], [1, -discount_rate], x[::-1], axis=0)[::-1]
print('x\n{}\ny\n{}'.format(x, y))
>> x
   [ 8.  0.  0.  0.  1.  2.]

>> y
   [ 9.83708  2.0412   2.268    2.52     2.8      2.     ]

最佳答案

你可以使用 2D convolution这里。为了正确完成缩放,我们需要创建适当的 2D 内核,这将是 discount_rate 的幂缩放数字的翻转版本。这符合 卷积 的定义,其中内核按照输入数据的翻转顺序滑动,其元素与那些内核元素一起缩放并相加,正如在这种情况下所做的那样。

因此,实现很简单——

from scipy.signal import convolve2d as conv2d
import numpy as np

def k_step_discount(x, k, discount_rate, is_batch=True):
    if is_batch:
        kernel = discount_rate**np.arange(k+1)[::-1][None]
        return conv2d(x,kernel)[:,k:]
    else:
        kernel = discount_rate**np.arange(k+1)[::-1]
        return np.convolve(x, kernel)[k:]

sample 运行-

In [190]: x
Out[190]: 
array([[ 0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  1.,  2.,  3.,  4.,  5.]])

# Proposed method
In [191]: k_step_discount_conv2d(x, k=2, discount_rate=0.9)
Out[191]: 
array([[  0.  ,   0.81,   0.9 ,   1.  ,   0.  ,   0.  ],
       [  2.52,   5.23,   7.94,  10.65,   8.5 ,   5.  ]])

# Original loopy method
In [192]: k_step_discount(x, k=2, discount_rate=.9)
Out[192]: 
array([[  0.  ,   0.81,   0.9 ,   1.  ,   0.  ,   0.  ],
       [  2.52,   5.23,   7.94,  10.65,   8.5 ,   5.  ]])

运行时测试

In [206]: x = np.random.randint(0,9,(100,1000)).astype(float)

In [207]: %timeit k_step_discount_conv2d(x, k=2, discount_rate=0.9)
1000 loops, best of 3: 1.27 ms per loop

In [208]: %timeit k_step_discount(x, k=2, discount_rate=.9)
100 loops, best of 3: 4.83 ms per loop

使用更大的k:

In [215]: x = np.random.randint(0,9,(100,1000)).astype(float)

In [216]: %timeit k_step_discount_conv2d(x, k=20, discount_rate=0.9)
100 loops, best of 3: 5.44 ms per loop

In [217]: %timeit k_step_discount(x, k=20, discount_rate=.9)
10 loops, best of 3: 44.8 ms per loop

因此,预计 k 越大 会出现巨大的加速!


进一步提升

根据 @Eric 的建议,我们还可以利用 scipy.ndimage.filters's 1D convolution在这里。

对于与 Scipy 的 2D1D 卷积方法的正确比较列表 -

from scipy.ndimage.filters import convolve1d as conv1d

def using_conv2d(x, k, discount_rate):
    kernel = discount_rate**np.arange(k+1)[::-1][None]
    return conv2d(x,kernel)[:,k:]

def using_conv1d(x, k, discount_rate):
    kernel = discount_rate**np.arange(k+1)[::-1]
    return conv1d(x,kernel, mode='constant', origin=k//2)

时间 -

In [100]: x = np.random.randint(0,9,(100,1000)).astype(float)

In [101]: out1 = using_conv2d(x, k=20, discount_rate=0.9)
     ...: out2 = using_conv1d(x, k=20, discount_rate=0.9)
     ...: 

In [102]: np.allclose(out1, out2)
Out[102]: True

In [103]: %timeit using_conv2d(x, k=20, discount_rate=0.9)
100 loops, best of 3: 5.27 ms per loop

In [104]: %timeit using_conv1d(x, k=20, discount_rate=0.9)
1000 loops, best of 3: 1.43 ms per loop

关于python - numpy/scipy/python 中的快速 K 步折扣,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43745570/

相关文章:

python - 将索引列表转换为 boolean 掩码的快速方法

python - 使用 pandas.SparseSeries.from_coo() 函数的非 NDFFrame 对象错误

python - 将 url 解码为 utf-8-sig 时起始字节无效

python - LSTM Keras 输入形状混淆

Python多线程+多处理BrokenPipeError(子进程未退出?)

python - 在将 asyncio websockets 作为类实现后接收流数据?

python - 将列向量连接到矩阵的末尾

Python/Numpy : Vectorizing repeated row insertion in a 2D array

python - 将高斯拟合到测量峰值

python - 绘制从一组到另一组的复杂函数