python - Calculate a function on array slices in a vectorized way

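Let's say I have 1D numpy arrays X (features) and Y (binary classes), and a function f that takes two slices of X and two slices of Y and computes a number.

I also have an array of indices S by which I need to split X and Y. It is guaranteed that no slice will be empty.

So my code looks like this: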

def f(x_left, y_left, x_right, y_right):
    n = x_left.shape[0] + x_right.shape[0]

    lcond = y_left == 1
    rcond = y_right == 1

    hleft = 1 - ((y_left[lcond].shape[0])**2
                     + (y_left[~lcond].shape[0])**2) / n**2

    hright = 1 - ((y_right[rcond].shape[0])**2
                     + (y_right[~rcond].shape[0])**2) / n**2

    return -(x_left.shape[0] / n) * hleft - (x_right.shape[0] / n) * hright

results = np.empty(len(S))
for i in range(len(S)):
    results[i] = f(X[:S[i]], Y[:S[i]], X[S[i]:], Y[S[i]:])

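The results array must contain the result of f for each split given by S.

len(results) == len(S)

My question is how to perform this computation in a vectorized way with numpy, so that the code runs faster?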

Best answer

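First, let's make your function a bit more efficient. You are doing some unnecessary indexing: instead of y_left[lcond].shape[0] you only need lcond.sum(), or len(lcond.nonzero()[0]), which seems to be faster.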
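The equivalence of these counting approaches can be checked on a small array. This is a minimal sketch with a made-up `y` array; `np.count_nonzero` is a further standard NumPy option that the answer does not mention.

```python
import numpy as np

# Hypothetical binary labels, for illustration only
y = np.array([1, 0, 1, 1, 0, 0, 1])
cond = y == 1

# Original approach: build a filtered copy, then read its length
count_by_indexing = y[cond].shape[0]

# Cheaper alternatives: count the True entries of the mask directly
count_by_sum = cond.sum()
count_by_nonzero = len(cond.nonzero()[0])
count_by_count_nonzero = np.count_nonzero(cond)

# All four counts agree
print(count_by_indexing, count_by_sum, count_by_nonzero, count_by_count_nonzero)
```

Which variant is fastest likely depends on array size and NumPy version, so it is worth timing them on your own data.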

Here is an improved loopy version of your code (complete with dummy input):

import numpy as np           

n = 1000                     
X = np.random.randint(0,n,n) 
Y = np.random.randint(0,n,n) 
S = np.random.choice(n//2, n)

def f2(x, y, s):                                     
    """Same loopy solution as original, only faster"""
    n = x.size                                       
    isone = y == 1                                   
    lval = len(isone[:s].nonzero()[0])               
    rval = len(isone[s:].nonzero()[0])               

    hleft = 1 - (lval**2 + (s - lval)**2) / n**2     
    hright = 1 - (rval**2 + (n - s - rval)**2) / n**2

    return - s / n * hleft - (n - s) / n * hright

def time_newloop():                                   
    """Callable front-end for timing comparisons"""   
    results = np.empty(len(S))                        
    for i in range(len(S)):                           
        results[i] = f2(X, Y, S[i])                   
    return results

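The changes are fairly straightforward.

Now, it turns out that we can indeed vectorize your loop. To do this we have to compare against every element of S at the same time. The way to do that is to create a 2D mask of shape (nS, n) (where S.size == nS) in which each row cuts off the values at the corresponding element of S. Here's how: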

def f3(X, Y, S):                                     
    """Vectorized solution working on all the data at the same time"""
    n = X.size                                                        
    leftmask = np.arange(n) < S[:,None] # boolean, shape (nS, n)      
    rightmask = ~leftmask # boolean, shape (nS, n)              

    isone = Y == 1 # shape (n,)                                 
    lval = (isone & leftmask).sum(axis=1) # shape (nS,)         
    rval = (isone & rightmask).sum(axis=1) # shape (nS,)        

    hleft = 1 - (lval**2 + (S - lval)**2) / n**2                
    hright = 1 - (rval**2 + (n - S - rval)**2) / n**2           

    return - S / n * hleft - (n - S) / n * hright # shape (nS,) 

def time_vector():                                             
    """Trivial front-end for fair timing"""                    
    return f3(X,Y,S)
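To see what the broadcast mask looks like, here is a small sketch on made-up inputs (the values of `n`, `S` and `Y` are illustrative assumptions, not taken from the answer). Each row of `leftmask` selects the left slice for one split point, and the row sums reproduce the per-slice counts.

```python
import numpy as np

n = 5
S = np.array([1, 3, 4])          # hypothetical split points
Y = np.array([1, 0, 1, 1, 0])    # hypothetical binary labels

# Row i is True for positions < S[i], i.e. the left slice of split i
leftmask = np.arange(n) < S[:, None]    # shape (3, 5)

isone = Y == 1
lval = (isone & leftmask).sum(axis=1)   # ones in Y[:S[i]]
rval = (isone & ~leftmask).sum(axis=1)  # ones in Y[S[i]:]

# The same counts computed with explicit slices, for comparison
lval_loop = [int(isone[:s].sum()) for s in S]
rval_loop = [int(isone[s:].sum()) for s in S]

print(leftmask.astype(int))
print(lval.tolist(), lval_loop)
print(rval.tolist(), rval_loop)
```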

Defining your original solution as time_orig(), we can check that the results are the same:

>>> np.array_equal(time_orig(), time_newloop()), np.array_equal(time_orig(), time_vector())
(True, True)
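The answer does not show time_orig(). A plausible reconstruction, assuming it simply wraps the question's loop in a function, might look like the sketch below. The dummy inputs here are my own assumption: they use a seeded generator, binary labels, and split points drawn from 1 to n-1 so that no slice is empty.

```python
import numpy as np

def f(x_left, y_left, x_right, y_right):
    """The function from the question, with unchanged behavior"""
    n = x_left.shape[0] + x_right.shape[0]
    lcond = y_left == 1
    rcond = y_right == 1
    hleft = 1 - (y_left[lcond].shape[0]**2
                 + y_left[~lcond].shape[0]**2) / n**2
    hright = 1 - (y_right[rcond].shape[0]**2
                  + y_right[~rcond].shape[0]**2) / n**2
    return -(x_left.shape[0] / n) * hleft - (x_right.shape[0] / n) * hright

# Assumed dummy inputs (not the ones used for the answer's timings)
rng = np.random.default_rng(0)
n = 1000
X = rng.integers(0, n, n)
Y = rng.integers(0, 2, n)   # binary classes
S = rng.integers(1, n, n)   # split points in 1..n-1, so no empty slice

def time_orig():
    """Assumed front-end: the question's loop wrapped in a function"""
    results = np.empty(len(S))
    for i in range(len(S)):
        results[i] = f(X[:S[i]], Y[:S[i]], X[S[i]:], Y[S[i]:])
    return results
```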

And the runtimes with the above random inputs:

>>> %timeit time_orig()
... %timeit time_newloop()
... %timeit time_vector()
... 
... 
19 ms ± 501 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
11.4 ms ± 214 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.93 ms ± 37.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

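This means that the loopy version above is almost twice as fast as the original loopy version, and the vectorized version is faster by roughly another factor of three. Of course, the latter improvement comes at the cost of increased memory use: instead of arrays of shape (n,) you now have arrays of shape (nS, n), which can get quite large if your input arrays are large. But as they say, there is no free lunch: with vectorization you often trade memory for runtime.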
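If the (nS, n) mask is too large to hold in memory, one alternative that is not part of the answer is to precompute a prefix sum of the `Y == 1` mask and look up the counts by index. The sketch below is a swapped-in technique under that assumption; the name `f_cumsum` is hypothetical, and it assumes S holds integer split points in the range 0 to n.

```python
import numpy as np

def f_cumsum(X, Y, S):
    """Prefix-sum variant: avoids building any (nS, n) array"""
    n = X.size
    isone = Y == 1
    # csum[k] = number of ones in Y[:k]; the leading 0 covers k == 0
    csum = np.concatenate(([0], np.cumsum(isone)))
    lval = csum[S]             # ones in Y[:S[i]], shape (nS,)
    rval = csum[-1] - lval     # ones in Y[S[i]:], shape (nS,)

    hleft = 1 - (lval**2 + (S - lval)**2) / n**2
    hright = 1 - (rval**2 + (n - S - rval)**2) / n**2

    return -S / n * hleft - (n - S) / n * hright  # shape (nS,)
```

This keeps the formulas of the vectorized version and only changes how the counts are obtained, so temporary arrays have length n + 1 or nS rather than nS * n. I have not timed it against the versions above.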

Source: Stack Overflow, https://stackoverflow.com/questions/53264919/
