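What is the memory consumption of an arithmetic NumPy expression such as
vec ** 3 + vec ** 2 + vec
(where vec is a numpy.ndarray)? Is an array stored for each intermediate operation? Can such a compound expression use several times the memory of the underlying ndarray?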
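Best answer
You are right: a new array is allocated for each intermediate result. Fortunately, the numexpr package is designed to address exactly this problem. From its description: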
The main reason why NumExpr achieves better performance than NumPy is that it avoids allocating memory for intermediate results. This results in better cache utilization and reduces memory access in general. Due to this, NumExpr works best with large arrays.
Example:
In [97]: xs = np.random.rand(1_000_000)
In [98]: %timeit xs ** 3 + xs ** 2 + xs
26.8 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [99]: %timeit numexpr.evaluate('xs ** 3 + xs ** 2 + xs')
1.43 ms ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
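For comparison, the temporaries can also be avoided in plain NumPy by writing every step into one preallocated buffer through the ufunc `out=` argument. The sketch below is only an illustration, not part of the original benchmark: the helper names `naive`, `in_place` and `peak_bytes` are made up for this example, the input is assumed to be a float array, and the peak figures reported by `tracemalloc` depend on the NumPy version and platform.

```python
import tracemalloc

import numpy as np


def naive(vec):
    # Each operator returns a newly allocated array for its result.
    return vec ** 3 + vec ** 2 + vec


def in_place(vec):
    # Horner form: vec**3 + vec**2 + vec == ((vec + 1) * vec + 1) * vec
    out = vec + 1.0                 # the only full-size allocation
    np.multiply(out, vec, out=out)  # (vec + 1) * vec
    np.add(out, 1.0, out=out)       # (vec + 1) * vec + 1
    np.multiply(out, vec, out=out)  # ((vec + 1) * vec + 1) * vec
    return out


def peak_bytes(func, vec):
    """Run func(vec) and return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = func(vec)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


vec = np.random.rand(1_000_000)
res_naive, peak_naive = peak_bytes(naive, vec)
res_in_place, peak_in_place = peak_bytes(in_place, vec)

print(f"input array: {vec.nbytes / 1e6:.1f} MB")
print(f"naive peak:    {peak_naive / 1e6:.1f} MB")
print(f"in-place peak: {peak_in_place / 1e6:.1f} MB")
```

The in-place version trades readability for a smaller footprint, while numexpr achieves a similar effect without rewriting the expression by hand.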
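Thanks to @max9111 for pointing out that numexpr simplifies the power operations to multiplications. Most of the difference in the benchmark appears to be explained by the optimization of xs ** 3.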
In [421]: %timeit xs * xs
1.62 ms ± 12 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [422]: %timeit xs ** 2
1.63 ms ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [423]: %timeit xs ** 3
22.8 ms ± 283 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [424]: %timeit xs * xs * xs
2.52 ms ± 58.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
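The same comparison can be run outside IPython with the standard `timeit` module. This is a minimal sketch assuming a float array of the same size as above. The `candidates` dictionary and its labels are illustrative only, and the absolute timings will vary with hardware and NumPy version, so no expected numbers are given.

```python
import timeit

import numpy as np

xs = np.random.rand(1_000_000)

# Two spellings of the same polynomial: powers versus repeated multiplication.
candidates = {
    "xs ** 3 + xs ** 2 + xs": lambda: xs ** 3 + xs ** 2 + xs,
    "xs * xs * xs + xs * xs + xs": lambda: xs * xs * xs + xs * xs + xs,
}

reference = xs ** 3 + xs ** 2 + xs
results = {}
for label, func in candidates.items():
    # Check that both spellings agree before timing them.
    assert np.allclose(func(), reference)
    # Best of 3 repeats, 10 calls each, reported per call.
    seconds = min(timeit.repeat(func, number=10, repeat=3)) / 10
    results[label] = seconds
    print(f"{label:<30} {seconds * 1e3:.2f} ms per call")
```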
Source: the Stack Overflow question on memory consumption in arithmetic expressions of ufuncs (python): https://stackoverflow.com/questions/50528634/