I have a huge array and I want to compute its dot product with a small array. But I get an "array is too big" error. Is there a workaround?
import numpy as np
eMatrix = np.random.random_integers(low=0,high=100,size=(20000000,50))
pMatrix = np.random.random_integers(low=0,high=10,size=(50,50))
a = np.dot(eMatrix,pMatrix)
Error:
/Library/Python/2.7/site-packages/numpy/random/mtrand.so in mtrand.RandomState.random_integers (numpy/random/mtrand/mtrand.c:9385)()
/Library/Python/2.7/site-packages/numpy/random/mtrand.so in mtrand.RandomState.randint (numpy/random/mtrand/mtrand.c:7051)()
ValueError: array is too big.
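Best answer
That error is raised when computing the total size of the array overflows the native int type; the original answer links to the exact source code line.
For that to happen, regardless of whether your machine is 64-bit, you are almost certainly running a 32-bit build of Python (and NumPy). You can check whether that is the case with: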
>>> import sys
>>> sys.maxsize
2147483647 # <--- 2**31 - 1, on a 64 bit version you would get 2**63 - 1
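The same check can be wrapped in a small script. This is only a sketch: the helper names `python_bitness` and `is_32bit_python` are made up for illustration, and `struct.calcsize("P")` (the size of a C pointer in the running interpreter) is used as a second, independent indicator alongside `sys.maxsize`.

```python
import struct
import sys


def python_bitness():
    """Return the pointer width of the running interpreter, in bits."""
    # "P" is the struct format code for a C void*; its size is 4 bytes
    # on a 32-bit build and 8 bytes on a 64-bit build.
    return struct.calcsize("P") * 8


def is_32bit_python():
    """True if sys.maxsize matches the 32-bit value 2**31 - 1."""
    return sys.maxsize == 2**31 - 1


print("interpreter bitness:", python_bitness())
print("sys.maxsize:", sys.maxsize)
print("32-bit build:", is_32bit_python())
```

Both indicators describe the interpreter, not the operating system, so a 32-bit Python installed on a 64-bit machine still reports 32.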
Then again, your array is "only" 20000000 * 50 = 1000000000 elements, which is just under 2**30. If I try to reproduce your result on a 32-bit NumPy, I get a MemoryError:
>>> np.random.random_integers(low=0,high=100,size=(20000000,50))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "mtrand.pyx", line 1420, in mtrand.RandomState.random_integers (numpy\random\mtrand\mtrand.c:12943)
File "mtrand.pyx", line 938, in mtrand.RandomState.randint (numpy\random\mtrand\mtrand.c:10338)
MemoryError
That is, unless I increase the size beyond the magic 2**31 - 1 threshold:
>>> np.random.random_integers(low=0,high=100,size=(2**30, 2))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "mtrand.pyx", line 1420, in mtrand.RandomState.random_integers (numpy\random\mtrand\mtrand.c:12943)
File "mtrand.pyx", line 938, in mtrand.RandomState.randint (numpy\random\mtrand\mtrand.c:10338)
ValueError: array is too big.
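The arithmetic behind the two cases above can be checked in plain Python. This is a rough sketch: `element_count` is a hypothetical helper, and the 4-byte item size is an assumption (the native integer width on a typical 32-bit build), used only to estimate the memory footprint.

```python
INT32_MAX = 2**31 - 1  # value of sys.maxsize on a 32-bit build


def element_count(shape):
    """Multiply the dimensions of a shape tuple together."""
    total = 1
    for dim in shape:
        total *= dim
    return total


question_elems = element_count((20000000, 50))  # shape from the question
answer_elems = element_count((2**30, 2))        # shape that raises ValueError above

print(question_elems, question_elems <= INT32_MAX)  # 1000000000 True
print(answer_elems, answer_elems <= INT32_MAX)      # 2147483648 False

# Assumed item size of 4 bytes; the real dtype depends on the platform.
ASSUMED_ITEMSIZE = 4
print(question_elems * ASSUMED_ITEMSIZE / 2.0**30)  # footprint in GiB, about 3.7
```

The question's element count fits in a signed 32-bit integer, while the second shape is one past the limit. Under the assumed item size, the question's array would still need about 3.7 GiB, more than a 32-bit process can normally allocate in one piece, which is consistent with the MemoryError.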
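Given that the line numbers in your traceback and mine differ, I suspect you are using an older version. What does this output on your system: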
>>> np.__version__
'1.10.0.dev-9c50f98'
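As for the workaround the question asks about: besides switching to a 64-bit Python, one possible approach (not part of the original answer) is to never hold the whole big array in memory and to multiply it block by block along its rows. The sketch below assumes the rows can be produced or loaded in chunks; `iter_block_dots` and `make_block` are hypothetical names, and a small 1000-row array stands in for the 20000000-row `eMatrix`.

```python
import numpy as np


def iter_block_dots(n_rows, small, make_block, block_rows=100000):
    """Yield (start, block.dot(small)) for consecutive row blocks.

    make_block(start, stop) must return rows start..stop-1 of the big
    matrix as an array of shape (stop - start, small.shape[0]).
    """
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        block = make_block(start, stop)
        yield start, block.dot(small)


rng = np.random.RandomState(0)
pMatrix = rng.randint(0, 11, size=(50, 50))     # values 0..10
full = rng.randint(0, 101, size=(1000, 50))     # small stand-in for eMatrix


def make_block(start, stop):
    # Hypothetical loader: here it just slices the stand-in array, but it
    # could read from disk or generate the rows on demand.
    return full[start:stop]


# Reduce each partial product immediately instead of storing all of them.
column_sums = np.zeros(50, dtype=np.int64)
for start, partial in iter_block_dots(full.shape[0], pMatrix, make_block, block_rows=256):
    column_sums += partial.sum(axis=0)
```

Note that the full product has as many rows as the big input, so the partial results also have to be reduced, consumed, or written out block by block; collecting them into one array would hit the same size limit.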
Source: a similar question on Stack Overflow about the dot product of huge arrays in numpy (Python): https://stackoverflow.com/questions/25688079/