python - scipy.sparse dot is extremely slow in Python

The following code does not even finish on my system:

import numpy as np
from scipy import sparse
p = 100
n = 50
X = np.random.randn(p,n)
L = sparse.eye(p,p, format='csc')
X.T.dot(L).dot(X)

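Is there an explanation for why this matrix multiplication hangs?

Best answer

X.T.dot(L) is not the 50x100 numeric matrix you might expect. It is a 50x100 array in which every element is itself a 100x100 sparse matrix, so the subsequent .dot(X) ends up operating on 5,000 sparse matrices rather than on numbers: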

>>> X.T.dot(L).shape
(50, 100)
>>> X.T.dot(L)[0,0]
<100x100 sparse matrix of type '<type 'numpy.float64'>'
    with 100 stored elements in Compressed Sparse Column format>

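The problem appears to be the dot method of X: X is a plain NumPy array, and its dot method knows nothing about sparse matrices. You therefore have to convert the sparse matrix to a dense one with its todense or toarray method. The former returns a matrix object, the latter an array: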

>>> X.T.dot(L.todense()).dot(X)
matrix([[  81.85399873,    3.75640482,    1.62443625, ...,    6.47522251,
            3.42719396,    2.78630873],
        [   3.75640482,  109.45428475,   -2.62737229, ...,   -0.31310651,
            2.87871548,    8.27537382],
        [   1.62443625,   -2.62737229,  101.58919604, ...,    3.95235372,
            1.080478  ,   -0.16478654],
        ..., 
        [   6.47522251,   -0.31310651,    3.95235372, ...,   95.72988689,
          -18.99209596,   17.31774553],
        [   3.42719396,    2.87871548,    1.080478  , ...,  -18.99209596,
          108.90045569,  -16.20312682],
        [   2.78630873,    8.27537382,   -0.16478654, ...,   17.31774553,
          -16.20312682,  105.37102461]])

Alternatively, sparse matrices have a dot method of their own that does know about arrays:

>>> X.T.dot(L.dot(X))
array([[  81.85399873,    3.75640482,    1.62443625, ...,    6.47522251,
           3.42719396,    2.78630873],
       [   3.75640482,  109.45428475,   -2.62737229, ...,   -0.31310651,
           2.87871548,    8.27537382],
       [   1.62443625,   -2.62737229,  101.58919604, ...,    3.95235372,
           1.080478  ,   -0.16478654],
       ..., 
       [   6.47522251,   -0.31310651,    3.95235372, ...,   95.72988689,
         -18.99209596,   17.31774553],
       [   3.42719396,    2.87871548,    1.080478  , ...,  -18.99209596,
         108.90045569,  -16.20312682],
       [   2.78630873,    8.27537382,   -0.16478654, ...,   17.31774553,
         -16.20312682,  105.37102461]])

A similar question about scipy.sparse dot being extremely slow in Python can be found on Stack Overflow: https://stackoverflow.com/questions/14204406/
