I have a list of numbers with len(lex) = 6064.
It looks like this:
[0,
0,
1,
0,
0,
-1,
1,
1,
0,
0,
0,
0,
1,
0,]
and a CSR (compressed sparse row) matrix:
tweets.shape = (6064, 2500)
How can I merge them? I tried converting both to lists, but I get an error when I try to process the result:
tweets = list(tweets)
lex = list(lex)
tweets_final = np.column_stack([tweets, lex])
After splitting the data for training, I get the following error:
nb.fit(X_train, y_train)
ValueError: setting an array element with a sequence.
How can I add this list as a column of the matrix?
Best answer
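You can use scipy.sparse.hstack to stack the two horizontally (column-wise). The list only needs to be converted first, either to a sparse column vector or to a 2D array with a single column: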
scipy.sparse.hstack(( tweets, csr_matrix(lex).T ))
scipy.sparse.hstack(( tweets, np.asarray(lex)[:,None] ))
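The two variants above can be tried on small data. The following is a minimal sketch, not the asker's actual pipeline: the 6 x 4 matrix and the six-element list are hypothetical stand-ins for tweets and lex, and the name tweets_final_dense_col is made up for illustration. It also passes format="csr" to hstack, on the assumption that the stacked matrix will be row-sliced later (for example by a train/test split).

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack

# Hypothetical stand-ins for the question's data, far smaller than 6064 x 2500.
tweets = csr_matrix(np.array([[1, 0, 2, 0],
                              [0, 0, 0, 3],
                              [4, 0, 0, 0],
                              [0, 5, 0, 0],
                              [0, 0, 6, 0],
                              [7, 0, 0, 8]]))
lex = [0, 0, 1, 0, -1, 1]

# Variant 1: csr_matrix(lex) has shape (1, 6), so transpose it into a (6, 1) column.
col_sparse = csr_matrix(lex).T
# Variant 2: a dense 2D array with a single column, shape (6, 1).
col_dense = np.asarray(lex)[:, None]

# hstack returns COO format by default; format="csr" requests CSR instead,
# which supports row slicing.
tweets_final = hstack((tweets, col_sparse), format="csr")
tweets_final_dense_col = hstack((tweets, col_dense), format="csr")

print(tweets_final.shape)                      # (6, 5)
print(tweets_final[:, -1].toarray().ravel())   # the appended lex column
```

Both variants should give the same matrix; the sparse one avoids building a dense intermediate, which matters little for a single column.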
Sample run:
In [188]: import numpy as np
In [189]: from scipy.sparse import csr_matrix
In [194]: import scipy as sp
In [190]: a = np.random.randint(0,4,(5,10))
In [192]: a
Out[192]:
array([[2, 1, 1, 1, 0, 3, 1, 3, 2, 1],
[0, 2, 1, 2, 3, 0, 1, 1, 2, 3],
[0, 1, 1, 1, 2, 3, 0, 1, 0, 1],
[0, 0, 3, 0, 3, 0, 1, 0, 3, 1],
[1, 0, 2, 3, 3, 3, 2, 2, 0, 1]])
In [193]: b = [9,8,7,6,5] # equivalent to lex
In [191]: A = csr_matrix(a) # equivalent to tweets
In [195]: sp.sparse.hstack(( A, csr_matrix(b).T ))
Out[195]:
<5x11 sparse matrix of type '<type 'numpy.int64'>'
with 42 stored elements in COOrdinate format>
In [197]: _.toarray() # verify values by converting to dense array
Out[197]:
array([[2, 1, 1, 1, 0, 3, 1, 3, 2, 1, 9],
[0, 2, 1, 2, 3, 0, 1, 1, 2, 3, 8],
[0, 1, 1, 1, 2, 3, 0, 1, 0, 1, 7],
[0, 0, 3, 0, 3, 0, 1, 0, 3, 1, 6],
[1, 0, 2, 3, 3, 3, 2, 2, 0, 1, 5]])
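As for why the list-based attempt in the question fails, one plausible explanation is that list() on a sparse matrix does not produce rows of numbers. The sketch below uses hypothetical 3 x 3 data to show what np.column_stack receives in that case; it does not reproduce the exact ValueError, which is raised later when the estimator tries to convert the array.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical small stand-ins for tweets and lex.
tweets = csr_matrix(np.array([[1, 0, 2],
                              [0, 3, 0],
                              [4, 0, 5]]))
lex = [0, 1, -1]

# Iterating over a sparse matrix yields one sparse row per step,
# so this is a list of three 1 x 3 sparse matrices, not a list of numbers.
rows = list(tweets)

# NumPy stores each sparse row as an opaque Python object.
stacked = np.column_stack([rows, lex])

print(stacked.dtype)   # object
print(stacked.shape)   # (3, 2): one column of sparse-row objects plus the lex column
```

An object array like this has no numeric feature columns, so stacking with scipy.sparse.hstack and keeping the data sparse is the safer route.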
This question, "python - How to merge a list and a CSR matrix", was originally asked on Stack Overflow: https://stackoverflow.com/questions/46042279/