python - Remove the array([]) wrapper to create clean arrays for matrix equations

Tags: python arrays matrix scikit-learn vectorization

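I want to remove the array wrapper from my vectors so that I can assemble them into a matrix for equations. What is the best way to do this? I want each vector to be [0,0] rather than array([0,0]), so that the concatenated matrix is [[0,0],[0,1]] rather than [array([0,0]), array([0,1])].

My code: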

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer


#create all actual subject matters 
subjectmatters = ["basic", "python", "programming", "engineering", "mathematics", "logic", "hard", "html", "computers",
                  "design", "easy", "americanhistory", "history", "civilizations", "languagearts", "algebra",
                  "basicmath", "calculus", "nueralnets"]

#vectorize the subjects
vectorizer = CountVectorizer()
subjectmatters_vectorized = vectorizer.fit_transform(subjectmatters)
subjectmatters_vectorized_to_array = subjectmatters_vectorized.toarray()
subjectmatters_vectorized_to_array_shape = np.shape(subjectmatters_vectorized.toarray())
subjectvectordict = dict(zip(subjectmatters, subjectmatters_vectorized_to_array))
print(subjectvectordict)

This prints the following; I would like to get rid of the array([]) wrapper:

{
    "basic": array([0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "python": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]),
    "programming": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]),
    "engineering": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "mathematics": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]),
    "logic": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]),
    "hard": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]),
    "html": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]),
    "computers": array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "design": array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "easy": array([0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "americanhistory": array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "history": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]),
    "civilizations": array([0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "languagearts": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]),
    "algebra": array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "basicmath": array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "calculus": array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    "nueralnets": array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]),
}

Best answer

See if this is what you want:

from sklearn.feature_extraction.text import CountVectorizer

#create all actual subject matters 
subjectmatters = ["basic", "python", "programming", "engineering", "mathematics", "logic", "hard", "html", "computers",
                  "design", "easy", "americanhistory", "history", "civilizations", "languagearts", "algebra",
                  "basicmath", "calculus", "nueralnets"]

#vectorize the subjects
vectorizer = CountVectorizer()
subjectmatters_vectorized = vectorizer.fit_transform(subjectmatters)
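# .tolist() turns the dense 2-D NumPy array into nested Python lists,
# so each row prints as [...] instead of array([...])
subjectmatters_vectorized_to_array = subjectmatters_vectorized.toarray().tolist()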

subjectvectordict = dict(zip(subjectmatters, subjectmatters_vectorized_to_array))
print(subjectvectordict)

Output:

{'basic': [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  
 'python': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  
 'programming': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0], 
 'engineering': [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'mathematics': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0], 
 'logic': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0], 
 'hard': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0], 
 'html': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0], 
 'computers': [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'design': [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'easy': [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'americanhistory': [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'history': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0], 
 'civilizations': [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'languagearts': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 
 'algebra': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'basicmath': [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'calculus': [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'nueralnets': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
 }
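
Note that array(...) is only how NumPy prints an array; the values inside are unchanged. If the goal is a matrix for equations rather than tidier printed output, stacking the arrays may be enough. Below is a minimal sketch of both options, assuming a small hypothetical dictionary named vectors that stands in for subjectvectordict (the two-entry data is made up for illustration):

```python
import numpy as np

# Hypothetical two-entry stand-in for subjectvectordict; values are NumPy arrays
vectors = {"basic": np.array([0, 1]), "python": np.array([1, 0])}

# Option 1: convert each array to a plain Python list (no array(...) when printed)
as_lists = {name: vec.tolist() for name, vec in vectors.items()}
print(as_lists)  # {'basic': [0, 1], 'python': [1, 0]}

# Option 2: stack the row vectors into a single 2-D matrix for numeric work
matrix = np.vstack(list(vectors.values()))
print(matrix.shape)     # (2, 2)
print(matrix.tolist())  # [[0, 1], [1, 0]]
```

Option 1 is mainly useful for display or serialization; Option 2 keeps the data as a NumPy array, which is usually what linear-algebra routines expect.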

Regarding "python - Remove the array([]) wrapper to create clean arrays for matrix equations", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64839307/
