python - Expected 2D array, got 1D array instead: reshape your data

Tags: python python-3.x numpy machine-learning sklearn-pandas

I'm really stuck on this problem. After using LabelEncoder, I tried to use OneHotEncoder to encode the data into a matrix, but I get this error: Expected 2D array, got 1D array instead.

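At the end of the error message (included below) it says to "reshape your data". I thought I had done that, but it still doesn't work. If I understand reshape correctly, is it what you use when you actually want to reshape some data into a different matrix size? For example, if I wanted to change a 3 x 2 matrix into a 4 x 6 one?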
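For reference, here is a minimal NumPy sketch of what reshape does. The array names (a, b, col, row) are made up for illustration. reshape only rearranges the existing elements, so the total element count has to stay the same; a 3 x 2 matrix (6 elements) cannot become a 4 x 6 matrix (24 elements).

```python
import numpy as np

a = np.arange(6).reshape(3, 2)   # 3 x 2 matrix, 6 elements

# Same 6 elements in a different layout: allowed.
b = a.reshape(2, 3)

# -1 asks NumPy to infer that dimension from the element count.
col = a.reshape(-1, 1)   # shape (6, 1): six samples, one feature
row = a.reshape(1, -1)   # shape (1, 6): one sample, six features

print(b.shape, col.shape, row.shape)

# A 4 x 6 matrix needs 24 elements, so this raises ValueError.
try:
    a.reshape(4, 6)
except ValueError as exc:
    print("reshape failed:", exc)
```

The two forms with -1 are the ones the scikit-learn error message suggests: (-1, 1) for a single feature and (1, -1) for a single sample.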

My code fails on these two lines:

X = X.reshape(-1, 1) # I added this after I saw the error
X[:, 0] = onehotencoder1.fit_transform(X[:, 0]).toarray()
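A small sketch of what these two lines do to the shapes may help. It uses a hypothetical 10 x 5 array (X_demo) in place of the real data, which is not shown in the question. Reshaping the whole table to (-1, 1) stacks every value into one long column, and selecting X[:, 0] from that column gives a 1D array again, which appears consistent with the 50 values printed in the error message.

```python
import numpy as np

# Hypothetical stand-in for the real data: 10 rows, 5 columns.
X_demo = np.arange(50, dtype=float).reshape(10, 5)

flat = X_demo.reshape(-1, 1)    # every value stacked into one column
print(flat.shape)               # (50, 1)
print(flat[:, 0].shape)         # (50,) -> 1D again

# Reshaping only the selected column leaves the table itself intact.
col_a = X_demo[:, 0].reshape(-1, 1)
col_b = X_demo[:, [0]]          # indexing with a list keeps the 2D shape
print(col_a.shape, col_b.shape) # (10, 1) (10, 1)
```

Either col_a or col_b has the (n_samples, 1) shape that fit_transform expects for a single feature.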

Here is my code so far:

# Data Preprocessing

# Import Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Import Dataset
dataset = pd.read_csv('Data2.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 5].values
df_X = pd.DataFrame(X)
df_y = pd.DataFrame(y)

# Replace Missing Values
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
imputer = imputer.fit(X[:, 3:5 ])
X[:, 3:5] = imputer.transform(X[:, 3:5])


# Encoding Categorical Data "Name"
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_x = LabelEncoder()
X[:, 0] = labelencoder_x.fit_transform(X[:, 0])

# Transform into a Matrix

onehotencoder1 = OneHotEncoder(categorical_features = [0])
X = X.reshape(-1, 1)
X[:, 0] = onehotencoder1.fit_transform(X[:, 0]).toarray()


# Encoding Categorical Data "University"
from sklearn.preprocessing import LabelEncoder
labelencoder_x1 = LabelEncoder()
X[:, 1] = labelencoder_x1.fit_transform(X[:, 1])

Here is the full error message:

 File "/Users/jim/anaconda3/lib/python3.6/site-packages/sklearn/preprocessing/data.py", line 1809, in _transform_selected
    X = check_array(X, accept_sparse='csc', copy=copy, dtype=FLOAT_DTYPES)

  File "/Users/jim/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 441, in check_array
    "if it contains a single sample.".format(array))

ValueError: Expected 2D array, got 1D array instead:
array=[  2.00000000e+00   7.00000000e+00   3.20000000e+00   2.70000000e+01
   2.30000000e+03   1.00000000e+00   6.00000000e+00   3.90000000e+00
   2.80000000e+01   2.90000000e+03   3.00000000e+00   4.00000000e+00
   4.00000000e+00   3.00000000e+01   2.76700000e+03   2.00000000e+00
   8.00000000e+00   3.20000000e+00   2.70000000e+01   2.30000000e+03
   3.00000000e+00   0.00000000e+00   4.00000000e+00   3.00000000e+01
   2.48522222e+03   5.00000000e+00   9.00000000e+00   3.50000000e+00
   2.50000000e+01   2.50000000e+03   5.00000000e+00   1.00000000e+00
   3.50000000e+00   2.50000000e+01   2.50000000e+03   0.00000000e+00
   2.00000000e+00   3.00000000e+00   2.90000000e+01   2.40000000e+03
   4.00000000e+00   3.00000000e+00   3.70000000e+00   2.77777778e+01
   2.30000000e+03   0.00000000e+00   5.00000000e+00   3.00000000e+00
   2.90000000e+01   2.40000000e+03].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Any help would be great.

Best answer

Try changing your code to this:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Import Dataset
dataset = pd.read_csv('Data2.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 5].values
df_X = pd.DataFrame(X)
df_y = pd.DataFrame(y)

# Replace Missing Values
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values = 'NaN', strategy = 'mean', axis = 0)
imputer = imputer.fit(X[:, 3:5 ])
X[:, 3:5] = imputer.transform(X[:, 3:5])


# Encoding Categorical Data "Name"
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_x = LabelEncoder()
X[:, 0] = labelencoder_x.fit_transform(X[:, 0])

# Transform into a Matrix

onehotencoder1 = OneHotEncoder(categorical_features = [0])
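# fit_transform needs 2D input, so reshape the selected column to (n_samples, 1)
res_0 = onehotencoder1.fit_transform(X[:, 0].reshape(-1, 1)).toarray()  # <=== Change
# res_0 has one column per category, so keep it as a separate matrix
# rather than writing it back into the single column X[:, 0]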

# Encoding Categorical Data "University"
from sklearn.preprocessing import LabelEncoder
labelencoder_x1 = LabelEncoder()
X[:, 1] = labelencoder_x1.fit_transform(X[:, 1])

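If you get an error at labelencoder_x1.fit_transform(X[:, 1]), try labelencoder_x1.fit_transform(X[:, 1].reshape(-1, 1)) instead. Note that LabelEncoder itself expects a 1D array, so this reshape should normally not be needed.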
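In recent scikit-learn releases (0.22 and later), Imputer and the categorical_features argument no longer exist, so the code above only runs on older versions. Below is a rough sketch of the same preprocessing with the current API. The DataFrame and its column names are invented for illustration, since the contents of Data2.csv are not shown; treat it as one possible approach rather than the definitive one.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for Data2.csv; the column names are invented.
df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy", "Bob"],
    "University": ["MIT", "UCLA", "MIT", "NYU"],
    "GPA": [3.2, np.nan, 4.0, 3.5],
    "Age": [27.0, 28.0, np.nan, 25.0],
})

# Fill missing numeric values with the column mean.
num_cols = ["GPA", "Age"]
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
numeric = imputer.fit_transform(df[num_cols])                # shape (4, 2)

# df[["Name"]] (double brackets) is already 2D, so no reshape is needed,
# and OneHotEncoder accepts string categories without a LabelEncoder step.
encoder = OneHotEncoder(handle_unknown="ignore")
name_onehot = encoder.fit_transform(df[["Name"]]).toarray()  # shape (4, 3)

# The one-hot output has one column per category, so stack it next to
# the other features instead of assigning it back into one column.
features = np.hstack([name_onehot, numeric])
print(features.shape)   # (4, 5)
```

Selecting with a list of column names (df[["Name"]]) plays the same role as reshape(-1, 1) on a NumPy column: it hands the encoder a 2D input with a single feature.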

This question about "python - Expected 2D array, got 1D array instead: reshape your data" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/47965149/
