python - Why do I get an error when using the SVM predict() function?

Tags: python numpy scikit-learn svm

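I built a simple SVM model to make predictions.

The score is computed correctly, but when I pass the values I want to predict, model.predict() raises an error. I can't figure out what is wrong; I searched for an explanation but found nothing relevant.
The code and the error I receive are as follows: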

import pandas as pd
import pylab as pl
import numpy as np
import scipy.optimize as opt
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
%matplotlib inline 
import matplotlib.pyplot as plt

data = pd.read_csv(r'C:\Users\Imad\Desktop\New folder\cars.csv')

from sklearn.preprocessing import LabelEncoder

data.columns

Index(['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'car'], dtype='object')

buying_1=LabelEncoder()
maint_1=LabelEncoder()
doors_1=LabelEncoder()
persons_1=LabelEncoder()
lug_boot_1=LabelEncoder()
safety_1=LabelEncoder()
car_1=LabelEncoder()

data['buying_n'] = buying_1.fit_transform(data['buying'])
data['maint_n'] = maint_1.fit_transform(data['maint'])
data['door_n'] = doors_1.fit_transform(data['doors'])
data['persons_n'] = persons_1.fit_transform(data['persons'])
data['lug_boot_n'] = lug_boot_1.fit_transform(data['lug_boot'])
data['safety_n'] = safety_1.fit_transform(data['safety'])
data['car_n'] = car_1.fit_transform(data['car'])

inputs = data.drop(['buying', 'maint', 'doors', 'persons', 'lug_boot',
                    'safety', 'car'], axis='columns')
target = data['buying_n']

X = np.asarray(inputs)

y = np.asarray(target)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=20)
print ('Train set:', X_train.shape,  y_train.shape)
print ('Test set:', X_test.shape,  y_test.shape)
Train set: (1382, 7) (1382,)
Test set: (346, 7) (346,)


from sklearn.svm import SVC
model = SVC(C = 2, gamma=3, random_state=5)
model.fit(X_train, y_train)


model.score(X_test,y_test)
0.9884393063583815

model.predict([[3,3,2,2,1,2]])
ValueError                                Traceback (most recent call last)
<ipython-input-122-6773f55c74b9> in <module>
----> 1 model.predict([[3,3,2,2,1,2]])

~\Anaconda3\lib\site-packages\sklearn\svm\base.py in predict(self, X)
565             Class labels for samples in X.
566         """
--> 567         y = super(BaseSVC, self).predict(X)
568         return self.classes_.take(np.asarray(y, dtype=np.intp))
569 

~\Anaconda3\lib\site-packages\sklearn\svm\base.py in predict(self, X)
323         y_pred : array, shape (n_samples,)
324         """
--> 325         X = self._validate_for_predict(X)
326         predict = self._sparse_predict if self._sparse else self._dense_predict
327         return predict(X)

~\Anaconda3\lib\site-packages\sklearn\svm\base.py in _validate_for_predict(self, X)
476             raise ValueError("X.shape[1] = %d should be equal to %d, "
477                              "the number of features at training time" %
--> 478                              (n_features, self.shape_fit_[1]))
479         return X
480 

ValueError: X.shape[1] = 6 should be equal to 7, the number of features at training time

Best answer

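The problem is that you are passing the target variable to the model together with the inputs: 'buying_n' is never dropped from inputs, so the model is trained on 7 features while predict() receives only 6.

Correct the following lines: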

target = data['buying_n']
inputs = data.drop(['buying', 'maint', 'doors', 'persons', 'lug_boot',
                    'safety', 'car', 'buying_n'], axis='columns')

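After this change, inputs has 6 features (maint_n, door_n, persons_n, lug_boot_n, safety_n, car_n), which matches the six values passed to model.predict().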
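
One way to confirm the feature count is to list the columns that remain after the drop. The snippet below is only a sketch: the two-row DataFrame is a hypothetical stand-in for cars.csv, and pandas category codes are used in place of LabelEncoder so that it runs without scikit-learn.

```python
import pandas as pd

# Hypothetical two-row stand-in for cars.csv (the real values are not shown in the question).
data = pd.DataFrame({
    "buying": ["vhigh", "low"],
    "maint": ["high", "med"],
    "doors": ["2", "4"],
    "persons": ["2", "more"],
    "lug_boot": ["small", "big"],
    "safety": ["low", "high"],
    "car": ["unacc", "acc"],
})

raw_cols = list(data.columns)

# Integer-encode each column; category codes stand in for LabelEncoder here.
for col in raw_cols:
    encoded_name = "door_n" if col == "doors" else col + "_n"
    data[encoded_name] = data[col].astype("category").cat.codes

# Drop as written in the question: the target 'buying_n' stays among the inputs.
original_inputs = data.drop(raw_cols, axis="columns")

# Drop as written in the answer: the target is removed as well.
fixed_inputs = data.drop(raw_cols + ["buying_n"], axis="columns")

print(len(original_inputs.columns), list(original_inputs.columns))
print(len(fixed_inputs.columns), list(fixed_inputs.columns))
```

The first line printed lists seven columns including buying_n; the second lists the six that the model should actually be trained on.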

Note: do not change the order of these two lines.
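
The same mismatch can be reproduced and guarded against in a few lines. This is a minimal sketch, assuming randomly generated integer features and labels in place of the car data (so the predictions themselves mean nothing); only the SVC parameters are taken from the question.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in for the encoded car data: 200 rows, 6 integer-coded features.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(200, 6))
y = rng.integers(0, 4, size=200)  # random labels, for illustration only

model = SVC(C=2, gamma=3, random_state=5)
model.fit(X, y)

# Check the row width against the training data before calling predict().
n_expected = X.shape[1]
sample = np.asarray([[3, 3, 2, 2, 1, 2]])
if sample.shape[1] != n_expected:
    raise ValueError(f"expected {n_expected} features, got {sample.shape[1]}")
pred = model.predict(sample)
print(pred.shape)

# A row with one extra value triggers the kind of ValueError seen in the question.
mismatch_raised = False
try:
    model.predict([[3, 3, 2, 2, 1, 2, 0]])
except ValueError as exc:
    mismatch_raised = True
    print("predict() rejected the row:", exc)
```

Checking `sample.shape[1]` against the training matrix before predicting gives a clearer message than the library traceback, though the exact wording of scikit-learn's own error differs between versions.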

A similar question about "python - Why do I get an error when using the SVM predict() function?" can be found on Stack Overflow: https://stackoverflow.com/questions/55927352/
