python - Set parameters for a classifier and use it without fitting

Tags: python machine-learning scikit-learn classification

I'm using Python and scikit-learn to do some classification.

Is it possible to reuse the parameters learned by a classifier?

For example:

from sklearn.svm import SVC

cl = SVC(...)    # create svm classifier with some hyperparameters
cl.fit(X_train, y_train)
params = cl.get_params()

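Suppose we store this params somewhere as a dictionary of strings, or even write it to a JSON file. Later, we want to use this trained classifier to make predictions on some data. Trying to restore it: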

params = ...  # retrieve the parameters stored somewhere as a dictionary
data = ...    # the data we want to make predictions on
cl = SVC(...)
cl.set_params(**params)
predictions = cl.predict(data)

If I do this, I get a NotFittedError with the following stack trace:

File "C:\Users\viacheslav\Python\Python36-32\lib\site-packages\sklearn\svm\base.py", line 548, in predict
    y = super(BaseSVC, self).predict(X)
  File "C:\Users\viacheslav\Python\Python36-32\lib\site-packages\sklearn\svm\base.py", line 308, in predict
    X = self._validate_for_predict(X)
  File "C:\Users\viacheslav\Python\Python36-32\lib\site-packages\sklearn\svm\base.py", line 437, in _validate_for_predict
    check_is_fitted(self, 'support_')
  File "C:\Users\viacheslav\Python\Python36-32\lib\site-packages\sklearn\utils\validation.py", line 768, in check_is_fitted
    raise NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This SVC instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

Is it possible to set the parameters of a classifier and make predictions without fitting? How can I do that?

Best answer

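Read the section on model persistence in the scikit-learn documentation. get_params() returns only the hyperparameters passed to the constructor, not what the model learned during fit (for SVC, attributes such as support_), so setting them on a new instance does not make it fitted. Persist the whole fitted estimator instead: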

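import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23; use the standalone joblib package
joblib.dump(clf, 'filename.pkl')  # clf is the already fitted classifier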

And later:

clf = joblib.load('filename.pkl')
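
As a rough illustration of the round trip described above, here is a minimal sketch. The toy dataset from make_classification, the hyperparameter values, and the file name svc_model.joblib are assumptions made for the example and do not come from the question. It checks that get_params() carries no learned state, then saves the fitted SVC with joblib and predicts with the reloaded copy without calling fit again.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy data standing in for X_train / y_train (assumption, not from the question).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, y_train, X_new = X[:150], y[:150], X[150:]

# Hyperparameter values here are arbitrary placeholders.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)

# get_params() holds constructor arguments only; fitted attributes such as
# support_ are stored on the estimator itself, not in this dictionary.
params = clf.get_params()
learned_state_in_params = "support_" in params

# Persist the whole fitted estimator (hypothetical file name in a temp dir).
model_path = os.path.join(tempfile.mkdtemp(), "svc_model.joblib")
joblib.dump(clf, model_path)

# Later, possibly in another process: load and predict without refitting.
restored = joblib.load(model_path)
restored_predictions = restored.predict(X_new)
same_as_original = np.array_equal(restored_predictions, clf.predict(X_new))
```

A file written this way is generally only safe to load with the same scikit-learn version that produced it, so it may be worth recording the version alongside the file.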

Source: the Stack Overflow question "Set parameters for a classifier and use it without fitting": https://stackoverflow.com/questions/48252006/
