python - Finding the values of C and gamma to optimize SVM

Tags: python machine-learning scikit-learn svm hyperparameters

I applied an SVM (scikit-learn) to some datasets and want to find the values of C and gamma that give the best accuracy on the test set.

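I first fixed C to some integer and then iterated over many values of gamma until I found the gamma that gave the best test-set accuracy for that C. Then I fixed that gamma, iterated over values of C to find the C that gave the best accuracy, and so on...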

But the steps above never yield the combination of gamma and C that produces the best test-set accuracy.

Can anyone help me find a way to get this combination (gamma, C) in scikit-learn?

Best answer

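You are looking for hyperparameter tuning. In parameter tuning, we pass a dictionary that maps each of the classifier's parameters to a list of possible values; then, depending on the method you choose (e.g. GridSearchCV or RandomizedSearchCV), the best-performing parameters are returned. You can read more about this in the scikit-learn documentation.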

For example:

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Create a dictionary of possible parameters
# (gamma is ignored when kernel='linear')
params_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100],
               'gamma': [0.0001, 0.001, 0.01, 0.1],
               'kernel': ['linear', 'rbf']}

# Create the GridSearchCV object
grid_clf = GridSearchCV(SVC(class_weight='balanced'), params_grid)

# Run the search; X_train and y_train are your training data
grid_clf.fit(X_train, y_train)

# Print the best estimator with its parameters
print(grid_clf.best_estimator_)
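
For a version that runs end to end, here is a minimal sketch under a few assumptions that are not in the original answer: the iris dataset stands in for your data, the kernel is fixed to RBF, features are standardized in a pipeline, and 5-fold cross-validation is used. The search picks C and gamma by cross-validation on the training split only, and the held-out test split is scored once at the end.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: iris is only a placeholder dataset for illustration.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling inside the pipeline is refit on each CV training fold.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# make_pipeline names the SVC step "svc", hence the "svc__" prefix.
param_grid = {
    "svc__C": [0.01, 0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1],
}

# Every (C, gamma) pair is evaluated jointly with 5-fold CV.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
cv_accuracy = search.best_score_
# score() uses the best pipeline, refit on the whole training split.
test_accuracy = search.score(X_test, y_test)

print("Best (C, gamma):", best_params)
print("Mean CV accuracy: %.3f" % cv_accuracy)
print("Held-out test accuracy: %.3f" % test_accuracy)
```

Because every pair in the grid is tried together, this avoids the alternating one-parameter-at-a-time search described in the question, which can settle on a pair that is not the best one in the grid.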

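You can read more about GridSearchCV and RandomizedSearchCV in the scikit-learn documentation. Be aware, though, that SVMs are CPU-intensive, so be careful with the number of parameter values you pass. Depending on your data and the number of values, the search may take some time to run.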
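
To put a rough number on that cost, and to illustrate the randomized alternative, here is a small sketch. It assumes a synthetic dataset from make_classification, 5-fold cross-validation, an RBF kernel, and log-uniform sampling ranges chosen only for illustration. ParameterGrid counts how many candidates the grid from the example expands to, while RandomizedSearchCV evaluates a fixed number of sampled (C, gamma) pairs instead.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import ParameterGrid, RandomizedSearchCV
from sklearn.svm import SVC

# The grid from the example: 6 * 4 * 2 candidates.
params_grid = {
    "C": [0.001, 0.01, 0.1, 1, 10, 100],
    "gamma": [0.0001, 0.001, 0.01, 0.1],
    "kernel": ["linear", "rbf"],
}
n_candidates = len(ParameterGrid(params_grid))
n_folds = 5  # assumption: 5-fold cross-validation
print(n_candidates, "candidates ->", n_candidates * n_folds, "fits")

# Assumption: synthetic data, used only so the sketch runs on its own.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Sample C and gamma from log-uniform ranges instead of a fixed grid.
param_distributions = {
    "C": loguniform(1e-3, 1e2),
    "gamma": loguniform(1e-4, 1e-1),
}

random_search = RandomizedSearchCV(
    SVC(kernel="rbf"),
    param_distributions,
    n_iter=20,       # fixed budget: 20 sampled candidates
    cv=n_folds,
    n_jobs=-1,       # use all available CPU cores
    random_state=0,
)
random_search.fit(X, y)

print("Best sampled (C, gamma):", random_search.best_params_)
print("Mean CV accuracy: %.3f" % random_search.best_score_)
```

The grid has 48 candidates, so a 5-fold search trains 240 models plus one final refit. The randomized search caps the work at n_iter candidates however wide the ranges are.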

This link also contains an example.

For "python - Finding the values of C and gamma to optimize SVM", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46330329/
