python - sklearn auc score - diff metrics.roc_auc_score & model_selection.cross_val_score

Tags: python machine-learning scikit-learn auc

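New to sklearn, please be gentle. I am predicting customer churn and computing ROC AUC in three different ways, and I get 3 different scores. Scores 1 and 3 are close, but both differ significantly from score 2. I would appreciate guidance on why this difference arises and which score should be preferred. Many thanks!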

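import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, cross_val_score

# Note: iid= was removed in scikit-learn 0.24 and max_features='auto' in 1.3;
# drop iid and use max_features='sqrt' on newer versions.
param_grid = {'n_estimators': range(10, 510, 100)}
grid_search = GridSearchCV(
    estimator=RandomForestClassifier(criterion='gini', max_features='auto', random_state=20),
    param_grid=param_grid, scoring='roc_auc', n_jobs=4, iid=False, cv=5, verbose=0)
grid_search.fit(self.dataset_train, self.churn_train)

# SCORE1: the grid search is cloned and refit inside each of 5 folds of the test set
score_roc_auc = np.mean(cross_val_score(grid_search, self.dataset_test, self.churn_test, cv=5, scoring='roc_auc'))
# ^^^ SCORE1 - 0.6395751751133528

# SCORE2: AUC computed from hard class labels returned by predict()
pred = grid_search.predict(self.dataset_test)
score_roc_auc_2 = roc_auc_score(self.churn_test, pred)
# ^^^ SCORE2 - 0.5063261397640454

# SCORE3: mean cross-validated AUC on the training data for the best parameters
print("grid best score ", grid_search.best_score_)
# ^^^ SCORE3 - 0.6473102070034342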

Best answer

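I believe the linked question below answers this. It points to the folds used in GridSearchCV and to scoring on smaller splits as the source of the difference.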

Difference in ROC-AUC scores in sklearn RandomForestClassifier vs. auc methods
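
To make the three computations concrete, here is a minimal sketch on synthetic data. The dataset from make_classification, the small parameter grid, and all variable names are assumptions for illustration and are not the asker's data or setup. It also computes a fourth value, the held-out AUC from predict_proba, because the 'roc_auc' scorer ranks samples by a continuous score while SCORE2 in the question was computed from the hard labels returned by predict().

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic, imbalanced stand-in for the churn data (assumption, not the asker's dataset).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           weights=[0.8, 0.2], flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

grid_search = GridSearchCV(
    RandomForestClassifier(random_state=20),
    param_grid={'n_estimators': [10, 50]},  # hypothetical small grid to keep the run short
    scoring='roc_auc', cv=5)
grid_search.fit(X_train, y_train)

# Like SCORE3: mean cross-validated AUC over the training folds for the best parameters.
auc_cv_train = grid_search.best_score_

# Like SCORE2: AUC from hard 0/1 labels, so the ROC curve has a single operating point.
auc_from_labels = roc_auc_score(y_test, grid_search.predict(X_test))

# Held-out AUC from the predicted probability of the positive class.
auc_from_proba = roc_auc_score(y_test, grid_search.predict_proba(X_test)[:, 1])

# Like SCORE1: cross_val_score clones the grid search and refits it on 4/5 of the
# test set in each fold, so the model fitted on the training data is not what gets scored.
auc_cv_test = np.mean(
    cross_val_score(grid_search, X_test, y_test, cv=5, scoring='roc_auc'))

print("best_score_ (CV on train):      ", auc_cv_train)
print("roc_auc_score on predict():     ", auc_from_labels)
print("roc_auc_score on predict_proba: ", auc_from_proba)
print("cross_val_score on test set:    ", auc_cv_test)
```

In this sketch, auc_from_proba is the value that is comparable to best_score_, since both are computed from continuous scores; grid_search.score(X_test, y_test) should return the same number. The label-based value is typically lower because thresholding discards the ranking information that AUC measures. The cross_val_score value is an estimate for models trained on subsets of the test set, which is where the folds and smaller splits mentioned above come in.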

Original question on Stack Overflow: https://stackoverflow.com/questions/49544509/
