python - Computing evaluation metrics with sklearn's cross_val_predict

Tags: python scikit-learn cross-validation

The sklearn.model_selection.cross_val_predict documentation page states:

Generate cross-validated estimates for each input data point. It is not appropriate to pass these predictions into an evaluation metric.

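Can anyone explain what this means? If it gives a predicted y for every true y, why can't I use these results to compute metrics such as RMSE or the coefficient of determination?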

Best answer

It appears to come down to how the samples are grouped and predicted. From the user guide linked in the cross_val_predict documentation:

Warning Note on inappropriate usage of cross_val_predict

The result of cross_val_predict may be different from those obtained using cross_val_score as the elements are grouped in different ways. The function cross_val_score takes an average over cross-validation folds, whereas cross_val_predict simply returns the labels (or probabilities) from several distinct models undistinguished. Thus, cross_val_predict is not an appropriate measure of generalisation error.

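cross_val_score computes the metric on each fold and then averages across all folds, whereas cross_val_predict returns the predictions of the distinct per-fold models pooled together. A metric computed on the pooled predictions is therefore not the same as the fold average and is not necessarily a good estimate of generalization error. For example, using the sample code from the sklearn page: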

import numpy as np  # needed for np.mean below
from sklearn import datasets, linear_model
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.metrics import mean_squared_error, make_scorer

diabetes = datasets.load_diabetes()
X = diabetes.data[:200]
y = diabetes.target[:200]
lasso = linear_model.Lasso()

# Out-of-fold predictions from the three per-fold models, pooled together
y_pred = cross_val_predict(lasso, X, y, cv=3)

# MSE computed once over all pooled predictions
print("Cross Val Prediction score:{}".format(mean_squared_error(y, y_pred)))

# MSE computed per fold, then averaged across folds
print("Cross Val Score:{}".format(np.mean(cross_val_score(lasso, X, y, cv=3, scoring=make_scorer(mean_squared_error)))))

Cross Val Prediction score:3993.771257795029
Cross Val Score:3997.1789145156217
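
To make the small gap between the two numbers concrete, here is a minimal sketch (not part of the original answer). It assumes that cv=3 resolves to an unshuffled KFold(n_splits=3) for a regressor, and the variable names (pooled_mse, fold_mse, fold_sizes) are illustrative. For MSE, the pooled value is the per-fold MSE weighted by fold size, while cross_val_score followed by a plain mean weights every fold equally. For non-additive metrics such as R^2 or RMSE, the two can differ for more than just weighting reasons.

```python
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict, cross_val_score

diabetes = datasets.load_diabetes()
X, y = diabetes.data[:200], diabetes.target[:200]
lasso = linear_model.Lasso()

# Assumption: an explicit, unshuffled 3-fold splitter, shared by both calls
cv = KFold(n_splits=3)

# Pooled: one MSE over all out-of-fold predictions
y_pred = cross_val_predict(lasso, X, y, cv=cv)
pooled_mse = mean_squared_error(y, y_pred)

# Per fold: one MSE per test fold (the scorer returns negated MSE)
fold_mse = -cross_val_score(lasso, X, y, cv=cv, scoring="neg_mean_squared_error")
fold_sizes = np.array([len(test_idx) for _, test_idx in cv.split(X)])

unweighted_mean = fold_mse.mean()                        # what np.mean(cross_val_score(...)) gives
weighted_mean = np.average(fold_mse, weights=fold_sizes)  # reproduces the pooled MSE

print("fold sizes:        ", fold_sizes)
print("pooled MSE:        ", pooled_mse)
print("size-weighted mean:", weighted_mean)
print("unweighted mean:   ", unweighted_mean)

# R^2 is not additive across folds, so pooled and per-fold averages can differ
print("pooled R^2:        ", r2_score(y, y_pred))
print("mean per-fold R^2: ", cross_val_score(lasso, X, y, cv=cv, scoring="r2").mean())
```

With 200 samples and 3 folds the fold sizes are unequal, which is enough to separate the pooled MSE from the unweighted fold average.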

Source: a similar question on Stack Overflow, "python - Computing evaluation metrics with sklearn's cross_val_predict": https://stackoverflow.com/questions/53523887/
