python - I am trying to understand the SHAP values for the prediction model below. What do the values and the explainer's o/p mean?

Tags: python machine-learning random-forest shap predictive

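from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
# X and Y are the feature matrix and target labels prepared earlier
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=42)
rf_model = RandomForestClassifier()
rf_model.fit(x_train, y_train)
rf_pred = rf_model.predict(x_test)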


import shap
rf_explainer = shap.TreeExplainer(rf_model, x_train)

rf_vals = rf_explainer.shap_values(x_train)

o/p:100%|====================| 4778/4792 [03:26<00:00]

rf_explainer.expected_value

o/p: array([0.5763, 0.4237])

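(I already have the summary plot and understand what each feature contributes to the model.) Please explain what these numbers in the output mean: 4778/4792 and array([0.5763, 0.4237]).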

Best answer

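rf_explainer.expected_value holds the so-called "base values": the model's "expected" output over the whole background dataset (here x_train), one value per class. In other words, it is what the model would predict knowing nothing about a particular data point. These values are close to the class frequencies, but not exactly equal to them.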
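To make the "close to, but not exactly equal to, class frequencies" point concrete, here is a minimal sketch that compares the mean predicted probability per class with the observed class frequencies. The synthetic data from make_classification and the n_estimators/max_depth settings are assumptions for illustration only; they stand in for the question's X and Y, which are not shown. With background data supplied, the explainer's expected_value should be approximately this mean model output.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the question's X, Y (assumption: binary target).
X, Y = make_classification(n_samples=500, n_features=6, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.3, random_state=42
)

# Shallow trees (an illustrative choice) keep predictions away from pure 0/1.
rf_model = RandomForestClassifier(n_estimators=50, max_depth=3, random_state=0)
rf_model.fit(x_train, y_train)

# Mean predicted probability per class over the background data (x_train here).
base_values = rf_model.predict_proba(x_train).mean(axis=0)

# Observed class frequencies in the same data.
class_freq = np.bincount(y_train) / len(y_train)

print("mean predicted probability per class:", base_values)
print("class frequencies:                   ", class_freq)
```

The two printed arrays each have one entry per class and each sum to 1, which matches the two-element array in the question.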

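When explaining a model's prediction:

You start from the base values, which are the same for every data point (given the background dataset provided).
On top of them you add the SHAP values to arrive at the actual model prediction. Each SHAP value shows how much a particular feature contributed to the prediction of interest.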
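The "base value plus SHAP values equals the prediction" relationship can be checked by hand on a tiny example. The sketch below computes exact Shapley values by brute force over all feature subsets, filling in "unknown" features from a background sample. It is an illustration under stated assumptions, not what shap.TreeExplainer does internally (that uses a tree-specific algorithm); the names model, coalition_value and exact_shap are hypothetical, and the toy model is a made-up scoring function rather than a random forest.

```python
import itertools
import math
import numpy as np

def model(X):
    """Hypothetical stand-in model: one probability-like score per row."""
    z = 1.5 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2]
    return 1.0 / (1.0 + np.exp(-z))

def coalition_value(x, background, subset):
    """Mean model output when the features in `subset` are fixed to x's
    values and all other features come from the background rows."""
    mixed = background.copy()
    cols = list(subset)
    if cols:
        mixed[:, cols] = x[cols]
    return model(mixed).mean()

def exact_shap(x, background):
    """Brute-force Shapley values; only feasible for a handful of features."""
    n = x.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                gain = (coalition_value(x, background, subset + (i,))
                        - coalition_value(x, background, subset))
                phi[i] += weight * gain
    return phi

rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))   # plays the role of x_train
x = np.array([1.0, -0.5, 2.0])           # the data point being explained

base_value = model(background).mean()    # same for every explained point
phi = exact_shap(x, background)          # one contribution per feature
prediction = model(x[None, :])[0]

print("base value:      ", base_value)
print("SHAP values:     ", phi)
print("base + sum(SHAP):", base_value + phi.sum())
print("model prediction:", prediction)
```

The last two printed numbers agree up to floating-point error, which is the additivity property the answer describes.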

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/70330847/
