machine-learning - Random forest classifier: feature importance of prediction probability

I am using sklearn's RandomForestClassifier (RFC).

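from sklearn.ensemble import RandomForestClassifier
forest = RandomForestClassifier()  # hyperparameters are not given in the question
forest.fit(training_data, y_train)
probas_test = forest.predict_proba(test_data)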

I would like to know whether there is a way to find how much each feature contributed to a given prediction.

Something like the following, but at the level of a single data point:

   forest.feature_importances_

Best answer

This can be approached in several ways; see http://blog.datadive.net/interpreting-random-forests/ (and the accompanying Python package: https://github.com/andosa/treeinterpreter ). There are also some less direct options, such as:

https://arxiv.org/abs/1606.05390 (implementation: https://github.com/sato9hara/defragTrees)
https://arxiv.org/abs/1611.05722 (implementation: https://github.com/IBCNServices/GENESIM)
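
The first link above describes decomposing a prediction into a bias term (the class distribution at the root of each tree) plus one contribution per feature, accumulated along the decision path. The following is a minimal sketch of that idea using only scikit-learn and NumPy; it is not the treeinterpreter package's code. The helper names `tree_contributions` and `forest_contributions` and the synthetic data are illustrative assumptions, and the sketch assumes a single-output classifier with no missing values in the input.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def tree_contributions(tree, x):
    """Split one tree's class probabilities for sample x into bias + per-feature terms.

    Hypothetical helper, not part of scikit-learn.
    """
    t = tree.tree_
    # Class distribution at every node, normalized so each row sums to 1
    # (works whether the tree stores weighted counts or fractions).
    values = t.value[:, 0, :]
    values = values / values.sum(axis=1, keepdims=True)
    contributions = np.zeros((x.shape[0], values.shape[1]))
    node = 0
    while t.children_left[node] != -1:  # -1 marks a leaf
        feature = t.feature[node]
        if x[feature] <= t.threshold[node]:
            child = t.children_left[node]
        else:
            child = t.children_right[node]
        # Credit the change in class probabilities to the feature split on.
        contributions[feature] += values[child] - values[node]
        node = child
    return values[0], contributions


def forest_contributions(forest, x):
    """Average the per-tree decomposition over all trees (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float32)  # scikit-learn trees operate on float32 inputs
    parts = [tree_contributions(est, x) for est in forest.estimators_]
    bias = np.mean([b for b, _ in parts], axis=0)
    contributions = np.mean([c for _, c in parts], axis=0)
    return bias, contributions


# Synthetic data standing in for training_data / y_train from the question.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

bias, contributions = forest_contributions(forest, X[0])
reconstructed = bias + contributions.sum(axis=0)
print(contributions)   # one row per feature, one column per class
print(reconstructed)   # should agree with forest.predict_proba(X[:1])[0]
```

Each row of `contributions` shows how much that feature moved the predicted class probabilities for this one sample, which is the per-data-point analogue of `forest.feature_importances_` asked about in the question.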

The original question is on Stack Overflow: https://stackoverflow.com/questions/41060913/
