python - sklearn - How to retrieve PCA components and explained variance from inside a Pipeline passed to GridSearchCV

Tags: python scikit-learn pipeline grid-search

I am using GridSearchCV with a Pipeline, as follows:

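from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

grid = GridSearchCV(
    Pipeline([
        ('reduce_dim', PCA()),
        ('classify', RandomForestClassifier(n_jobs=-1))
    ]),
    param_grid=[
        {
            # range() does not accept floats, so the variance fractions are listed explicitly
            'reduce_dim__n_components': [0.7, 0.8],
            'classify__n_estimators': range(10, 50, 5),
            # 'sqrt' is what 'auto' meant for classifiers in older scikit-learn releases
            'classify__max_features': ['sqrt', 0.2],
            'classify__min_samples_leaf': [40, 50, 60],
            'classify__criterion': ['gini', 'entropy']
        }
    ],
    cv=5, scoring='f1')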

grid.fit(X,y)

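How can I now retrieve PCA details such as components and explained_variance from the grid.best_estimator_ model?

I also want to save best_estimator_ to a file with pickle and load it again later. How do I retrieve the PCA details from the loaded estimator? I suspect it works the same way as above.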

Best answer

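grid.best_estimator_ gives you the pipeline fitted with the best parameters.

From there, use the named_steps attribute to reach the estimators inside the pipeline.

So grid.best_estimator_.named_steps['reduce_dim'] gives you the fitted PCA object. You can then read the components_ and explained_variance_ attributes of that object, as follows: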

grid.best_estimator_.named_steps['reduce_dim'].components_
grid.best_estimator_.named_steps['reduce_dim'].explained_variance_
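
For an end-to-end picture, here is a minimal sketch of the same lookup, including the pickle round trip asked about in the question. It is only an illustration under stated assumptions: the data comes from make_classification as a stand-in for the asker's X and y, the parameter grid is cut down to a single parameter so it runs quickly, and the variable names pca, loaded and loaded_pca are made up for this example.

```python
import pickle

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Assumption: synthetic data stands in for the X, y of the question.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Assumption: a deliberately small grid so the sketch finishes quickly.
grid = GridSearchCV(
    Pipeline([
        ('reduce_dim', PCA()),
        ('classify', RandomForestClassifier(n_estimators=10, random_state=0)),
    ]),
    param_grid={'reduce_dim__n_components': [0.7, 0.8]},
    cv=3, scoring='f1')
grid.fit(X, y)

# The fitted PCA step of the best pipeline.
pca = grid.best_estimator_.named_steps['reduce_dim']
print(pca.components_.shape)          # (number of kept components, number of features)
print(pca.explained_variance_)        # variance captured by each kept component
print(pca.explained_variance_ratio_)  # the same, as a fraction of the total variance

# Round trip through pickle, then read the PCA step exactly as before.
loaded = pickle.loads(pickle.dumps(grid.best_estimator_))
loaded_pca = loaded.named_steps['reduce_dim']
same = np.array_equal(loaded_pca.components_, pca.components_)
print(same)
```

Writing to a file with pickle.dump and reading it back with pickle.load should behave the same way as the in-memory dumps/loads used here; the loaded object is still a Pipeline, so named_steps works on it unchanged. Note that best_estimator_ is only available when GridSearchCV refits the best model, which is its default behaviour.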

Source: Stack Overflow, https://stackoverflow.com/questions/46800147/
