python - How to save a scikit-learn pipeline with a Keras regressor to disk?

Tags: python machine-learning scikit-learn keras joblib

I have a scikit-learn pipeline that contains a KerasRegressor:

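from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from keras.wrappers.scikit_learn import KerasRegressor

# baseline_model (defined elsewhere) builds and returns a compiled Keras model.
# Note: nb_epoch is the Keras 1 name; Keras 2 renamed it to epochs.
estimators = [
    ('standardize', StandardScaler()),
    ('mlp', KerasRegressor(build_fn=baseline_model, nb_epoch=5, batch_size=1000, verbose=1)),
]
pipeline = Pipeline(estimators)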

After training the pipeline, I try to save it to disk with joblib:

joblib.dump(pipeline, filename, compress=9)

But I get an error:

RuntimeError: maximum recursion depth exceeded

How can I save the pipeline to disk?

Best answer

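I ran into the same problem, since there is no direct way to do this. Here is a hack that worked for me: I save the pipeline to two files. The first file holds the pickled scikit-learn pipeline, and the second stores the Keras model: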

...
from keras.models import load_model
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

...

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('estimator', KerasRegressor(build_model))
])

pipeline.fit(X_train, y_train)

# Save the Keras model first:
pipeline.named_steps['estimator'].model.save('keras_model.h5')

# This hack allows us to save the sklearn pipeline:
pipeline.named_steps['estimator'].model = None

# Finally, save the pipeline:
joblib.dump(pipeline, 'sklearn_pipeline.pkl')

del pipeline

And here is how to load the model back:

# Load the pipeline first:
pipeline = joblib.load('sklearn_pipeline.pkl')

# Then, load the Keras model:
pipeline.named_steps['estimator'].model = load_model('keras_model.h5')

y_pred = pipeline.predict(X_test)
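
The save and load steps above can be wrapped in two small helpers. This is only a sketch and not part of the original answer: the names save_pipeline, load_pipeline and model_loader are hypothetical, and the standard-library pickle module stands in for joblib.dump / joblib.load so that the snippet has no third-party dependencies. Unlike the code above, it reattaches the model after dumping, so the in-memory pipeline stays usable.

```python
import pickle


def save_pipeline(pipeline, step_name, pipeline_path, model_path):
    """Save the wrapped model and the rest of the pipeline to two files."""
    wrapper = pipeline.named_steps[step_name]
    model = wrapper.model
    # The model writes itself with its own serializer (for Keras: model.save).
    model.save(model_path)
    # Detach the model so that the remaining pipeline can be pickled.
    wrapper.model = None
    try:
        with open(pipeline_path, "wb") as fh:
            pickle.dump(pipeline, fh)
    finally:
        # Reattach the model so the caller's pipeline still works.
        wrapper.model = model


def load_pipeline(step_name, pipeline_path, model_path, model_loader):
    """Rebuild the pipeline from the two files.

    model_loader is any callable that takes a path and returns a model,
    for example keras.models.load_model.
    """
    with open(pipeline_path, "rb") as fh:
        pipeline = pickle.load(fh)
    pipeline.named_steps[step_name].model = model_loader(model_path)
    return pipeline
```

With the pipeline from the answer, the calls would look like save_pipeline(pipeline, 'estimator', 'sklearn_pipeline.pkl', 'keras_model.h5') and load_pipeline('estimator', 'sklearn_pipeline.pkl', 'keras_model.h5', load_model). Passing the loader as an argument keeps the helpers independent of any particular Keras version.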

For more on "python - How to save a scikit-learn pipeline with a Keras regressor to disk?", see the similar question on Stack Overflow: https://stackoverflow.com/questions/37984304/
