python-3.x - How to make GridSearchCV with a proper FunctionTransformer in a pipeline?

Tags: python-3.x machine-learning scikit-learn deep-learning

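I am trying to build a pipeline, tuned with GridSearchCV, that filters the data (using an iForest) and then performs a regression with StandardScaler + MLPRegressor.

I wrote a FunctionTransformer to include my iForest filter in the pipeline. I also defined a parameter grid for the iForest filter (using the kw_args argument).

Everything looks fine, but when I call fit, nothing happens... No error message. Nothing.

Afterwards, when I try to make a prediction, I get the message: "This RandomizedSearchCV instance is not fitted yet"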

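# The scipy aliases below are assumed; the original post does not show these imports
from scipy.stats import randint as sp_randint, uniform as sp_rand
from sklearn.ensemble import IsolationForest
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler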

#Definition of the function auto_filter using the iForest algo
def auto_filter(DF, conta=0.1):
    #iForest made on the DF dataframe
    iforest = IsolationForest(behaviour='new', n_estimators=300, max_samples='auto', contamination=conta)
    iforest = iforest.fit(DF)

    # The DF (dataframe in input) is filtered taking into account only the inlier observations

    data_filtered = DF[iforest.predict(DF) == 1]

    # Only few variables are kept for the next step (regression by MLPRegressor)
    # this function delivers X_filtered and y
    X_filtered = data_filtered[['SessionTotalTime','AverageHR','MaxHR','MinHR','EETotal','EECH','EEFat','TRIMP','BeatByBeatRMSSD','BeatByBeatSD','HFAverage','LFAverage','LFHFRatio','Weight']]
    y = data_filtered['MaxVO2']
    return (X_filtered, y)

#Pipeline definition ('auto_filter' --> 'scaler' --> 'MLPRegressor')    
pipeline_steps = [('auto_filter', FunctionTransformer(auto_filter)), ('scaler', StandardScaler()), ('MLPR', MLPRegressor(solver='lbfgs', activation='relu', early_stopping=True, n_iter_no_change=20, validation_fraction=0.2, max_iter=10000))]

#Grid search definition with different values of 'conta' for the first stage of the pipeline ('auto_filter')
parameters = {'auto_filter__kw_args': [{'conta': 0.1}, {'conta': 0.2}, {'conta': 0.3}], 'MLPR__hidden_layer_sizes':[(sp_randint.rvs(1, nb_features, 1),), (sp_randint.rvs(1, nb_features, 1), sp_randint.rvs(1, nb_features, 1))], 'MLPR__alpha':sp_rand.rvs(0, 1, 1)}   

pipeline = Pipeline(pipeline_steps)

estimator = RandomizedSearchCV(pipeline, parameters, cv=5, n_iter=10)
estimator.fit(X_train, y_train)

Best answer

You can try running each step manually to find the problem:

auto_filter_transformer = FunctionTransformer(auto_filter)
X_train = auto_filter_transformer.fit_transform(X_train)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)

MLPR = MLPRegressor(solver='lbfgs', activation='relu', early_stopping=True, n_iter_no_change=20, validation_fraction=0.2, max_iter=10000)
MLPR.fit(X_train, y_train)

If every step runs fine on its own, build the pipeline and check that it works. If it does, then try RandomizedSearchCV.
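As a rough illustration of what such a step-by-step run might reveal, here is a minimal sketch on synthetic data. The `toy_filter` function, the random data, and the column layout are all assumptions made for this example; it only mimics the shape of `auto_filter` (filter out outliers, then return an `(X, y)` pair) and is not the asker's actual function.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def toy_filter(data, conta=0.1):
    """Hypothetical stand-in for auto_filter: keep inliers, split off the last column as y."""
    forest = IsolationForest(n_estimators=50, contamination=conta, random_state=0)
    inliers = data[forest.fit_predict(data) == 1]
    return inliers[:, :-1], inliers[:, -1]


rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))  # synthetic data: 3 feature columns + 1 target column

# Step 1: run the transformer on its own and inspect what it returns.
step_output = FunctionTransformer(toy_filter).fit_transform(data)
print(type(step_output))  # a tuple, not a 2-D array

# Step 2: hand that output to the next step, as a Pipeline would.
try:
    StandardScaler().fit_transform(step_output)
    scaler_accepts_output = True
except Exception as exc:  # broad on purpose: the exact error type may vary by version
    scaler_accepts_output = False
    print(f"StandardScaler failed: {type(exc).__name__}")
```

In this sketch the first step returns a tuple, which the scaler cannot consume. More generally, a transformer inside a scikit-learn Pipeline only transforms X: the y given to `fit` is passed along unchanged, so a step that drops rows leaves X and y with different lengths. If the real `auto_filter` behaves like the stand-in, it may be worth filtering the data before the pipeline, or looking at a sampler-style step such as `FunctionSampler` from the separate imbalanced-learn package, which is designed to resample X and y together.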

Regarding "python-3.x - How to make GridSearchCV with a proper FunctionTransformer in a pipeline?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57461926/
