python - How do I use hyperopt for hyperparameter optimization of a Keras deep learning network?

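I want to build a nonlinear regression model with Keras to predict a positive continuous variable. For the model below, how do I choose the following hyperparameters?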

Number of hidden layers and neurons
Dropout rate
Whether to use BatchNormalization
Activation function, out of linear, relu, tanh, sigmoid
Best optimizer to use among adam, rmsprop, sgd

Code

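from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, BatchNormalization

def dnn_reg():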
    model = Sequential()
    #layer 1
    model.add(Dense(40, input_dim=13, kernel_initializer='normal'))
    model.add(Activation('tanh'))
    model.add(Dropout(0.2))
    #layer 2
    model.add(Dense(30, kernel_initializer='normal'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Dropout(0.4))
    #layer 3
    model.add(Dense(5, kernel_initializer='normal'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Dropout(0.4))

    model.add(Dense(1, kernel_initializer='normal'))
    model.add(Activation('relu'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

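I have considered randomized grid search, but I would like to use hyperopt, which I believe will be faster. I initially implemented the tuning with https://github.com/maxpumperla/hyperas. Hyperas does not work with the latest version of Keras. I suspect that Keras is evolving quickly and the maintainer finds it hard to keep Hyperas compatible. So I think using hyperopt directly would be a better option.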

PS: I am new to hyperparameter tuning and to Bayesian optimization with hyperopt.

Best answer

I have had a lot of success with Hyperas. Here are the things I learned to make it work.

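1) Run it as a Python script from the terminal (not from an IPython notebook).
2) Make sure there are no comments in your code (Hyperas doesn't like comments!).
3) Wrap your data and your model in functions, as described in the Hyperas README.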

Below is an example of a Hyperas script that worked for me (following the instructions above).

from __future__ import print_function

from hyperopt import Trials, STATUS_OK, tpe
from keras.datasets import mnist
from keras.layers.core import Dense, Dropout, Activation
from keras.models import Sequential
from keras.utils import np_utils
import numpy as np
from hyperas import optim
from keras.models import model_from_json
from keras import backend as K
from keras.layers.core import Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD , Adam
import tensorflow as tf
from hyperas.distributions import choice, uniform, conditional
__author__ = 'JOnathan Hilgart'



def data():
    """
    Data providing function:

    This function is separated from model() so that hyperopt
    won't reload data for each evaluation run.
    """
    import numpy as np
    x = np.load('training_x.npy')
    y = np.load('training_y.npy')
    x_train = x[:15000,:]
    y_train = y[:15000,:]
    x_test = x[15000:,:]
    y_test = y[15000:,:]
    return x_train, y_train, x_test, y_test


def model(x_train, y_train, x_test, y_test):
    """
    Model providing function:

    Create Keras model with double curly brackets dropped-in as needed.
    Return value has to be a valid python dictionary with two customary keys:
        - loss: Specify a numeric evaluation metric to be minimized
        - status: Just use STATUS_OK and see hyperopt documentation if not feasible
    The last one is optional, though recommended, namely:
        - model: specify the model just created so that we can later use it again.
    """
    model_mlp = Sequential()
    model_mlp.add(Dense({{choice([32, 64,126, 256, 512, 1024])}},
                        activation='relu', input_shape= (2,)))
    model_mlp.add(Dropout({{uniform(0, .5)}}))
    model_mlp.add(Dense({{choice([32, 64, 126, 256, 512, 1024])}}))
    model_mlp.add(Activation({{choice(['relu', 'sigmoid'])}}))
    model_mlp.add(Dropout({{uniform(0, .5)}}))
    model_mlp.add(Dense({{choice([32, 64, 126, 256, 512, 1024])}}))
    model_mlp.add(Activation({{choice(['relu', 'sigmoid'])}}))
    model_mlp.add(Dropout({{uniform(0, .5)}}))
    model_mlp.add(Dense({{choice([32, 64, 126, 256, 512, 1024])}}))
    model_mlp.add(Activation({{choice(['relu', 'sigmoid'])}}))
    model_mlp.add(Dropout({{uniform(0, .5)}}))
    model_mlp.add(Dense(9))
    model_mlp.add(Activation({{choice(['softmax','linear'])}}))
    model_mlp.compile(loss={{choice(['categorical_crossentropy','mse'])}}, metrics=['accuracy'],
                  optimizer={{choice(['rmsprop', 'adam', 'sgd'])}})



    model_mlp.fit(x_train, y_train,
              batch_size={{choice([16, 32, 64, 128])}},
              epochs=50,
              verbose=2,
              validation_data=(x_test, y_test))
    score, acc = model_mlp.evaluate(x_test, y_test, verbose=0)
    print('Test accuracy:', acc)
    return {'loss': -acc, 'status': STATUS_OK, 'model': model_mlp}


if __name__ == '__main__':
    import gc; gc.collect()

    with K.get_session(): ## TF session
        best_run, best_model = optim.minimize(model=model,
                                              data=data,
                                              algo=tpe.suggest,
                                              max_evals=2,
                                              trials=Trials())
        X_train, Y_train, X_test, Y_test = data()
        print("Evaluation of best performing model:")
        print(best_model.evaluate(X_test, Y_test))
        print("Best performing model chosen hyper-parameters:")
        print(best_run)

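The `gc.collect()` call in the Hyperas script works around a failure at exit that is caused by the order of garbage collection: if Python collects the session first, the program exits successfully; if Python collects the SWIG memory (tf_session) first, the program exits with a failure.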

You can force Python to delete the session with:

del session

Or, if you are using Keras and cannot get hold of the session instance, you can run the following at the end of your code:

import gc; gc.collect()
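
To illustrate why an explicit collection can matter, here is a toy sketch in plain Python. `FakeSession` is a hypothetical stand-in, not TensorFlow's session class; it only shows that an object caught in a reference cycle is not finalized by `del` alone and is cleaned up once `gc.collect()` runs.

```python
import gc


class FakeSession:
    """Toy stand-in for a session object that holds native resources."""

    def __init__(self, log):
        self.log = log
        self.me = self  # reference cycle: refcounting alone cannot free it

    def __del__(self):
        # Record that the finalizer ran.
        self.log.append('session closed')


log = []
session = FakeSession(log)
del session   # removes the name; the cycle still keeps the object alive
gc.collect()  # the cycle collector finds the object and runs __del__
print(log)
```

Calling `gc.collect()` at a point you choose makes the cleanup happen before interpreter shutdown, rather than in whatever order Python picks at exit.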

Regarding "python - How do I use hyperopt for hyperparameter optimization of a Keras deep learning network?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43533610/
