tensorflow - How to apply Monte Carlo Dropout to an LSTM in TensorFlow when batch normalization is part of the model?

Tags: tensorflow lstm dropout

I have a model consisting of 3 LSTM layers, followed by a batch normalization layer and finally a dense layer. Here is the code:

import tensorflow as tf
from tensorflow.keras import layers

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    # return_RNN is a factory (defined elsewhere) that maps a name such as
    # "LSTM" to the corresponding Keras RNN layer class
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=hparams["rnn_type"] + "_model")
    return model

Now, I know that to apply MC Dropout we can run the following code:

import numpy as np

# 100 stochastic forward passes with dropout active, averaged afterwards
y_predict = np.stack([my_model(X_test, training=True) for _ in range(100)])
y_proba = y_predict.mean(axis=0)

However, setting training=True also forces the batch normalization layer to normalize with the statistics of the test batches (and to keep updating its moving averages), instead of using the statistics it learned during training.
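To make the issue concrete, here is a minimal sketch (with toy data, not from the question) showing that BatchNormalization produces different outputs depending on the training flag:

import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
x = tf.random.normal((32, 10))

y_train_mode = bn(x, training=True)   # normalizes with the batch's own statistics
y_infer_mode = bn(x, training=False)  # normalizes with the stored moving statistics

# The outputs differ, which is why forcing training=True on the whole
# model corrupts BatchNorm's behaviour at test time
print(np.allclose(y_train_mode.numpy(), y_infer_mode.numpy()))  # typically False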

Moreover, building a custom Dropout layer that hard-codes training=True is not a solution in my case, because I am using an LSTM: the recurrent dropout is applied inside the LSTM layer itself, so it cannot be switched on through a separate wrapper like this one:

class MCDropout(tf.keras.layers.Dropout):
    def call(self, inputs):
        # Always apply dropout, regardless of the model-level training flag
        return super().call(inputs, training=True)

Any help is much appreciated!!

Best Answer

A possible solution is to create a custom LSTM layer that overrides the call method and forces the training flag to True:

from tensorflow import keras

class MCLSTM(keras.layers.LSTM):
    def __init__(self, units, **kwargs):
        super(MCLSTM, self).__init__(units, **kwargs)

    def call(self, inputs, mask=None, training=None, initial_state=None):
        # Ignore the flag that is passed in and keep dropout active
        return super(MCLSTM, self).call(
            inputs,
            mask=mask,
            training=True,
            initial_state=initial_state,
        )

Then you can use it in your code:

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    # Only the first layer needs to be an MCLSTM, since it is the only one
    # that uses (recurrent) dropout
    x = MCLSTM(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=hparams["rnn_type"] + "_model")
    return model
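With this model, the Monte Carlo loop no longer needs training=True at the model level. A sketch of the prediction step, assuming hparams, X_test and a trained model as in the question:

import numpy as np

model = build_uncomplied_model(hparams)
# ... compile and train the model here ...

# Each forward pass is stochastic because MCLSTM forces its dropout on,
# while BatchNormalization stays in inference mode (training=False)
y_samples = np.stack([model(X_test, training=False).numpy() for _ in range(100)])
y_mean = y_samples.mean(axis=0)  # Monte Carlo prediction
y_std = y_samples.std(axis=0)    # per-output uncertainty estimate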

or add it to your return_RNN factory (a more elegant way):
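The question does not show return_RNN, so the following is only a sketch of what such a factory could look like; the monte_carlo parameter is a hypothetical addition:

def return_RNN(rnn_type, monte_carlo=False):
    # Hypothetical factory: maps a layer name to a Keras RNN class and
    # substitutes MCLSTM when Monte Carlo dropout is requested
    if rnn_type == "LSTM":
        return MCLSTM if monte_carlo else keras.layers.LSTM
    if rnn_type == "GRU":
        return keras.layers.GRU
    raise ValueError(f"Unknown rnn_type: {rnn_type}")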

===== EDIT =====

Another solution is to set the training flag when building the model, something like this:

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    # This is the Monte Carlo LSTM: training=True is recorded at call time
    x = layers.LSTM(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs, training=True)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=hparams["rnn_type"] + "_model")
    return model
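A training flag passed explicitly at call time in the functional API is recorded for that layer only; the BatchNormalization layer still follows the model-level training argument. Calling the model with the default training=False at prediction time therefore keeps the normalization statistics frozen while the LSTM's dropout stays active.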

Regarding "tensorflow - How to apply Monte Carlo Dropout to an LSTM in TensorFlow when batch normalization is part of the model?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62031302/
