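machine-learning - How does dropout work in Keras' LSTM layer?

The Keras documentation has no information on how dropout is actually implemented for LSTM layers.

However, it links to the paper "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks", which leads me to believe that dropout is implemented as described in that paper.

That is, the same dropout mask is used for every timestep of the time series the layer is processing.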
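
As a rough illustration of what "the same mask at every timestep" means, here is a minimal NumPy sketch. It is not Keras code; the shapes, the dropout rate, and all variable names are assumptions chosen for the example. It contrasts one mask shared across the time axis with a fresh mask drawn for each timestep.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, features = 2, 5, 4   # illustrative shapes, not from the question
rate = 0.5                             # illustrative dropout rate

x = np.ones((batch, timesteps, features))

# Shared mask: sample once per sequence, then broadcast over the time axis.
shared_mask = (rng.random((batch, 1, features)) >= rate) / (1.0 - rate)
x_shared = x * shared_mask

# Per-timestep mask: sample independently for every timestep.
per_step_mask = (rng.random((batch, timesteps, features)) >= rate) / (1.0 - rate)
x_per_step = x * per_step_mask

# With the shared mask, every timestep of a sequence drops the same features.
same_across_time = bool(np.all(x_shared == x_shared[:, :1, :]))
```

With the shared mask, a feature that is zeroed at the first timestep stays zeroed for the whole sequence, which is the behaviour the paper describes.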

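Looking at the source code, it seems to me that LSTMCell.call is called iteratively, once for each timestep in the time series, and that it generates a new dropout mask on every call.

My question is:

Either I have misunderstood the Keras code, or the reference to the paper in the Keras documentation is misleading. Which is it?

Best answer

The paper and the code are consistent. You understood the paper correctly, but your reading of the code is slightly off.

Before dropout_mask is initialized there is a check: self._dropout_mask is None.

So LSTMCell.call is called iteratively, once for each timestep in the time series, but a new dropout mask is generated only on the first call.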

if 0 < self.dropout < 1 and self._dropout_mask is None:
    self._dropout_mask = _generate_dropout_mask(
        K.ones_like(inputs),
        self.dropout,
        training=training,
        count=4)
if (0 < self.recurrent_dropout < 1 and
        self._recurrent_dropout_mask is None):
    self._recurrent_dropout_mask = _generate_dropout_mask(
        K.ones_like(states[0]),
        self.recurrent_dropout,
        training=training,
        count=4)
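
To make the effect of the `is None` guard concrete, here is a small sketch of the same caching pattern outside Keras. `ToyCell`, `masks_generated`, and `reset_mask` are hypothetical names invented for this example. In particular, `reset_mask` is only an assumption about how a fresh mask could be drawn for the next batch; the snippet above does not show where Keras clears the cached mask.

```python
import numpy as np


class ToyCell:
    """Hypothetical stand-in for a recurrent cell; this is not Keras code."""

    def __init__(self, dropout, seed=0):
        self.dropout = dropout
        self._dropout_mask = None        # cached mask, as in the snippet above
        self._rng = np.random.default_rng(seed)
        self.masks_generated = 0         # counter used only for illustration

    def call(self, inputs):
        # Same guard as above: build a mask only if none is cached yet.
        if 0 < self.dropout < 1 and self._dropout_mask is None:
            keep = self._rng.random(inputs.shape) >= self.dropout
            self._dropout_mask = keep / (1.0 - self.dropout)
            self.masks_generated += 1
        if self._dropout_mask is None:
            return inputs
        return inputs * self._dropout_mask

    def reset_mask(self):
        # Assumed helper: clear the cache so the next call draws a new mask.
        self._dropout_mask = None


cell = ToyCell(dropout=0.5)
sequence = np.ones((6, 3, 4))            # (timesteps, batch, features)
outputs = [cell.call(x_t) for x_t in sequence]
```

Although `call` runs six times here, `cell.masks_generated` ends up at 1, and every timestep is multiplied by the same cached mask.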

I hope this answers your question.

This question, "machine-learning - How does dropout work in Keras' LSTM layer?", corresponds to a similar question on Stack Overflow: https://stackoverflow.com/questions/49898080/
