python - How to implement TensorFlow batch normalization in an LSTM

Tags: python tensorflow neural-network lstm rnn

My current LSTM network looks like this.

rnn_cell = tf.contrib.rnn.BasicRNNCell(num_units=CELL_SIZE)
init_s = rnn_cell.zero_state(batch_size=1, dtype=tf.float32)  # very first hidden state
outputs, final_s = tf.nn.dynamic_rnn(
    rnn_cell,              # cell you have chosen
    tf_x,                  # input
    initial_state=init_s,  # the initial hidden state
    time_major=False,      # False: (batch, time step, input); True: (time step, batch, input)
)

# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(outputs, [-1, CELL_SIZE])
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)

# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])

Usually I apply tf.layers.batch_normalization for batch normalization, but I am not sure whether this works for an LSTM network.

b1 = tf.layers.batch_normalization(outputs, momentum=0.4, training=True)
d1 = tf.layers.dropout(b1, rate=0.4, training=True)

# reshape 3D output to 2D for fully connected layer
outs2D = tf.reshape(d1, [-1, CELL_SIZE])
net_outs2D = tf.layers.dense(outs2D, INPUT_SIZE)

# reshape back to 3D
outs = tf.reshape(net_outs2D, [-1, TIME_STEP, INPUT_SIZE])

Best answer

If you want to use batch norm with an RNN (LSTM or GRU), you can look at this implementation, or read the full explanation in the blog post.

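However, for sequence data, layer normalization has advantages over batch norm. Specifically, "the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks" (from the paper Ba et al., Layer Normalization).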
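
To make the mini-batch dependence concrete, here is a minimal NumPy sketch (not TensorFlow code, and not taken from the paper). The helpers batch_norm and layer_norm and the array sizes are illustrative assumptions. Learned scale and shift parameters and the moving averages that tf.layers.batch_normalization keeps are left out.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature with statistics taken across the batch axis.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Normalize each sample with statistics taken across its own features.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 8))  # hypothetical (batch, features) activations at one time step
sample = batch[:1]               # the same first sample, fed alone (batch size 1)

# Layer norm: a sample's output does not depend on the rest of the batch.
ln_same = np.allclose(layer_norm(batch)[:1], layer_norm(sample))

# Batch norm: the output changes with the batch, and with a batch of one
# every feature equals its own mean, so the normalized output is all zeros.
bn_same = np.allclose(batch_norm(batch)[:1], batch_norm(sample))
bn_single_is_zero = np.allclose(batch_norm(sample), 0.0)

print(ln_same, bn_same, bn_single_is_zero)
```

The batch-of-one case matters here because the question builds its initial state with batch_size=1, where training-mode batch statistics carry no information.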

Layer normalization normalizes the summed inputs within each layer. You can look at the implementation of layer normalization for a GRU cell.
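
Since the linked implementation is not reproduced here, the following is a rough NumPy sketch of one way to place layer normalization inside a GRU step, applied to the summed inputs of each gate and of the candidate state. It is an assumption-laden illustration, not the linked code and not necessarily the exact formulation in Ba et al. The names ln_gru_step, init_params, and the parameter dictionary keys are hypothetical.

```python
import numpy as np

def layer_norm(a, gain, bias, eps=1e-5):
    # Normalize the summed inputs of one layer per sample, then rescale and shift.
    mean = a.mean(axis=-1, keepdims=True)
    var = a.var(axis=-1, keepdims=True)
    return gain * (a - mean) / np.sqrt(var + eps) + bias

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def ln_gru_step(x, h, p):
    # x: (batch, input_size), h: (batch, cell_size), p: dict of parameters.
    xh = np.concatenate([x, h], axis=-1)
    z = sigmoid(layer_norm(xh @ p["W_z"], p["g_z"], p["b_z"]))   # update gate
    r = sigmoid(layer_norm(xh @ p["W_r"], p["g_r"], p["b_r"]))   # reset gate
    xrh = np.concatenate([x, r * h], axis=-1)
    c = np.tanh(layer_norm(xrh @ p["W_c"], p["g_c"], p["b_c"]))  # candidate state
    return (1.0 - z) * h + z * c

def init_params(input_size, cell_size, rng):
    # Hypothetical initialization: small random weights, unit gain, zero bias.
    p = {}
    for name in ("z", "r", "c"):
        p["W_" + name] = rng.normal(scale=0.1, size=(input_size + cell_size, cell_size))
        p["g_" + name] = np.ones(cell_size)
        p["b_" + name] = np.zeros(cell_size)
    return p

rng = np.random.default_rng(0)
INPUT_SIZE, CELL_SIZE, TIME_STEP = 3, 5, 4           # illustrative sizes
params = init_params(INPUT_SIZE, CELL_SIZE, rng)
x_seq = rng.normal(size=(2, TIME_STEP, INPUT_SIZE))  # (batch, time step, input)

h = np.zeros((2, CELL_SIZE))                         # zero initial hidden state
for t in range(TIME_STEP):
    h = ln_gru_step(x_seq[:, t, :], h, params)
print(h.shape)
```

Because the statistics are computed per sample at every time step, the same code path works for any batch size and needs no separate training and inference modes.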

Regarding python - How to implement TensorFlow batch normalization in an LSTM, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46915354/
