python - Outputs and state of MultiRNNCell in Tensorflow

Tags: python, tensorflow

I have a stacked MultiRNNCell defined as follows:

batch_size = 256
rnn_size = 512
keep_prob = 0.5

lstm_1 = tf.nn.rnn_cell.LSTMCell(rnn_size)
lstm_dropout_1 = tf.nn.rnn_cell.DropoutWrapper(lstm_1, output_keep_prob = keep_prob)

lstm_2 = tf.nn.rnn_cell.LSTMCell(rnn_size)
lstm_dropout_2 = tf.nn.rnn_cell.DropoutWrapper(lstm_2, output_keep_prob = keep_prob)

stacked_lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_dropout_1, lstm_dropout_2])

rnn_inputs = tf.nn.embedding_lookup(embedding_matrix, ques_placeholder)

init_state = stacked_lstm.zero_state(batch_size, tf.float32)
rnn_outputs, final_state = tf.nn.dynamic_rnn(stacked_lstm, rnn_inputs, initial_state=init_state)

In this code there are two RNN layers. I only want to work with the final state of this dynamic RNN, and I expect that state to be a 2D tensor of shape [batch_size, rnn_size*2].

Instead, the shape of final_state is 4D: [2, 2, 256, 512].

Can someone explain why I get this shape? And how can I process this tensor so that I can pass it to a fully connected layer?

Best Answer

I cannot reproduce the [2, 2, 256, 512] shape. But with this code:

import tensorflow as tf

rnn_size = 512
batch_size = 256
time_size = 5
input_size = 2
keep_prob = 0.5

lstm_1 = tf.nn.rnn_cell.LSTMCell(rnn_size)
lstm_dropout_1 = tf.nn.rnn_cell.DropoutWrapper(lstm_1, output_keep_prob=keep_prob)

lstm_2 = tf.nn.rnn_cell.LSTMCell(rnn_size)

stacked_lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_dropout_1, lstm_2])

rnn_inputs = tf.placeholder(tf.float32, shape=[None, time_size, input_size])
# Shape of the rnn_inputs is (batch_size, time_size, input_size)

init_state = stacked_lstm.zero_state(batch_size, tf.float32)
rnn_outputs, final_state = tf.nn.dynamic_rnn(stacked_lstm, rnn_inputs, initial_state=init_state)
print(rnn_outputs)
print(final_state)

I get the expected shape for rnn_outputs: (batch_size, time_size, rnn_size)

Tensor("rnn/transpose_1:0", shape=(256, 5, 512), dtype=float32)

final_state is indeed a pair of LSTMStateTuple objects (one for each of the two cells, lstm_dropout_1 and lstm_2):

(LSTMStateTuple(c=<tf.Tensor 'rnn/while/Exit_3:0' shape=(256, 512) dtype=float32>, h=<tf.Tensor 'rnn/while/Exit_4:0' shape=(256, 512) dtype=float32>),
 LSTMStateTuple(c=<tf.Tensor 'rnn/while/Exit_5:0' shape=(256, 512) dtype=float32>, h=<tf.Tensor 'rnn/while/Exit_6:0' shape=(256, 512) dtype=float32>))
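
Each element of this pair can be accessed directly, since an LSTMStateTuple is just a named tuple (a small sketch, only to illustrate the structure shown above):

final_state[0]    # LSTMStateTuple of the first cell (lstm_dropout_1)
final_state[1].c  # cell state of the second cell, shape (batch_size, rnn_size)
final_state[1].h  # hidden state of the second cell, shape (batch_size, rnn_size)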

As stated in the docstring of tf.nn.dynamic_rnn:

  # 'outputs' is a tensor of shape [batch_size, max_time, 256]
  # 'state' is a N-tuple where N is the number of LSTMCells containing a
  # tf.contrib.rnn.LSTMStateTuple for each cell
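
To get the single [batch_size, rnn_size*2] tensor asked for in the question, one option (a minimal sketch, assuming the hidden states h of both layers are what you want to feed forward; the output size 10 below is just a placeholder) is to concatenate the h tensors of the two LSTMStateTuples:

last_hidden = [layer_state.h for layer_state in final_state]  # two tensors of shape (batch_size, rnn_size)
fc_input = tf.concat(last_hidden, axis=1)                     # shape (batch_size, rnn_size * 2)
logits = tf.layers.dense(fc_input, units=10)                  # 10 is a hypothetical output size

Concatenating c and h of the last layer would give the same shape; which combination is appropriate depends on what the fully connected layer is supposed to see.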

For python - Outputs and state of MultiRNNCell in Tensorflow, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49345172/
