tensorflow - Extracting an LSTM model's hidden state vectors at every time step

I am using Keras with the TensorFlow backend.

I have a trained LSTM model, and I would like to extract its hidden state vector at every time step.

What is the best way to do this in Keras?

Best answer

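The function that decides whether all hidden state vectors are returned is Recurrent.call() (renamed RNN.call() in more recent versions). It checks the return_sequences parameter to make that decision.

Inside this function, the backend function K.rnn() is called: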

last_output, outputs, states = K.rnn(self.step,
                                     preprocessed_input,
                                     initial_state,
                                     go_backwards=self.go_backwards,
                                     mask=mask,
                                     constants=constants,
                                     unroll=self.unroll,
                                     input_length=input_shape[1])

...

if self.return_sequences:
    output = outputs
else:
    output = last_output

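The tensor outputs is what you want. You can obtain it by calling Recurrent.call() again, this time with return_sequences=True. This should not harm your trained LSTM model (at least in the current version of Keras).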
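
To make the relationship between outputs and last_output concrete, here is a minimal NumPy sketch of the same loop. It is not the Keras implementation: simple_rnn, tanh_step and the weights W and U are hypothetical names made up for illustration, and a plain tanh cell stands in for the LSTM step.

```python
import numpy as np

def simple_rnn(step, inputs, initial_state):
    """Run `step` over the time axis and collect every output.

    inputs has shape (batch, timesteps, features). The return value
    loosely mirrors K.rnn: (last_output, outputs, final_state).
    """
    state = initial_state
    collected = []
    for t in range(inputs.shape[1]):
        output, state = step(inputs[:, t, :], state)
        collected.append(output)
    outputs = np.stack(collected, axis=1)  # (batch, timesteps, units)
    return outputs[:, -1, :], outputs, state

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))  # input-to-hidden weights (hypothetical)
U = rng.normal(size=(2, 2))  # hidden-to-hidden weights (hypothetical)

def tanh_step(x_t, h_prev):
    # Simplified cell: the new hidden state is also the step's output.
    h = np.tanh(x_t @ W + h_prev @ U)
    return h, h

x = rng.normal(size=(1, 5, 3))  # 1 sequence, 5 time steps, 3 features
last_output, outputs, _ = simple_rnn(tanh_step, x, np.zeros((1, 2)))

# The same switch as in the excerpt above.
return_sequences = True
output = outputs if return_sequences else last_output
print(output.shape)  # (1, 5, 2)
```

With return_sequences set to False, output would have shape (1, 2) and equal the final row of outputs, which is why flipping the flag exposes the per-step vectors without changing any weights.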

Here is a toy Bi-LSTM model that demonstrates this approach:

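from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense
from keras.models import Model

# Variable-length sequences of integer token ids; id 0 is padding
input_tensor = Input(shape=(None,), dtype='int32')
embedding = Embedding(10, 100, mask_zero=True)(input_tensor)
hidden = Bidirectional(LSTM(10, return_sequences=True))(embedding)
hidden = Bidirectional(LSTM(10, return_sequences=True))(hidden)
# The last recurrent layer returns only its final output
hidden = Bidirectional(LSTM(2))(hidden)
out = Dense(1, activation='sigmoid')(hidden)
model = Model(input_tensor, out)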

First, set return_sequences to True on the last LSTM layer (because you are using the Bidirectional wrapper, you must also set it on forward_layer and backward_layer):

target_layer = model.layers[-2]
target_layer.return_sequences = True
target_layer.forward_layer.return_sequences = True
target_layer.backward_layer.return_sequences = True

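Now, calling the layer again returns a tensor containing the hidden vectors for all time steps (this has the side effect of creating an extra inbound node, but that should not affect predictions).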

outputs = target_layer(target_layer.input)
m = Model(model.input, outputs)
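
If mutating the trained layer in place is a concern, or if a newer Keras release no longer honors the attribute change, one alternative is to build a second layer with return_sequences=True and copy the trained weights into it. This is a different technique from the one above, shown only as a sketch: it assumes the tensorflow.keras import path, and the names last, seq_layer and seq_model are hypothetical.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# A smaller stand-in for the toy model above (untrained here).
inp = keras.Input(shape=(None,), dtype="int32")
x = layers.Embedding(10, 100, mask_zero=True)(inp)
x = layers.Bidirectional(layers.LSTM(10, return_sequences=True))(x)
last = layers.Bidirectional(layers.LSTM(2))
out = layers.Dense(1, activation="sigmoid")(last(x))
model = keras.Model(inp, out)

# Same configuration as `last`, except that every time step is returned.
seq_layer = layers.Bidirectional(layers.LSTM(2, return_sequences=True))
seq_out = seq_layer(last.input)            # calling the layer builds its weights
seq_layer.set_weights(last.get_weights())  # reuse the existing weights

seq_model = keras.Model(model.input, seq_out)
hidden = seq_model.predict(np.array([[1, 3, 2, 0, 0]]), verbose=0)
print(hidden.shape)  # (1, 5, 4)
```

The original model is left untouched, at the cost of holding a second copy of one layer's weights.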

You can then obtain the hidden vectors by calling, for example, m.predict(X_test).

import numpy as np

# One sequence of length 5; the two trailing zeros are padding
X_test = np.array([[1, 3, 2, 0, 0]])
print(m.predict(X_test))

[[[ 0.00113332 -0.0006666   0.00428438 -0.00125567]
  [ 0.00106074 -0.00041183  0.00383953 -0.00027285]
  [ 0.00080892  0.00027685  0.00238486  0.00036328]
  [ 0.00080892  0.00027685  0.          0.        ]
  [ 0.00080892  0.00027685  0.          0.        ]]]

As you can see, hidden vectors are returned for all 5 time steps, and the last 2 time steps are correctly masked.
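
The pattern in the last two rows (the forward half repeats the third row, the backward half is zero) is what one would expect if a masked step simply carries the previous output forward. The NumPy sketch below reproduces that pattern under this assumption. It is a simplification, not Keras code: masked_rnn and tanh_step are hypothetical names, a tanh cell replaces the LSTM, and only the output (not a separate cell state) is carried.

```python
import numpy as np

def masked_rnn(step, inputs, mask, units, go_backwards=False):
    """Simplified masked recurrence over one sequence.

    inputs has shape (timesteps, features); mask has shape (timesteps,).
    A masked (padding) step repeats the previous output unchanged.
    """
    order = range(len(inputs))
    if go_backwards:
        order = reversed(order)
    h = np.zeros(units)  # the "previous output" starts at zero
    outputs = np.zeros((len(inputs), units))
    for t in order:
        if mask[t]:
            h = step(inputs[t], h)  # real token: update the state
        outputs[t] = h              # padding: h is carried over as-is
    return outputs

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))  # hypothetical input weights
U = rng.normal(size=(2, 2))  # hypothetical recurrent weights

def tanh_step(x_t, h_prev):
    return np.tanh(x_t @ W + h_prev @ U)

x = rng.normal(size=(5, 3))
mask = np.array([True, True, True, False, False])  # last 2 steps are padding

fwd = masked_rnn(tanh_step, x, mask, units=2)
bwd = masked_rnn(tanh_step, x, mask, units=2, go_backwards=True)
merged = np.concatenate([fwd, bwd], axis=-1)  # shape (5, 4), like the output above
print(merged)
```

Going forward, the padded steps come last and repeat the output of step 3. Going backward, the padded steps are visited first, before any real token has updated the state, so they stay at the zero initial value.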

Source: Stack Overflow question, https://stackoverflow.com/questions/46511179/
