Keras: Embedding in LSTM

In the Keras example that uses an LSTM to model the IMDB sequence data ( https://github.com/fchollet/keras/blob/master/examples/imdb_lstm.py ), there is an Embedding layer before the input reaches the LSTM layer:

model.add(Embedding(max_features,128)) #max_features=20000
model.add(LSTM(128))

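What does the Embedding layer really do? In this case, does that mean the length of the input sequence into the LSTM layer is 128? If so, can I write the LSTM layer as: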

model.add(LSTM(128, input_shape=(128, 1)))

But I also noticed that the input X_train has been processed with pad_sequences:

print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen) #maxlen=80
X_test = sequence.pad_sequences(X_test, maxlen=maxlen) #maxlen=80

So it seems the input sequence length is 80?

Best answer

To quote the documentation:

Turns positive integers (indexes) into dense vectors of fixed size. eg. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]

Basically, this converts indexes (which represent the words your IMDB reviews contain) into vectors of a given size (128 in your case).
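The lookup described above can be illustrated without Keras. The following is a minimal NumPy sketch, not the Keras implementation: the matrix here is filled with random numbers, whereas Keras learns the rows during training, and the names `embedding_matrix` and `embed` are made up for this illustration.

```python
import numpy as np

# Sizes taken from the question: max_features=20000, 128 dimensions per word.
vocab_size = 20000
embedding_dim = 128

rng = np.random.default_rng(0)
# One row per word index. Random here (assumption); trainable weights in Keras.
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))

def embed(indices):
    """Replace every integer index with the matching row of the matrix."""
    return embedding_matrix[np.asarray(indices)]

# Two toy reviews, each already encoded as three word indexes.
batch = [[4, 20, 7], [15, 4, 0]]
vectors = embed(batch)
print(vectors.shape)  # (2, 3, 128): samples x time steps x embedding size
```

The same index always maps to the same vector, so the word with index 4 gets an identical row in both reviews.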

If you don't know what embeddings are in general, here is the Wikipedia definition:

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers in a low-dimensional space relative to the vocabulary size ("continuous space").

Coming back to the other question you asked:

In this case, does that mean the length of the input sequence into the LSTM layer is 128?

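Not quite. For recurrent networks, you have a time dimension and a feature dimension. 128 is your feature dimension, that is, how many dimensions each embedding vector has. The time dimension in your example is the value stored in maxlen (80 here), which is used to generate the training sequences.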
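To make the two dimensions concrete, here is a rough sketch in NumPy. `pad_pre` is a hypothetical helper that imitates what `pad_sequences` does with its default settings (pad on the left with 0, keep the last `maxlen` items of longer sequences). The embedding matrix is a placeholder of zeros with an arbitrary small vocabulary, because only the shapes matter here.

```python
import numpy as np

def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences with `value`; keep the last `maxlen` items of long ones."""
    out = np.full((len(sequences), maxlen), value, dtype=np.int64)
    for row, seq in enumerate(sequences):
        if not seq:
            continue  # an empty sequence stays all padding
        trimmed = seq[-maxlen:]
        out[row, -len(trimmed):] = trimmed
    return out

maxlen = 80          # time dimension, as in the question
embedding_dim = 128  # feature dimension, as in the question

# Two toy reviews of length 3 and 100 (word indexes are made up).
reviews = [[5, 9, 2], list(range(1, 101))]
padded = pad_pre(reviews, maxlen)
print(padded.shape)  # (2, 80): samples x time

# Placeholder embedding weights (assumption: vocabulary of 200 words).
embedding_matrix = np.zeros((200, embedding_dim))
lstm_input = embedding_matrix[padded]
print(lstm_input.shape)  # (2, 80, 128): samples x time x features
```

So each review reaches the LSTM as 80 time steps, and each time step carries a 128-dimensional vector.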

The 128 you pass to the LSTM layer is the actual number of output units of the LSTM. It only happens to equal the embedding size in this example.
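That the number of units sets the output size, independently of the input feature size, can be checked with a bare-bones LSTM forward pass. This is a simplified NumPy sketch with random weights, not the Keras layer: the function name `lstm_last_output`, the weight scale, and the gate ordering are choices made for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_output(x, units, seed=0):
    """Run a randomly initialised LSTM over x of shape (batch, time, features)
    and return the final hidden state, shape (batch, units)."""
    batch, time_steps, features = x.shape
    rng = np.random.default_rng(seed)
    # Weights for the four gates, stacked side by side along the last axis.
    W = rng.normal(scale=0.1, size=(features, 4 * units))  # input weights
    U = rng.normal(scale=0.1, size=(units, 4 * units))     # recurrent weights
    b = np.zeros(4 * units)
    h = np.zeros((batch, units))  # hidden state
    c = np.zeros((batch, units))  # cell state
    for t in range(time_steps):
        z = x[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

embedded = np.ones((2, 80, 128))  # samples x time x features, as in the example
print(lstm_last_output(embedded, units=128).shape)  # (2, 128)
print(lstm_last_output(embedded, units=32).shape)   # (2, 32)
```

Changing `units` changes only the last dimension of the output. The 80 time steps are consumed by the loop, and the 128 input features only affect the shape of `W`.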

Regarding "Keras: Embedding in LSTM", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44759194/
