我有一个模型,它采用三个具有相同标签的可变长度输入。有没有办法以某种方式使用 pack_padded_sequence
?如果是这样,我应该如何对序列进行排序?
例如,
a = (([0,1,2], [3,4], [5,6,7,8]), 1) # training data is in length 3,2,4; label is 1
b = (([0,1], [2], [6,7,8,9,10]), 1)
a 和 b 都将被馈送到三个独立的 LSTM 中,结果将被合并以预测目标。
最佳答案
让我们一步一步来。
Input Data Processing
a = (([0,1,2], [3,4], [5,6,7,8]), 1)
# store length of each element in an array
len_a = np.array([len(a) for a in a[0]])
variable_a = np.zeros((len(len_a), np.amax(len_a)))
for i, a in enumerate(a[0]):
variable_a[i, 0:len(a)] = a
vocab_size = len(np.unique(variable_a))
Variable(torch.from_numpy(variable_a).long())
print(variable_a)
它打印:
Variable containing:
0 1 2 0
3 4 0 0
5 6 7 8
[torch.DoubleTensor of size 3x4]
Defining embedding and RNN layer
现在,假设我们有一个嵌入和 RNN 层类,如下所示。
class EmbeddingLayer(nn.Module):
def __init__(self, input_size, emsize):
super(EmbeddingLayer, self).__init__()
self.embedding = nn.Embedding(input_size, emsize)
def forward(self, input_variable):
return self.embedding(input_variable)
class Encoder(nn.Module):
def __init__(self, input_size, hidden_size, bidirection):
super(Encoder, self).__init__()
self.input_size = input_size
self.hidden_size = hidden_size
self.bidirection = bidirection
self.rnn = nn.LSTM(self.input_size, self.hidden_size, batch_first=True,
bidirectional=self.bidirection)
def forward(self, sent_variable, sent_len):
# Sort by length (keep idx)
sent_len, idx_sort = np.sort(sent_len)[::-1], np.argsort(-sent_len)
idx_unsort = np.argsort(idx_sort)
idx_sort = torch.from_numpy(idx_sort)
sent_variable = sent_variable.index_select(0, Variable(idx_sort))
# Handling padding in Recurrent Networks
sent_packed = nn.utils.rnn.pack_padded_sequence(sent_variable, sent_len, batch_first=True)
sent_output = self.rnn(sent_packed)[0]
sent_output = nn.utils.rnn.pad_packed_sequence(sent_output, batch_first=True)[0]
# Un-sort by length
idx_unsort = torch.from_numpy(idx_unsort)
sent_output = sent_output.index_select(0, Variable(idx_unsort))
return sent_output
Embed and encode the processed input data
我们可以按如下方式嵌入和编码我们的输入。
emb = EmbeddingLayer(vocab_size, 50)
enc = Encoder(50, 100, False, 'LSTM')
emb_a = emb(variable_a)
enc_a = enc(emb_a, len_a)
如果打印 enc_a
的大小,您将得到 torch.Size([3, 4, 100])
。我希望你能理解这个形状的含义。
请注意,以上代码仅在 CPU 上运行。
关于python - 如何在pytorch中使用具有相同标签的多个可变长度输入的pack_padded_sequence,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/49203019/