python-3.x - PyTorch: DecoderRNN: RuntimeError: input must have 3 dimensions, got 2

Tags: python-3.x pytorch rnn encoder-decoder

I am building a DecoderRNN with PyTorch (it is the decoder of an image-captioning model):

class DecoderRNN(nn.Module):
    def __init__(self, embed_size, hidden_size, vocab_size):

        super(DecoderRNN, self).__init__()
        self.hidden_size = hidden_size
        self.gru = nn.GRU(embed_size, hidden_size, hidden_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, features, captions):

        print (features.shape)
        print (captions.shape)
        output, hidden = self.gru(features, captions)
        output = self.softmax(self.out(output[0]))
        return output, hidden 

The data has the following shapes:
torch.Size([10, 200])  <- features.shape (10 for batch size)
torch.Size([10, 12])   <- captions.shape (10 for batch size)

I then get the error below. Any idea what I am missing here? Thanks!
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-76e05ba08b1d> in <module>()
     44         # Pass the inputs through the CNN-RNN model.
     45         features = encoder(images)
---> 46         outputs = decoder(features, captions)
     47 
     48         # Calculate the batch loss.

/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

/home/workspace/model.py in forward(self, features, captions)
     37         print (captions.shape)
     38         # features = features.unsqueeze(1)
---> 39         output, hidden = self.gru(features, captions)
     40         output = self.softmax(self.out(output[0]))
     41         return output, hidden

/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

/opt/conda/lib/python3.6/site-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
    167             flat_weight=flat_weight
    168         )
--> 169         output, hidden = func(input, self.all_weights, hx)
    170         if is_packed:
    171             output = PackedSequence(output, batch_sizes)

/opt/conda/lib/python3.6/site-packages/torch/nn/_functions/rnn.py in forward(input, *fargs, **fkwargs)
    383             return hack_onnx_rnn((input,) + fargs, output, args, kwargs)
    384         else:
--> 385             return func(input, *fargs, **fkwargs)
    386 
    387     return forward

/opt/conda/lib/python3.6/site-packages/torch/autograd/function.py in _do_forward(self, *input)
    326         self._nested_input = input
    327         flat_input = tuple(_iter_variables(input))
--> 328         flat_output = super(NestedIOFunction, self)._do_forward(*flat_input)
    329         nested_output = self._nested_output
    330         nested_variables = _unflatten(flat_output, self._nested_output)

/opt/conda/lib/python3.6/site-packages/torch/autograd/function.py in forward(self, *args)
    348     def forward(self, *args):
    349         nested_tensors = _map_variable_tensor(self._nested_input)
--> 350         result = self.forward_extended(*nested_tensors)
    351         del self._nested_input
    352         self._nested_output = result

/opt/conda/lib/python3.6/site-packages/torch/nn/_functions/rnn.py in forward_extended(self, input, weight, hx)
    292             hy = tuple(h.new() for h in hx)
    293 
--> 294         cudnn.rnn.forward(self, input, hx, weight, output, hy)
    295 
    296         self.save_for_backward(input, hx, weight, output)

/opt/conda/lib/python3.6/site-packages/torch/backends/cudnn/rnn.py in forward(fn, input, hx, weight, output, hy)
    206         if (not is_input_packed and input.dim() != 3) or (is_input_packed and input.dim() != 2):
    207             raise RuntimeError(
--> 208                 'input must have 3 dimensions, got {}'.format(input.dim()))
    209         if fn.input_size != input.size(-1):
    210             raise RuntimeError('input.size(-1) must be equal to input_size. Expected {}, got {}'.format(

RuntimeError: input must have 3 dimensions, got 2

Best Answer

Your GRU input needs to be 3-dimensional:

input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence.
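The shape requirement above can be demonstrated with a minimal sketch. The sizes are illustrative (embed_size 200 and batch 10 taken from the question; hidden_size 256 is an assumption), and `unsqueeze(0)` is one way to add the missing sequence dimension:

```python
import torch
import torch.nn as nn

# Illustrative sizes; hidden_size is an assumption for this sketch.
embed_size, hidden_size, batch = 200, 256, 10

gru = nn.GRU(input_size=embed_size, hidden_size=hidden_size)

# A 2-D (batch, embed_size) tensor like the question's `features`
# triggers "input must have 3 dimensions, got 2". Adding a sequence
# dimension with unsqueeze(0) yields (seq_len=1, batch, embed_size).
features = torch.randn(batch, embed_size)
output, hidden = gru(features.unsqueeze(0))

print(output.shape)  # torch.Size([1, 10, 256])
print(hidden.shape)  # torch.Size([1, 10, 256])
```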



In addition, you need to supply the hidden state (in this case the last encoder hidden state) as the second argument:
self.gru(input, h_0)

where input is your actual input, and the hidden state h_0 also needs to be 3-dimensional:

h_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided.



https://pytorch.org/docs/master/nn.html#torch.nn.GRU
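Putting both requirements together, a minimal sketch of calling the GRU with an explicit initial hidden state might look like the following. It assumes a single-layer, unidirectional GRU; all sizes are illustrative, not taken from the question's model:

```python
import torch
import torch.nn as nn

# Illustrative sizes for a single-layer, unidirectional GRU.
embed_size, hidden_size, batch, num_layers = 200, 256, 10, 1

gru = nn.GRU(embed_size, hidden_size, num_layers)

# (seq_len, batch, input_size)
inputs = torch.randn(1, batch, embed_size)
# (num_layers * num_directions, batch, hidden_size) -- e.g. the last
# encoder hidden state, reshaped to 3-D.
h_0 = torch.randn(num_layers * 1, batch, hidden_size)

output, h_n = gru(inputs, h_0)

print(output.shape)  # torch.Size([1, 10, 256])
print(h_n.shape)     # torch.Size([1, 10, 256])
```

Note that in the question's code, `captions` (an integer tensor of word indices) is passed where `h_0` is expected; the second argument to the GRU must be a hidden-state tensor of the shape quoted above, not the target captions.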

Regarding "python-3.x - PyTorch: DecoderRNN: RuntimeError: input must have 3 dimensions, got 2", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50399055/
