python - PyTorch CNN linear layer shape after Conv2d

I'm trying to learn PyTorch and came across a tutorial in which a CNN is defined as follows:

from torch.nn import (BatchNorm2d, Conv2d, Linear, MaxPool2d, Module, ReLU,
                      Sequential)

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()

        self.cnn_layers = Sequential(
            # Defining a 2D convolution layer
            Conv2d(1, 4, kernel_size=3, stride=1, padding=1),
            BatchNorm2d(4),
            ReLU(inplace=True),
            MaxPool2d(kernel_size=2, stride=2),
            # Defining another 2D convolution layer
            Conv2d(4, 4, kernel_size=3, stride=1, padding=1),
            BatchNorm2d(4),
            ReLU(inplace=True),
            MaxPool2d(kernel_size=2, stride=2),
        )

        self.linear_layers = Sequential(
            Linear(4 * 7 * 7, 10)
        )

    # Defining the forward pass    
    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.view(x.size(0), -1)
        x = self.linear_layers(x)
        return x

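I understand how cnn_layers is built. After cnn_layers, the data should be flattened and passed to linear_layers.

What I don't understand is why the number of input features for Linear is 4*7*7. I know that 4 is the number of output channels of the last Conv2d layer.

Where does the 7*7 come from? Do stride or padding play any role in it?

The input image shape is [1, 28, 28].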

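Best answer

The Conv2d layers have a kernel size of 3 with stride and padding of 1, which means they do not change the spatial size of the image. So stride and padding do matter: with these particular values the convolutions preserve (H, W). There are two MaxPool2d layers, each of which reduces the spatial dimensions from (H, W) to (H/2, W/2). So, for each batch, the output of cnn_layers, which has the 4 channels of the last convolution, has shape (batch_size, 4, H/4, W/4). In the forward pass, the feature tensor is flattened by x = x.view(x.size(0), -1). This gives it the shape (batch_size, 4 * H/4 * W/4) = (batch_size, H*W/4). I assume H and W are 28, so H/4 = W/4 = 7 and the linear layer takes an input of shape (batch_size, 4*7*7) = (batch_size, 196).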
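The arithmetic above can be checked with a small sketch in plain Python. It assumes the usual output-size rule for Conv2d and MaxPool2d with dilation 1 and floor rounding, floor((n + 2*padding - kernel_size) / stride) + 1. The helper names out_size and trace_spatial_size are made up for this illustration and are not part of PyTorch.

```python
def out_size(n, kernel_size, stride, padding=0):
    """Spatial size after one Conv2d or MaxPool2d layer.

    Assumes dilation 1 and floor rounding (ceil_mode=False for pooling).
    """
    return (n + 2 * padding - kernel_size) // stride + 1


def trace_spatial_size(n):
    """Follow one spatial dimension through the two conv + pool stages of Net."""
    sizes = [n]
    for _ in range(2):
        n = out_size(n, kernel_size=3, stride=1, padding=1)  # Conv2d keeps the size
        sizes.append(n)
        n = out_size(n, kernel_size=2, stride=2)  # MaxPool2d halves the size
        sizes.append(n)
    return sizes


sizes = trace_spatial_size(28)
# 4 output channels times the final height and width
flat_features = 4 * sizes[-1] * sizes[-1]
print(sizes, flat_features)
```

For a 28x28 input this traces 28 -> 28 -> 14 -> 14 -> 7, so the flattened size is 4*7*7 = 196. With a different input size or different stride/padding values, the same helper gives the number to pass as the first argument of Linear.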

A similar question about "python - PyTorch CNN linear layer shape after Conv2d" can be found on Stack Overflow: https://stackoverflow.com/questions/65982152/
