PyTorch softmax output does not sum to 1

Cross-posting my question from the PyTorch forum:

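I started getting negative KL divergences between a target Dirichlet distribution and my model's output Dirichlet distribution. Someone online suggested that this might mean the parameters of the Dirichlet distribution do not sum to 1. I thought that was ridiculous, since the model's output is passed through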

output = F.softmax(self.weights(x), dim=1)

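But on closer inspection, I found that torch.all(torch.sum(output, dim=1) == 1.) returns False! Looking at the offending row, I see that it is tensor([0.0085, 0.9052, 0.0863], grad_fn=<SelectBackward>), whose printed values appear to sum to 1. Yet torch.sum(output[5]) == 1. produces tensor(False).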

What am I misusing about softmax such that the output probabilities do not sum to 1?

This is PyTorch version 1.2.0+cpu. The full model is reproduced below:

import torch
import torch.nn as nn
import torch.nn.functional as F



def assert_no_nan_no_inf(x):
    assert not torch.isnan(x).any()
    assert not torch.isinf(x).any()


class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Linear(
            in_features=2,
            out_features=3)

    def forward(self, x):
        output = F.softmax(self.weights(x), dim=1)
        assert torch.all(torch.sum(output, dim=1) == 1.)
        assert_no_nan_no_inf(x)
        return output

Best answer

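This is most likely caused by floating-point rounding error: the values are stored with finite precision, so the computed sum can differ from 1 by a tiny amount.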
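The same effect can be seen without PyTorch. The following is a minimal sketch in plain Python (not taken from the original post) showing that a sum which is exactly 1 on paper may fail a strict equality test while passing a tolerance-based one. The tolerance `rel_tol=1e-9` is an illustrative choice, not a recommendation.

```python
import math

# In exact arithmetic, ten copies of 0.1 sum to 1. In binary floating point,
# 0.1 cannot be represented exactly, so the additions accumulate rounding error.
total = sum([0.1] * 10)

# Strict equality is brittle under rounding error.
exactly_one = (total == 1.0)

# A tolerance-based comparison accepts results that are "close enough".
# rel_tol=1e-9 is an assumed, illustrative tolerance.
close_to_one = math.isclose(total, 1.0, rel_tol=1e-9)

print(total, exactly_one, close_to_one)
```

Softmax outputs in float32 behave the same way, only with a coarser precision than Python's 64-bit floats.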

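Instead of testing for strict equality, you should check that the mean squared error (or a similar measure of deviation) is within an acceptable tolerance.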

For example, I get torch.norm(output.sum(dim=1)-1)/N less than 1e-8, where N is the batch size.
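A tolerance-based check along these lines might look like the sketch below. It is not the asker's model: the random `logits` tensor stands in for `self.weights(x)`, and the batch size `N = 8`, the seed, and the tolerance `atol=1e-6` are all assumptions chosen for illustration (float32 machine epsilon is about 1.2e-7, so 1e-6 allows a few units of rounding error).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only so repeated runs match

N = 8                          # hypothetical batch size
logits = torch.randn(N, 3)     # stand-in for self.weights(x)
output = F.softmax(logits, dim=1)

row_sums = output.sum(dim=1)

# The answer's metric: norm of the deviation from 1, divided by batch size.
deviation = torch.norm(row_sums - 1) / N

# Tolerance-based replacement for torch.all(row_sums == 1.).
# atol=1e-6 is an assumed tolerance, not a value from the original post.
sums_ok = torch.allclose(row_sums, torch.ones_like(row_sums), atol=1e-6)

print(deviation.item(), sums_ok)
```

In the model's forward method, the strict assertion could then be replaced by something like `assert torch.allclose(output.sum(dim=1), torch.ones(output.shape[0]), atol=1e-6)`, with the tolerance adjusted to the dtype in use.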

Source: the corresponding question on Stack Overflow: https://stackoverflow.com/questions/58615923/
