keras - How is a multi-output deep learning model trained?

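I don't think I understand multi-output networks.

Although I understand how the implementation is done, and I have successfully trained such a model, I don't understand how a multi-output deep learning network is trained. That is, what happens inside the network during training?

Take this network from the keras functional api guide as an example:

You can see the two outputs (aux_output and main_output). How does backpropagation work?

My intuition was that the model runs two backpropagations, one per output.
Each backpropagation would then update the weights of the layers that precede its exit.
But that does not appear to be the case: from here (SO), I gathered that there is only one backpropagation despite the multiple outputs; the loss that is used is weighted according to the outputs.

Still, I don't understand how the network and its auxiliary branch are trained. How are the auxiliary branch weights updated, given that the branch is not connected directly to the main output? Is the part of the network between the root of the auxiliary branch and the main output affected by the weighting of the loss? Or does the weighting influence only the part of the network that is connected to the auxiliary output?

I am also looking for good articles on this topic. I have already read the GoogLeNet/Inception articles (v1, v2-v3), since that network uses auxiliary branches.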

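Best answer

Keras computations are graph-based and use only one optimizer.

The optimizer is also part of the graph, and in its computations it gets the gradients for the whole set of weights. (Not two sets of gradients, one per output, but one set of gradients for the entire model.)

Mathematically, it is not complicated. You have one final loss function: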

loss = (main_weight * main_loss) + (aux_weight * aux_loss) #you choose the weights in model.compile

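All of it is defined by you, plus a series of other possible weights (sample weights, class weights, regularization terms, etc.).

Where:

main_loss is function_of(main_true_output_data, main_model_output)

aux_loss is function_of(aux_true_output_data, aux_model_output)

And the gradients are simply ∂(loss)/∂(weight_i) for all weights.

Once the optimizer has the gradients, it performs its optimization step once.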
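To make the "one loss, one set of gradients, one step" idea concrete, here is a minimal dependency-free sketch. It is not Keras code: the three scalar weights (`shared`, `main_head`, `aux_head`), the squared-error losses, the input values, and the plain SGD update are all assumptions chosen only so the arithmetic can be followed by hand.

```python
def forward(w, x):
    """Tiny stand-in for a two-output model: a shared trunk feeding two heads."""
    hidden = w["shared"] * x                 # shared part (used by both outputs)
    main_out = w["main_head"] * hidden       # main branch
    aux_out = w["aux_head"] * hidden         # auxiliary branch
    return main_out, aux_out, hidden


def weighted_loss(w, x, main_y, aux_y, main_weight, aux_weight):
    """loss = (main_weight * main_loss) + (aux_weight * aux_loss)"""
    main_out, aux_out, _ = forward(w, x)
    main_loss = (main_out - main_y) ** 2
    aux_loss = (aux_out - aux_y) ** 2
    return main_weight * main_loss + aux_weight * aux_loss


def gradients(w, x, main_y, aux_y, main_weight, aux_weight):
    """One set of gradients of the single weighted loss (chain rule by hand)."""
    main_out, aux_out, hidden = forward(w, x)
    d_main = main_weight * 2.0 * (main_out - main_y)   # d loss / d main_out
    d_aux = aux_weight * 2.0 * (aux_out - aux_y)       # d loss / d aux_out
    return {
        "main_head": d_main * hidden,
        "aux_head": d_aux * hidden,
        # The shared weight receives a contribution from both terms.
        "shared": (d_main * w["main_head"] + d_aux * w["aux_head"]) * x,
    }


def sgd_step(w, grads, learning_rate):
    """A single optimizer step applied to every weight at once."""
    return {name: value - learning_rate * grads[name] for name, value in w.items()}


weights = {"shared": 0.5, "main_head": 1.0, "aux_head": 1.0}
data = dict(x=2.0, main_y=3.0, aux_y=0.0, main_weight=1.0, aux_weight=0.2)

loss_before = weighted_loss(weights, **data)
grads = gradients(weights, **data)
weights_after = sgd_step(weights, grads, learning_rate=0.01)
loss_after = weighted_loss(weights_after, **data)
print(loss_before, loss_after)
```

The single call to `gradients` produces one value per weight, and the single call to `sgd_step` updates all of them together, which mirrors the description above under these simplified assumptions.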

Questions:

how are the auxiliary branch weights updated as it is not connected directly to the main output?

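You have two datasets of outputs: one for main_output and another for aux_output. You must pass both to fit, as in model.fit(inputs, [main_y, aux_y], ...)

You also have two loss functions, one for each output, where main_loss takes main_y and main_out, and aux_loss takes aux_y and aux_out.

The two losses are summed: loss = (main_weight * main_loss) + (aux_weight * aux_loss)

The gradients are computed once for the function loss, and this function is connected to the entire model.

The aux term will affect lstm_1 and embedding_1 in backpropagation.

Consequently, in the next forward pass (after the weights are updated), it will end up influencing the main branch. (Whether that is better or worse depends only on whether the aux output is useful.)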
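Which weights each loss term can reach is easy to check numerically. The sketch below is an illustration under assumptions, not Keras code: `shared` plays the role of the layers before the branch point (such as lstm_1 and embedding_1), `main_head` stands for the layers that only feed the main output, and `aux_head` stands for the auxiliary branch. Gradients are estimated with central finite differences, one loss term at a time.

```python
def losses(w, x=2.0, main_y=3.0, aux_y=0.0):
    """Return (main_loss, aux_loss) for a tiny shared-trunk, two-head model."""
    hidden = w["shared"] * x
    main_loss = (w["main_head"] * hidden - main_y) ** 2
    aux_loss = (w["aux_head"] * hidden - aux_y) ** 2
    return main_loss, aux_loss


def finite_difference(w, name, term, eps=1e-6):
    """Central-difference estimate of d(loss term)/d(w[name]).

    term 0 selects main_loss, term 1 selects aux_loss.
    """
    up = dict(w)
    up[name] += eps
    down = dict(w)
    down[name] -= eps
    return (losses(up)[term] - losses(down)[term]) / (2 * eps)


weights = {"shared": 0.5, "main_head": 1.0, "aux_head": 1.0}

main_term_grads = {name: finite_difference(weights, name, term=0) for name in weights}
aux_term_grads = {name: finite_difference(weights, name, term=1) for name in weights}

print("main term:", main_term_grads)
print("aux term :", aux_term_grads)
```

In this toy setup the aux term yields a non-zero gradient for `shared` and `aux_head` but exactly zero for `main_head`, because changing `main_head` does not change aux_loss at all. The main branch is affected only indirectly, through the updated shared weights on the next forward pass.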

Is the part of the network which is between the root of the auxiliary branch and the main output concerned by the weighting of the loss? Or the weighting influences only the part of the network that is connected to the auxiliary output?

The weights are plain mathematics. You define them in compile:

model.compile(optimizer=one_optimizer,
              # you choose each loss
              loss={'main_output': main_loss, 'aux_output': aux_loss},
              # you choose each weight
              loss_weights={'main_output': main_weight, 'aux_output': aux_weight},
              metrics = ...)

The loss function will then use them in loss = (weight1 * loss1) + (weight2 * loss2).
The rest is the mathematical computation of ∂(loss)/∂(weight_i) for each weight.
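Written out, and assuming main_weight and aux_weight are constants (as they are when set through loss_weights), the derivative of the sum splits into two scaled terms. The symbol w_i below stands for any single trainable weight; the three cases are an interpretation of the formula for the network discussed here.

```latex
\frac{\partial\,\mathrm{loss}}{\partial w_i}
  = \mathrm{main\_weight}\cdot\frac{\partial\,\mathrm{main\_loss}}{\partial w_i}
  + \mathrm{aux\_weight}\cdot\frac{\partial\,\mathrm{aux\_loss}}{\partial w_i}

% w_i in the auxiliary branch only:
%   \partial\,\mathrm{main\_loss} / \partial w_i = 0, so only aux_weight scales its gradient.
% w_i between the root of the auxiliary branch and the main output:
%   \partial\,\mathrm{aux\_loss} / \partial w_i = 0, so only main_weight scales its gradient.
% w_i before the branch point (shared layers):
%   both terms are generally non-zero, so both weights matter.
```

Read this way, aux_weight directly scales the gradients of the auxiliary branch and of the shared layers, while the layers between the branch root and the main output are scaled by main_weight alone.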

Regarding "keras - How is a multi-output deep learning model trained?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57149476/
