I have a simple CNN model trained on ImageNet. I use keras.utils.multi_gpu_model for multi-GPU training. It works fine, but I run into a problem when I try to train an SSD model based on the same backbone. It has a custom loss and several custom layers on top of the backbone:
model, predictor_sizes, input_encoder = build_model(
    input_shape=(args.img_height, args.img_width, 3),
    n_classes=num_classes, mode='training')

optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
loss = SSDMultiBoxLoss(neg_pos_ratio=3, alpha=1.0)

if args.num_gpus > 1:
    model = multi_gpu_model(model, gpus=args.num_gpus)

model.compile(optimizer=optimizer, loss=loss.compute_loss)
model.summary()
With num_gpus == 1, I get the following summary:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 512, 512, 3) 0
__________________________________________________________________________________________________
conv1_pad (Lambda) (None, 516, 516, 3) 0 input_1[0][0]
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 256, 256, 16) 1216 conv1_pad[0][0]
__________________________________________________________________________________________________
conv1_bn (BatchNormalization) (None, 256, 256, 16) 64 conv1[0][0]
__________________________________________________________________________________________________
conv1_relu (Activation) (None, 256, 256, 16) 0 conv1_bn[0][0]
__________________________________________________________________________________________________
....
det_ctx6_2_mbox_loc_reshape[0][0]
__________________________________________________________________________________________________
mbox_priorbox (Concatenate) (None, None, 8) 0 det_ctx1_2_mbox_priorbox_reshape[
det_ctx2_2_mbox_priorbox_reshape[
det_ctx3_2_mbox_priorbox_reshape[
det_ctx4_2_mbox_priorbox_reshape[
det_ctx5_2_mbox_priorbox_reshape[
det_ctx6_2_mbox_priorbox_reshape[
__________________________________________________________________________________________________
mbox (Concatenate) (None, None, 33) 0 mbox_conf_softmax[0][0]
mbox_loc[0][0]
mbox_priorbox[0][0]
==================================================================================================
Total params: 1,890,510
Trainable params: 1,888,366
Non-trainable params: 2,144
However, in the multi-GPU case I can see that all the intermediate layers are packed inside a single model layer:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 512, 512, 3) 0
__________________________________________________________________________________________________
lambda (Lambda) (None, 512, 512, 3) 0 input_1[0][0]
__________________________________________________________________________________________________
lambda_1 (Lambda) (None, 512, 512, 3) 0 input_1[0][0]
__________________________________________________________________________________________________
model (Model) (None, None, 33) 1890510 lambda[0][0]
lambda_1[0][0]
__________________________________________________________________________________________________
mbox (Concatenate) (None, None, 33) 0 model[1][0]
model[2][0]
==================================================================================================
Total params: 1,890,510
Trainable params: 1,888,366
Non-trainable params: 2,144
Training runs fine, but I cannot load my previously pretrained weights with:
model.load_weights(args.weights, by_name=True)
because of the error:
ValueError: Layer #3 (named "model") expects 150 weight(s), but the saved weights have 68 element(s).
Of course, the pretrained model only has the weights of the backbone, not of the rest of the detection model.
Can anyone help me understand:
- Why are all the intermediate layers packed into a single model layer (fed through Lambda layers)?
- Why doesn't this happen with the classification model?
- How can I work around this "model packing", or load pretrained weights into such a model?
Note: I am using tf.keras, which is now part of TensorFlow.
Best Answer
You can load the weights right after building the model, before converting it to its multi-GPU counterpart. Alternatively, you can keep two objects, a single-GPU and a multi-GPU version of the model: load the weights through the first and train with the second.
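A minimal sketch of that ordering, using a toy stand-in for the question's build_model (the layer names, weight-file path, and num_gpus flag here are placeholders, not the asker's actual code):

```python
import tensorflow as tf
from tensorflow import keras


def build_model(input_shape=(512, 512, 3)):
    # Toy stand-in for the SSD builder in the question.
    inp = keras.layers.Input(shape=input_shape)
    x = keras.layers.Conv2D(16, 3, strides=2, padding='same', name='conv1')(inp)
    out = keras.layers.GlobalAveragePooling2D()(x)
    return keras.Model(inp, out)


num_gpus = 1  # set > 1 on a multi-GPU machine

# 1) Build the single-GPU template and load pretrained weights into it
#    while its layers are still top-level, so by_name matching works:
template = build_model()
# template.load_weights('backbone_weights.h5', by_name=True)

# 2) Only then create the multi-GPU replica for training. The replica
#    shares its variables with `template`, so the loaded weights carry over.
if num_gpus > 1:
    train_model = keras.utils.multi_gpu_model(template, gpus=num_gpus)
else:
    train_model = template

train_model.compile(optimizer='adam', loss='mse')
```

Because the two objects share weights, you can keep using `template` for load_weights/save_weights (where layer names are flat) and `train_model` only for fit.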
Regarding "python - Keras multi-GPU model fails for custom model", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54238303/