python - Transfer learning question with TensorFlow and Keras

Tags: python tensorflow keras conv-neural-network vgg-net

I have been trying to recreate the work done in this blog post. The write-up is very comprehensive, and the code is shared via Colab.

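What I want to do is extract layers from the pretrained VGG19 network and create a new network that has those layers as its outputs. However, when I assemble the new network, it closely resembles the VGG19 network and appears to contain layers that I did not extract. An example is below.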

import tensorflow as tf
from tensorflow.python.keras import models

## Create network based on VGG19 arch with pretrained weights
vgg = tf.keras.applications.vgg19.VGG19(include_top=False, weights='imagenet')
vgg.trainable = False

When we look at the summary of VGG19, we see the architecture we expect.

vgg.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv4 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv4 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0         
=================================================================
Total params: 20,024,384
Trainable params: 0
Non-trainable params: 20,024,384
_________________________________________________________________

Then we extract the layers and create a new model.

## Layers to extract
content_layers = ['block5_conv2'] 
style_layers = ['block1_conv1','block2_conv1','block3_conv1','block4_conv1','block5_conv1']
## Get output layers corresponding to style and content layers 
style_outputs = [vgg.get_layer(name).output for name in style_layers]
content_outputs = [vgg.get_layer(name).output for name in content_layers]
model_outputs = style_outputs + content_outputs

new_model = models.Model(vgg.input, model_outputs)

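Once new_model is created, I believe we should have a much smaller model. However, the model summary shows that the new model is very close to the original (it contains 19 of the 22 layers from VGG19), and it contains layers that we did not extract.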

new_model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv4 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
=================================================================
Total params: 15,304,768
Trainable params: 15,304,768
Non-trainable params: 0
_________________________________________________________________

So my questions are...

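  1. Why are layers that I didn't extract showing up in new_model? Are these being inferred by the model's instantiation process, per the docs? This seems too close to the VGG19 architecture to be inferred.
  2. From my understanding of the Keras Model (functional API), passing multiple output layers should create a model with multiple outputs. However, it seems that the new model is sequential and only has a single output layer. Is this the case?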

Best answer

  1. Why are layers that I didn't extract showing up in new_model.

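That's because when you create the model with models.Model(vgg.input, model_outputs), the "intermediate" layers between vgg.input and the output layers are included as well. This is the intended behavior, given the way VGG is built.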

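For example, if you were to create a model this way: models.Model(vgg.input, vgg.get_layer('block2_pool').output), every intermediate layer between input_1 and block2_pool would be included, since the input has to flow through them before reaching block2_pool. Below is a partial graph of VGG that may help with that.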

[Image: partial graph of the VGG architecture]
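To illustrate why the layers between the input and the requested outputs come along, here is a minimal pure-Python sketch. It is not how Keras is implemented internally. The `parents` dictionary and the `layers_needed` helper are hypothetical names made up for this illustration, and only the layer names are borrowed from the summary in the question. The idea is simply that every layer an output depends on has to be part of the model.

```python
# Toy stand-in for the first blocks of VGG: each layer maps to the layers feeding it.
# The dictionary is an illustrative assumption, not something read from Keras.
parents = {
    "input": [],
    "block1_conv1": ["input"],
    "block1_conv2": ["block1_conv1"],
    "block1_pool": ["block1_conv2"],
    "block2_conv1": ["block1_pool"],
    "block2_conv2": ["block2_conv1"],
    "block2_pool": ["block2_conv2"],
    "block3_conv1": ["block2_pool"],
}


def layers_needed(outputs, parents):
    """Return every layer reached by walking backwards from the outputs."""
    needed, stack = set(), list(outputs)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(parents[name])
    return needed


# Ask for two outputs only; the layers in between are pulled in automatically.
needed = layers_needed(["block1_conv1", "block2_conv1"], parents)
print(sorted(needed))
```

In this toy graph, block1_conv2 and block1_pool end up in the result even though they were never requested, because block2_conv1 cannot be computed without them. Layers after the last requested output, such as block2_conv2, are left out, which matches the shape of the new_model summary in the question.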

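Now, if I haven't misunderstood, if you want to create a model that doesn't include those intermediate layers (which would probably work poorly), you have to build one yourself. The functional API is very useful for this. There are examples in the documentation, but the gist of what you want to do is as follows: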

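from tensorflow.keras import models
from tensorflow.keras.layers import Conv2D, Input

# Every branch reads directly from the input, so no intermediate layers are shared.
# Note: these are new, randomly initialized layers, not the pretrained VGG19 ones.
x_input = Input(shape=(28, 28, 1))
block1_conv1 = Conv2D(64, (3, 3), padding='same')(x_input)
block2_conv2 = Conv2D(128, (3, 3), padding='same')(x_input)
# ... define any further branches the same way

# Passing a list of tensors creates a model with multiple outputs
new_model = models.Model(x_input, [block1_conv1, block2_conv2])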

  2. ... however, it seems that the new model is sequential and only has a single output layer. Is this the case?

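No, your model has multiple outputs, as you intended. model.summary() should show which layers are connected to what (which would help in understanding the structure), but I believe there is a small bug in some versions that prevents this. In any case, you can see that your model has multiple outputs by checking new_model.output, which should give you: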

[<tf.Tensor 'block1_conv1/Relu:0' shape=(?, ?, ?, 64) dtype=float32>,
 <tf.Tensor 'block2_conv1/Relu:0' shape=(?, ?, ?, 128) dtype=float32>,
 <tf.Tensor 'block3_conv1/Relu:0' shape=(?, ?, ?, 256) dtype=float32>,
 <tf.Tensor 'block4_conv1/Relu:0' shape=(?, ?, ?, 512) dtype=float32>,
 <tf.Tensor 'block5_conv1/Relu:0' shape=(?, ?, ?, 512) dtype=float32>,
 <tf.Tensor 'block5_conv2/Relu:0' shape=(?, ?, ?, 512) dtype=float32>]

Printing the layers sequentially in new_model.summary() is just a design choice, since the output would get cumbersome for complex models.
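A quick way to confirm the multiple outputs is to run one forward pass and count what comes back. The following is a minimal sketch, assuming TensorFlow 2.x. It uses weights=None instead of the question's weights='imagenet' purely to avoid the weight download, and the 64x64 dummy image is an arbitrary choice.

```python
import numpy as np
import tensorflow as tf

# weights=None skips the ImageNet download (assumption for this sketch only);
# the question itself uses weights='imagenet'.
vgg = tf.keras.applications.VGG19(include_top=False, weights=None)

# Style layers followed by the content layer, as in the question
layer_names = ['block1_conv1', 'block2_conv1', 'block3_conv1',
               'block4_conv1', 'block5_conv1', 'block5_conv2']
outputs = [vgg.get_layer(name).output for name in layer_names]
new_model = tf.keras.Model(vgg.input, outputs)

# One forward pass on a dummy image returns one tensor per requested layer
dummy = np.zeros((1, 64, 64, 3), dtype="float32")
features = new_model(dummy)

print(len(new_model.outputs))
for name, feat in zip(layer_names, features):
    print(name, tuple(feat.shape))
```

Each entry in features corresponds to one requested layer, in the same order as layer_names, even though summary() lists the layers as a single column.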

Regarding "python - Transfer learning question with TensorFlow and Keras", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52619166/
