python - Tensorflow 2 Keras nested model subclassing - total parameters are zero

Tags: python tensorflow keras

I am trying to implement a simple model with model subclassing, inspired by the VGG network.

Here is the code:

import tensorflow as tf

class ConvMax(tf.keras.Model):
    def __init__(self, filters=4, kernel_size=3, pool_size=2, activation='relu'):
        super(ConvMax, self).__init__()

        self.conv = tf.keras.layers.Conv2D(filters, kernel_size, padding='same', activation=activation)
        self.maxpool = tf.keras.layers.MaxPool2D((pool_size, pool_size))

    def call(self, input_tensor):
        x = self.conv(input_tensor)
        x = self.maxpool(x)
        return x

class RepeatedConvMax(tf.keras.Model):
    def __init__(self, repetitions=4, filters=4, kernel_size=3, pool_size=2, activation='relu', **kwargs):
        super(RepeatedConvMax, self).__init__(**kwargs)
    
        self.repetitions = repetitions
        self.filters = filters
        self.kernel_size = kernel_size
        self.pool_size = pool_size
        self.activation = activation
    
        # Define a repeated ConvMax
        for i in range(self.repetitions):
            # Define a ConvMax layer, specifying filters, kernel_size, pool_size.
            vars(self)[f'convMax_{i}'] = ConvMax(self.filters, self.kernel_size, self.pool_size, self.activation)

    def call(self, input_tensor):
        # Connect the first layer
        x = vars(self)['convMax_0'](input_tensor)
   
        # Connect the existing layers
        for i in range(1, self.repetitions):
            x = vars(self)[f'convMax_{i}'](x)
    
        # return the last layer
        return x

But when I build the network to look at its summary, this is what I get:

model_input = tf.keras.layers.Input(shape=(64,64,3,), name="input_layer")
x = RepeatedConvMax()(model_input)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_layer (InputLayer)     [(None, 64, 64, 3)]       0         
_________________________________________________________________
repeated_conv_max (RepeatedC (None, 4, 4, 4)           0         
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

The total number of parameters is zero.

However, when I try:

model_input = tf.keras.layers.Input(shape=(64,64,3,), name="input_layer")
x = ConvMax()(model_input)
x = ConvMax()(x)
x = ConvMax()(x)
x = ConvMax()(x)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_layer (InputLayer)     [(None, 64, 64, 3)]       0         
_________________________________________________________________
conv_max (ConvMax)           (None, 32, 32, 4)         112       
_________________________________________________________________
conv_max_1 (ConvMax)         (None, 16, 16, 4)         148       
_________________________________________________________________
conv_max_2 (ConvMax)         (None, 8, 8, 4)           148       
_________________________________________________________________
conv_max_3 (ConvMax)         (None, 4, 4, 4)           148       
=================================================================
Total params: 556
Trainable params: 556
Non-trainable params: 0
_________________________________________________________________

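it shows the correct total number of parameters.

Do you know where the problem is? Why is the parameter count 0 for the second-level subclass? Will it affect training?

Thanks...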

Best answer

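The problem is not with Keras itself, but with the way you create the layers in RepeatedConvMax.

TL;DR: do not use vars to create and retrieve attributes dynamically; use setattr and getattr instead.

To fix this, simply replace the vars(self)[...] accesses with setattr and getattr. As far as I can tell, vars(self) returns the object's own __dict__ (not a copy), so writing to it does store the sublayer, but it bypasses the object's __setattr__ method. Keras relies on attribute assignment to register sublayers and their weights with the model. A layer added through vars(self) is therefore still executed in call (which is why the output shape is correct), but its weights are never tracked, and the summary reports zero parameters.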
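The mechanism can be illustrated without TensorFlow. The sketch below uses a toy class, TrackingBase, as a hypothetical stand-in for a Keras layer; it is not the Keras implementation and only mimics the idea of recording children inside __setattr__.

```python
class TrackingBase:
    """Toy stand-in for a Keras layer (hypothetical, not the real Keras code)."""

    def __init__(self):
        # Create the registry without triggering our own __setattr__.
        object.__setattr__(self, "tracked", [])

    def __setattr__(self, name, value):
        # Assumed tracking hook: remember every TrackingBase child by name.
        if isinstance(value, TrackingBase):
            self.tracked.append(name)
        object.__setattr__(self, name, value)


class ViaVars(TrackingBase):
    def __init__(self):
        super().__init__()
        # Writes straight into __dict__, so __setattr__ never runs.
        vars(self)["child_0"] = TrackingBase()


class ViaSetattr(TrackingBase):
    def __init__(self):
        super().__init__()
        # Goes through __setattr__, so the child is recorded.
        setattr(self, "child_0", TrackingBase())


via_vars = ViaVars()
via_setattr = ViaSetattr()

# vars() hands back the instance's real __dict__, not a copy.
print(vars(via_vars) is via_vars.__dict__)
# The attribute exists in both cases, but only one child was registered.
print(hasattr(via_vars, "child_0"), via_vars.tracked)
print(hasattr(via_setattr, "child_0"), via_setattr.tracked)
```

In both objects the child attribute exists and can be called, but only the one assigned through setattr ends up in the registry, which mirrors the zero-parameter summary in the question.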

If you define your class like this, everything works as expected:

class RepeatedConvMax(tf.keras.Model):
    def __init__(self, repetitions=4, filters=4, kernel_size=3, pool_size=2, activation='relu', **kwargs):
        super(RepeatedConvMax, self).__init__(**kwargs)

        self.repetitions = repetitions
        self.filters = filters
        self.kernel_size = kernel_size
        self.pool_size = pool_size
        self.activation = activation

        # Define a repeated ConvMax
        for i in range(self.repetitions):
            # Define a ConvMax layer, specifying filters, kernel_size, pool_size.
            setattr(self, f'convMax_{i}', ConvMax(self.filters, self.kernel_size, self.pool_size, self.activation))

    def call(self, input_tensor, training=None, mask=None):
        # Connect the first layer
        x = getattr(self, 'convMax_0')(input_tensor)

        # Connect the existing layers
        for i in range(1, self.repetitions):
            print(f"Layer {i}")
            x = getattr(self, f'convMax_{i}')(x)
            print(x)

        # return the last layer
        return x
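
As a possible alternative to setattr and getattr, the sublayers can be kept in a plain Python list assigned as a normal attribute, since Keras also tracks layers stored in list attributes. The sketch below is not part of the original answer; the class name RepeatedConvMaxList and the attribute name blocks are made up for illustration, and ConvMax is repeated from the question so the snippet is self-contained.

```python
import tensorflow as tf


class ConvMax(tf.keras.Model):
    """Conv2D followed by max pooling, as defined in the question."""

    def __init__(self, filters=4, kernel_size=3, pool_size=2, activation='relu'):
        super().__init__()
        self.conv = tf.keras.layers.Conv2D(
            filters, kernel_size, padding='same', activation=activation)
        self.maxpool = tf.keras.layers.MaxPool2D((pool_size, pool_size))

    def call(self, input_tensor):
        return self.maxpool(self.conv(input_tensor))


class RepeatedConvMaxList(tf.keras.Model):
    """Hypothetical variant that stores its ConvMax blocks in a list."""

    def __init__(self, repetitions=4, filters=4, kernel_size=3,
                 pool_size=2, activation='relu', **kwargs):
        super().__init__(**kwargs)
        # Regular attribute assignment, so Keras can track the sublayers.
        self.blocks = [
            ConvMax(filters, kernel_size, pool_size, activation)
            for _ in range(repetitions)
        ]

    def call(self, input_tensor, training=None, mask=None):
        x = input_tensor
        for block in self.blocks:
            x = block(x)
        return x


model_input = tf.keras.layers.Input(shape=(64, 64, 3), name="input_layer")
x = RepeatedConvMaxList()(model_input)
model = tf.keras.Model(inputs=model_input, outputs=x)

# Count the parameters that the model actually tracks.
total_params = model.count_params()
num_trainable = len(model.trainable_weights)
print(total_params, num_trainable)
```

If tracking works as expected, the count should match the 556 parameters reported for the four stacked ConvMax layers in the question. Because the optimizer only updates weights that the model tracks, untracked sublayers would most likely not be trained, so the zero count is probably more than a cosmetic issue in the summary.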

Regarding "python - Tensorflow 2 Keras nested model subclassing - total parameters are zero", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66491624/
