python - When is the build method called when creating a custom layer in Keras?

Sorry, I am new to deep learning and Keras. I am trying to define a layer myself.

I looked at the Keras documentation, https://keras.io/api/layers/base_layer/#layer-class

import tensorflow as tf
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):

    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):  # Create the state of the layer (weights)
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'),
            trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):  # Defines the computation from inputs to outputs
        return tf.matmul(inputs, self.w) + self.b

# Instantiates the layer.
linear_layer = SimpleDense(4)

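As far as I understand, the __init__ method is called when I create linear_layer, and the call method is called when I pass inputs to linear_layer. But I don't know when the build method is called, and more specifically, how the input_shape in the build method is specified. What is input_shape here? Since I don't know when build is called, I don't know what is passed in as the input_shape argument.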

Also, I want to specify a parameter with a fixed size, in my case (1, 768). In that case, should I still use input_shape in the build method?

Best answer

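To understand this SimpleDense layer and answer your questions, we first need to explain the weights and the bias. In SimpleDense the weights are initialized with random numbers and the bias with zeros; during training, both are updated to minimize the loss.

Answer to the first question: the build method is called only once, the first time the layer is used. Keras passes the shape of that first input as input_shape, and the weights and bias are then set to random values and zeros. The call method, by contrast, is invoked for every training batch.

Answer to the second question: yes. In the call method we have access to one batch of data, and the first dimension is the batch size. I wrote an example that prints a message whenever build and call are invoked, together with the input and output shapes, to clarify the explanation above.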
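On the fixed-size parameter from the question, here is a minimal sketch. It assumes the (1, 768) parameter does not depend on the input shape, so it can be created in __init__ without consulting input_shape at all. The class name FixedSizeParam and the choice to add the parameter to the input are hypothetical, chosen only for illustration, and are not from the original post.

```python
import tensorflow as tf

class FixedSizeParam(tf.keras.layers.Layer):
    """Hypothetical layer holding one trainable parameter of fixed shape (1, 768)."""

    def __init__(self):
        super().__init__()
        # The shape is known up front, so input_shape is not needed here.
        self.p = self.add_weight(
            shape=(1, 768), initializer="zeros", trainable=True)

    def call(self, inputs):
        # Assumption for illustration: the parameter is broadcast-added
        # to inputs whose last dimension is 768.
        return inputs + self.p

fixed_layer = FixedSizeParam()
fixed_out = fixed_layer(tf.zeros((3, 768)))
print(tuple(fixed_layer.p.shape), tuple(fixed_out.shape))
```

If the parameter's shape did depend on the input (for example on input_shape[-1]), creating it inside build would be the usual choice instead.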

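In the example below:

I use batch_size = 5 and 25 samples, so each invocation of the call method sees 5 samples.
The layer is created and built once, so build is called once. One epoch of 25 samples at a batch size of 5 gives 5 steps, so call is invoked 5 times.
units = 4 and the data shape is (25, 2) [samples, features], so total params = 12 <-> 2*4 (features * units, the weights) + 4 (the biases).
The output shape is (5, 4) because matmul of a (5, 2) input batch with the (2, 4) weight matrix yields a (5, 4) result; the computation is inputs * weights + bias.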
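The shape and parameter arithmetic above can be checked independently of Keras. The following is a small NumPy sketch using the same sizes as the example (batch of 5, 2 features, 4 units); the variable names are illustrative only.

```python
import numpy as np

batch_size, features, units = 5, 2, 4

inputs = np.ones((batch_size, features), dtype="float32")      # (5, 2)
w = np.random.normal(size=(features, units)).astype("float32")  # (2, 4)
b = np.zeros((units,), dtype="float32")                         # (4,)

# (5, 2) @ (2, 4) -> (5, 4); the bias of shape (4,) broadcasts over the batch.
out = inputs @ w + b

# 2 * 4 weights plus 4 biases.
total_params = w.size + b.size

print(out.shape, total_params)
```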

import tensorflow as tf

class SimpleDense(tf.keras.layers.Layer):
    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):  # Create the state of the layer (weights)
        tf.print('calling build method')
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'),
            trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):  # Defines the computation from inputs to outputs
        tf.print('\ncalling call method')
        tf.print(f'input shape : {inputs.shape}')
        out = tf.matmul(inputs, self.w) + self.b
        tf.print(f'output shape : {out.shape}')
        return out

model = tf.keras.Sequential()
model.add(SimpleDense(units=4))
model.compile(optimizer='adam', loss='mse')
model.fit(tf.random.uniform((25, 2)), tf.ones((25, 1)), batch_size=5)
model.summary()

Output:

calling build method

calling call method
input shape : (5, 2)
output shape : (5, 4)
1/5 [=====>........................] - ETA: 1s - loss: 0.9794
calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)

calling call method
input shape : (5, 2)
output shape : (5, 4)
5/5 [==============================] - 0s 15ms/step - loss: 0.9770
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 simple_dense (SimpleDense)  (5, 4)                    12        
                                                                 
=================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
_________________________________________________________________
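The log above shows build running once and call running once per batch. As a more direct check that does not involve model.fit, here is a minimal sketch that calls a layer by hand and records what build receives. The class name ShapeLoggingDense and the module-level build_log list are hypothetical helpers for illustration; the weights are created with add_weight rather than tf.Variable, which is a substitution on my part and not the answer's code.

```python
import tensorflow as tf

build_log = []  # hypothetical helper: records every input_shape passed to build

class ShapeLoggingDense(tf.keras.layers.Layer):
    def __init__(self, units=4):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Keras invokes this automatically on the first call of the layer,
        # passing the shape of that first input.
        build_log.append(tuple(input_shape))
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal", trainable=True)
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

layer = ShapeLoggingDense(units=4)
log_before_first_call = list(build_log)  # nothing has been built yet

first_out = layer(tf.ones((3, 2)))   # first use: build runs, then call
second_out = layer(tf.ones((7, 2)))  # later use: only call runs

print(log_before_first_call, build_log)
print(tuple(first_out.shape), tuple(second_out.shape))
```

Because the weights depend only on the last dimension of the input, the second call with a different batch size reuses the same weights without rebuilding.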

A similar question to "python - When is the build method called when creating a custom layer in Keras?" can be found on Stack Overflow: https://stackoverflow.com/questions/72669152/
