python - Tying autoencoder weights in a dense Keras layer

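I am trying to create a custom dense layer in Keras that ties the weights in an autoencoder. I tried following an example here that does this for convolutional layers, but some of the steps do not seem to apply to dense layers (and the code is more than two years old).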

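By tying the weights, I want the decoding layer to use the transposed weight matrix of the encoding layer. This approach is also taken in this article (page 5). Here is the relevant quote from the article: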

Here, we choose both the encoding and decoding activation function to be sigmoid function and only consider the tied weights case, in which W ′ = W^T (where W^T is the transpose of W ) as most existing deep learning methods do.

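In the quote above, W is the weight matrix of the encoding layer and W' (equal to the transpose of W) is the weight matrix of the decoding layer.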
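To make the tied-weights case concrete, here is a minimal NumPy sketch of one forward pass in which the decoder reuses W^T. The 4 -> 2 -> 4 sizes match the model shown later in this question; the random inputs, the zero biases, and the names b_enc and b_dec are assumptions made only for illustration (the quote does not say how biases are handled).

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


rng = np.random.default_rng(0)

W = rng.normal(size=(4, 2))   # encoder weights: 4 inputs -> 2 hidden units
b_enc = np.zeros(2)           # assumed: separate encoder bias
b_dec = np.zeros(4)           # assumed: separate decoder bias

x = rng.normal(size=(3, 4))   # a batch of 3 made-up samples

h = sigmoid(x @ W + b_enc)        # encoding uses W
x_hat = sigmoid(h @ W.T + b_dec)  # decoding uses W' = W^T, no second matrix

print(h.shape, x_hat.shape)
```

Only W and the two bias vectors are parameters here; the decoder has no weight matrix of its own.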

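I did not change much in the dense layer. I added a tied_to parameter to the constructor, which lets you pass the layer you want to tie to. The only other change is in the build function; the snippet is below: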

def build(self, input_shape):
    assert len(input_shape) >= 2
    input_dim = input_shape[-1]

    if self.tied_to is not None:
        self.kernel = K.transpose(self.tied_to.kernel)
        self._non_trainable_weights.append(self.kernel)
    else:
        self.kernel = self.add_weight(shape=(input_dim, self.units),
                                      initializer=self.kernel_initializer,
                                      name='kernel',
                                      regularizer=self.kernel_regularizer,
                                      constraint=self.kernel_constraint)
    if self.use_bias:
        self.bias = self.add_weight(shape=(self.units,),
                                    initializer=self.bias_initializer,
                                    name='bias',
                                    regularizer=self.bias_regularizer,
                                    constraint=self.bias_constraint)
    else:
        self.bias = None
    self.input_spec = InputSpec(min_ndim=2, axes={-1: input_dim})
    self.built = True

Below is the __init__ method. The only change here is the added tied_to parameter.

def __init__(self, units,
             activation=None,
             use_bias=True,
             kernel_initializer='glorot_uniform',
             bias_initializer='zeros',
             kernel_regularizer=None,
             bias_regularizer=None,
             activity_regularizer=None,
             kernel_constraint=None,
             bias_constraint=None,
             tied_to=None,
             **kwargs):
    if 'input_shape' not in kwargs and 'input_dim' in kwargs:
        kwargs['input_shape'] = (kwargs.pop('input_dim'),)
    super(Dense, self).__init__(**kwargs)
    self.units = units
    self.activation = activations.get(activation)
    self.use_bias = use_bias
    self.kernel_initializer = initializers.get(kernel_initializer)
    self.bias_initializer = initializers.get(bias_initializer)
    self.kernel_regularizer = regularizers.get(kernel_regularizer)
    self.bias_regularizer = regularizers.get(bias_regularizer)
    self.activity_regularizer = regularizers.get(activity_regularizer)
    self.kernel_constraint = constraints.get(kernel_constraint)
    self.bias_constraint = constraints.get(bias_constraint)
    self.input_spec = InputSpec(min_ndim=2)
    self.supports_masking = True
    self.tied_to = tied_to

The call function is unedited, but it is included below for reference.

def call(self, inputs):
    output = K.dot(inputs, self.kernel)
    if self.use_bias:
        output = K.bias_add(output, self.bias, data_format='channels_last')
    if self.activation is not None:
        output = self.activation(output)
    return output

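In the build function shown earlier, I added a condition that checks whether the tied_to parameter is set and, if so, sets the layer's kernel to the transpose of the tied_to layer's kernel.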

Below is the code used to instantiate the model. It uses Keras's Sequential API, and DenseTied is my custom layer.

# encoder
#
encoded1 = Dense(2, activation="sigmoid")

decoded1 = DenseTied(4, activation="sigmoid", tied_to=encoded1)

# autoencoder
#
autoencoder = Sequential()
autoencoder.add(encoded1)
autoencoder.add(decoded1)

After training the model, here are the model summary and the weights.

autoencoder.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_7 (Dense)              (None, 2)                 10        
_________________________________________________________________
dense_tied_7 (DenseTied)     (None, 4)                 12        
=================================================================
Total params: 22
Trainable params: 14
Non-trainable params: 8
________________________________________________________________

autoencoder.layers[0].get_weights()[0]
array([[-2.122982  ,  0.43029135],
       [-2.1772149 ,  0.16689162],
       [-1.0465667 ,  0.9828905 ],
       [-0.6830663 ,  0.0512633 ]], dtype=float32)


autoencoder.layers[-1].get_weights()[1]
array([[-0.6521988 , -0.7131109 ,  0.14814234,  0.26533198],
       [ 0.04387903, -0.22077179,  0.517225  , -0.21583867]],
      dtype=float32)

As you can see, the weights reported by autoencoder.get_weights() do not appear to be tied.
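The same observation can be checked numerically rather than by eye. This sketch copies the two arrays printed above into NumPy and tests whether the decoder kernel equals the transpose of the encoder kernel; the variable names are illustrative only.

```python
import numpy as np

# Kernel of the encoding layer, as printed above (shape 4 x 2).
encoder_kernel = np.array([[-2.122982, 0.43029135],
                           [-2.1772149, 0.16689162],
                           [-1.0465667, 0.9828905],
                           [-0.6830663, 0.0512633]], dtype=np.float32)

# Kernel reported for the tied decoding layer, as printed above (shape 2 x 4).
decoder_kernel = np.array([[-0.6521988, -0.7131109, 0.14814234, 0.26533198],
                           [0.04387903, -0.22077179, 0.517225, -0.21583867]],
                          dtype=np.float32)

# If the weights were tied, the decoder kernel would equal the encoder kernel transposed.
tied = np.allclose(encoder_kernel.T, decoder_kernel)
print(tied)  # False
```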

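So, having shown my approach, my question is: is this a valid way to tie weights in a Dense Keras layer? I was able to run the code, and it is currently training. The loss also appears to be decreasing reasonably. My concern is that this only sets the weights equal when the model is built and does not actually tie them. I was hoping the backend transpose function would tie them by reference under the hood, but I am sure I am missing something.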

Best answer

Thanks to Mikhail Berlinkov. One important remark: this code runs under Keras, but it does not work in eager mode in TF 2.0. It runs, but it trains badly.

The critical point is how the object stores the transposed weight: self.kernel = K.transpose(self.tied_to.kernel)

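In non-eager mode this builds the graph correctly. In eager mode it fails, probably because the value of the transposed variable is stored at build time (that is, on the first call) and then reused in subsequent calls.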
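The snapshot behaviour described here can be reproduced in isolation. This is a small sketch assuming TensorFlow 2.x with eager execution enabled (the default); the variable names are made up and it does not use the question's layer.

```python
import tensorflow as tf

v = tf.Variable([[1.0, 2.0],
                 [3.0, 4.0]])

# In eager mode the transpose is computed immediately; the result is a plain
# tensor holding the values v had at this moment, not a live view of v.
snapshot = tf.transpose(v)

# Simulate a training step changing the variable.
v.assign(tf.zeros((2, 2)))

print(snapshot.numpy())         # still the transpose of the old values
print(tf.transpose(v).numpy())  # recomputed, reflects the new values
```

A kernel stored this way in build would keep using the encoder's initial weights, which fits the "runs, but trains badly" symptom.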

The solution, however, is to store the variable unaltered at build time and move the transpose operation into the call method.
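A minimal sketch of that fix, assuming TensorFlow 2.x and tf.keras. It is a simplified stand-in for the question's DenseTied, not the author's exact code: it subclasses Layer directly, drops the regularizer and constraint arguments, always uses a bias, and assumes the encoder layer is built before the decoder is first called.

```python
import numpy as np
import tensorflow as tf


class DenseTied(tf.keras.layers.Layer):
    """Dense layer whose kernel is the transpose of another layer's kernel."""

    def __init__(self, tied_to, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.tied_to = tied_to  # keep a reference to the encoder layer, unaltered
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # Output width equals the number of rows of the encoder kernel.
        units = int(self.tied_to.kernel.shape[0])
        # Only the bias is a new weight; the kernel is borrowed from tied_to.
        self.bias = self.add_weight(name="bias", shape=(units,),
                                    initializer="zeros")

    def call(self, inputs):
        # Transpose on every call, so the encoder's current weights are used.
        output = tf.matmul(inputs, self.tied_to.kernel, transpose_b=True)
        return self.activation(output + self.bias)


encoder = tf.keras.layers.Dense(2, activation="sigmoid")
decoder = DenseTied(encoder, activation="sigmoid")

x = tf.ones((3, 4))
h = encoder(x)       # first call builds the encoder kernel, shape (4, 2)
x_hat = decoder(h)   # first call builds the decoder bias, shape (4,)
print(x_hat.shape)
```

Because the transpose happens inside call, an update to encoder.kernel is picked up by the decoder on its next forward pass, and gradients from both layers flow into the same variable.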

It took me several days to figure this out, and I am glad if it helps anyone.

This question about tying autoencoder weights in a dense Keras layer comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/53751024/
