python - Implementing weight normalization using a TensorFlow layer's `kernel_constraint`

Some TensorFlow layers, such as tf.layers.dense and tf.layers.conv2d, accept a kernel_constraint argument, which according to the tf API docs implements an

Optional projection function to be applied to the kernel after being updated by an Optimizer (e.g. used to implement norm constraints or value constraints for layer weights).

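In [1], Salimans et al. propose a normalization technique for neural networks called weight normalization, which normalizes the weight vectors of a network layer, unlike, for example, batch normalization [2], which normalizes the actual data batches flowing through the layer. In some cases weight normalization has lower computational overhead, and it can also be used where batch normalization is not feasible.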

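My question is: can weight normalization be implemented using the kernel_constraint of the TensorFlow layers mentioned above? Assuming x is an input of shape (batch, height, width, channels), I thought I could implement it as follows: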

x = tf.layers.conv2d(
    inputs=x,
    filters=16,
    kernel_size=(3, 3),
    strides=(1, 1),
    # Normalize over every axis except the last (output-channel) axis.
    kernel_constraint=lambda kernel: (
        tf.nn.l2_normalize(kernel, list(range(kernel.shape.ndims-1)))))

What would be a simple test case to validate or invalidate my solution?

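[1] SALIMANS, Tim; KINGMA, Diederik P. Weight normalization: A simple reparameterization to accelerate training of deep neural networks. In: Advances in Neural Information Processing Systems. 2016. pp. 901-909.

[2] IOFFE, Sergey; SZEGEDY, Christian. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.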

Best answer

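Despite its title, the paper by Salimans and Kingma proposes decoupling the norm of the weights from their direction, rather than actually normalizing the weights (i.e. setting their l2 norm to 1, as you suggest).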
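To make the distinction concrete, here is a minimal NumPy sketch of the reparameterization from [1], w = g * v / ||v||, where the direction v and the scale g are separate parameters. The function name `weight_norm_kernel`, the kernel shape, and the choice of one norm per output filter are assumptions made for illustration; this is not the paper's reference implementation.

```python
import numpy as np

def weight_norm_kernel(v, g):
    """Build a kernel as w = g * v / ||v||, with one norm per output filter.

    v: direction parameters, shape (..., out_channels)
    g: scale per output filter, shape (out_channels,)
    """
    # Reduce over every axis except the last (the output-channel axis).
    reduce_axes = tuple(range(v.ndim - 1))
    v_norm = np.sqrt(np.sum(np.square(v), axis=reduce_axes, keepdims=True))
    return g * v / v_norm

rng = np.random.default_rng(0)
v = rng.normal(size=(3, 3, 8, 16))  # hypothetical conv kernel: (h, w, in, out)
g = np.linspace(0.5, 2.0, 16)       # hypothetical (positive) learned scales

w = weight_norm_kernel(v, g)

# Each filter's l2 norm equals its g, not 1: the norm stays a free parameter.
filter_norms = np.sqrt(np.sum(np.square(w), axis=(0, 1, 2)))
print(np.allclose(filter_norms, g))
```

With a kernel_constraint that projects to unit norm, every filter norm is pinned to 1 instead, so the layer has no g to learn.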

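If you want to verify that your code has the intended effect, even though it is not what they propose, you can fetch the model's weights and check their norm.
In pseudocode: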

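import numpy as np
import tensorflow as tf

# Assumes `inputs` and `x` are the model's input and output tensors.
model = tf.keras.models.Model(inputs=inputs, outputs=x)
weights = model.get_weights()[i]  # the i-th weight array, e.g. a layer's kernel
flat_weights = weights.flatten()
print(np.linalg.norm(flat_weights, 2))  # l2 norm over the whole array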
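One thing to watch when running such a check: the constraint in the question normalizes over every axis except the last, so it is each output filter that should have unit norm, not the flattened kernel. The sketch below uses a NumPy stand-in for tf.nn.l2_normalize (the helper `l2_normalize` and the kernel shape are assumptions for illustration) to show both quantities.

```python
import numpy as np

def l2_normalize(kernel, axes, eps=1e-12):
    # NumPy stand-in for tf.nn.l2_normalize: x / sqrt(max(sum(x**2), eps)).
    sq_sum = np.sum(np.square(kernel), axis=axes, keepdims=True)
    return kernel / np.sqrt(np.maximum(sq_sum, eps))

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 4, 16))  # hypothetical (h, w, in, out) kernel

# Same axes as in the question: everything except the output-channel axis.
axes = tuple(range(kernel.ndim - 1))
projected = l2_normalize(kernel, axes)

# Norm of each output filter: collapse (h, w, in) into one axis first.
per_filter = np.linalg.norm(projected.reshape(-1, projected.shape[-1]), axis=0)

# Norm of the flattened kernel, as in the pseudocode above.
flat_norm = np.linalg.norm(projected.flatten(), 2)

print(per_filter)  # one value per filter
print(flat_norm)   # sqrt(number of filters) when every filter has unit norm
```

So for a layer with 16 filters, a flattened norm of about 4 rather than 1 is what a working constraint would produce; checking the per-filter norms is the more direct test.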

For "python - Implementing weight normalization using a TensorFlow layer's `kernel_constraint`", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51351004/
