I want to share the weights on both sides of a siamese model.
Given two sets of inputs, each should pass through exactly the same model function with identical weights (the siamese part). The two outputs are then concatenated to form the final output.
I have read how to share specific layers in the documentation ( https://keras.io/getting-started/functional-api-guide/#shared-layers ) as well as in other questions on this board, and that works.
But when I create my own multi-layer model function, Keras does not share the weights.
Here is a minimal example:
from keras.layers import Input, Dense, concatenate
from keras.models import Model
# Define inputs
input_a = Input(shape=(16,), dtype='float32')
input_b = Input(shape=(16,), dtype='float32')
# My simple model
def my_model(x):
    x = Dense(128, input_shape=(x.shape[1],), activation='relu')(x)
    x = Dense(128, activation='relu')(x)
    return x
# Instantiate model parameters to share
processed_a = my_model(input_a)
processed_b = my_model(input_b)
# Concatenate output vector
final_output = concatenate([processed_a, processed_b], axis=-1)
model = Model(inputs=[input_a, input_b], outputs=final_output)
If the weights are shared, this model should have (16*128 + 128) + (128*128 + 128) = 18,688 parameters in total. Let's check:
model.summary()
This shows we have double that:
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_3 (InputLayer)            (None, 16)           0
__________________________________________________________________________________________________
input_4 (InputLayer)            (None, 16)           0
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 128)          2176        input_3[0][0]
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 128)          2176        input_4[0][0]
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 128)          16512       dense_5[0][0]
__________________________________________________________________________________________________
dense_8 (Dense)                 (None, 128)          16512       dense_7[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 256)          0           dense_6[0][0]
                                                                 dense_8[0][0]
==================================================================================================
Total params: 37,376
Trainable params: 37,376
Non-trainable params: 0
__________________________________________________________________________________________________
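The arithmetic checks out: each Dense layer contributes in_features × units weights plus units biases, and because the layers are duplicated rather than shared, the total doubles:

```python
# Per-layer parameter count for Dense: in_features * units + units (bias)
dense1 = 16 * 128 + 128     # 2176, matches dense_5/dense_7 above
dense2 = 128 * 128 + 128    # 16512, matches dense_6/dense_8 above
shared_total = dense1 + dense2          # 18688 -- the count if weights were shared
print(shared_total, 2 * shared_total)   # unshared layers double it to 37376
```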
I don't know what I'm doing wrong. This is a simplified example; in my real use case I first load a pretrained language model to encode/process the text inputs into vectors, and then apply this siamese model. Because the encoder is pretrained, it is preferable to keep it in a separate function like this.
Thanks.
Best Answer
The problem is that every call to my_model creates brand-new layers (i.e., the Dense layers are re-initialized on each call). What you want instead is to initialize each layer only once. That looks like this:
from keras.layers import Input, Dense, concatenate
from keras.models import Model
# Define inputs
input_a = Input(shape=(16,), dtype='float32')
input_b = Input(shape=(16,), dtype='float32')
# Instantiate model parameters to share
layer1 = Dense(128, input_shape=(input_a.shape[1],), activation='relu')
layer2 = Dense(128, activation='relu')
processed_a = layer2(layer1(input_a))
processed_b = layer2(layer1(input_b))
# Concatenate output vector
final_output = concatenate([processed_a, processed_b], axis=-1)
model = Model(inputs=[input_a, input_b], outputs=final_output)
Now model.summary() gives:
Model: "model_2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_5 (InputLayer)            (None, 16)           0
__________________________________________________________________________________________________
input_6 (InputLayer)            (None, 16)           0
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 128)          2176        input_5[0][0]
                                                                 input_6[0][0]
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 128)          16512       dense_5[0][0]
                                                                 dense_5[1][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 256)          0           dense_6[0][0]
                                                                 dense_6[1][0]
==================================================================================================
Total params: 18,688
Trainable params: 18,688
Non-trainable params: 0
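Beyond reading the summary, the sharing can be verified directly: the model has exactly 18,688 parameters, and feeding the same batch to both inputs yields identical halves in the concatenated output. A sketch, using tf.keras (the maintained home of this API in current TensorFlow releases) rather than the standalone keras package from the question:

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model

input_a = Input(shape=(16,), dtype='float32')
input_b = Input(shape=(16,), dtype='float32')

# Instantiate each layer once, then call it on both inputs
layer1 = Dense(128, activation='relu')
layer2 = Dense(128, activation='relu')
processed_a = layer2(layer1(input_a))
processed_b = layer2(layer1(input_b))

model = Model(inputs=[input_a, input_b],
              outputs=concatenate([processed_a, processed_b], axis=-1))

assert model.count_params() == 18688   # shared, not 37,376
x = np.random.rand(4, 16).astype('float32')
out = model.predict([x, x], verbose=0)
# Same weights + same input => both halves of the output are identical
assert np.allclose(out[:, :128], out[:, 128:])
```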
Edit: if you want to create the layers only once but still inside a function, you should use something like the following:
from keras.models import Sequential

# Instantiate model parameters to share
def my_model(x):
    return Sequential([Dense(128, input_shape=(x.shape[1],), activation='relu'),
                       Dense(128, activation='relu')])

# create the sequential model (and its layers) only once
shared_model = my_model(input_a)
processed_a = shared_model(input_a)
processed_b = shared_model(input_b)
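Filled out end to end, that edit can look like the sketch below (again assuming tf.keras; Sequential must be imported, and the input size 16 is hardcoded here so the snippet is self-contained). The nested Sequential appears as a single row in the outer model's summary, with all 18,688 shared parameters attributed to it:

```python
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model, Sequential

def my_model():
    # Both layers live inside one Sequential, created exactly once
    return Sequential([Dense(128, input_shape=(16,), activation='relu'),
                       Dense(128, activation='relu')])

input_a = Input(shape=(16,), dtype='float32')
input_b = Input(shape=(16,), dtype='float32')

shared = my_model()            # one submodel...
processed_a = shared(input_a)  # ...applied to both inputs,
processed_b = shared(input_b)  # so the weights are shared

model = Model(inputs=[input_a, input_b],
              outputs=concatenate([processed_a, processed_b], axis=-1))
assert model.count_params() == 18688
```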
Regarding "python - How to share layer weights in a custom Keras model function", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59571518/