from keras.layers import Convolution2D, BatchNormalization

def conv2d_bn(x, nb_filter, nb_row, nb_col,
              border_mode='same', subsample=(1, 1),
              name=None):
    '''Utility function to apply conv + BN.'''
    if name is not None:
        conv_name = name + '_conv'
        bn_name = name + '_bn'
    else:
        conv_name = bn_name = None
    x = Convolution2D(nb_filter, nb_row, nb_col,
                      subsample=subsample,
                      activation='relu',
                      border_mode=border_mode,
                      name=conv_name)(x)
    # bn_axis is the channel axis, set elsewhere in the Keras source
    # according to the image dim ordering
    x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
    return x
When using the official inception_v3 model that ships with Keras, I noticed that it applies BatchNormalization after the 'relu' nonlinearity, as in the code above.
But in the Batch Normalization paper, the authors say:

    we add the BN transform immediately before the nonlinearity, by normalizing x = Wu + b.

Then I looked at the Inception implementation in TensorFlow, and it adds BN immediately before the nonlinearity, exactly as the paper says. For more details, see the Inception ops.py.
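The paper's ordering can be sketched in plain NumPy. This is a toy illustration I wrote, not code from either library; `W`, `b`, `gamma`, and `beta` are hypothetical parameters, and it uses batch statistics only (no running averages):

```python
import numpy as np

def bn_then_relu(u, W, b, gamma, beta, eps=1e-5):
    """BN immediately before the nonlinearity, as in the paper:
    normalize x = Wu + b over the batch, then apply ReLU."""
    x = u @ W + b                           # pre-activation x = Wu + b
    mu = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                     # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalized pre-activation
    y = gamma * x_hat + beta                # learned scale and shift
    return np.maximum(y, 0.0)               # nonlinearity applied last

rng = np.random.default_rng(0)
u = rng.normal(size=(8, 4))
W = rng.normal(size=(4, 3))
b = np.zeros(3)
out = bn_then_relu(u, W, b, gamma=np.ones(3), beta=np.zeros(3))
```

With `gamma=1` and `beta=0`, the input to the ReLU has zero mean per feature, so roughly half of each feature's activations get clipped to zero.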
I'm confused. Why do people use the above style in Keras rather than the following one?
from keras.layers import Convolution2D, BatchNormalization, Activation

def conv2d_bn(x, nb_filter, nb_row, nb_col,
              border_mode='same', subsample=(1, 1),
              name=None):
    '''Utility function to apply conv + BN.'''
    if name is not None:
        conv_name = name + '_conv'
        bn_name = name + '_bn'
    else:
        conv_name = bn_name = None
    x = Convolution2D(nb_filter, nb_row, nb_col,
                      subsample=subsample,
                      border_mode=border_mode,
                      name=conv_name)(x)
    # bn_axis is the channel axis, as in the Keras source
    x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
    x = Activation('relu')(x)
    return x
And in the Dense case:
x = Dense(1024, name='fc')(x)
x = BatchNormalization(axis=bn_axis, name=bn_name)(x)
x = Activation('relu')(x)
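For the Dense case the two orderings are genuinely different functions, which a toy NumPy comparison makes visible. This is my own sketch, not library code, with `gamma=1` and `beta=0` for simplicity:

```python
import numpy as np

def batchnorm(x, eps=1e-5):
    # Plain batch normalization over the batch axis (gamma=1, beta=0).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

rng = np.random.default_rng(42)
x = rng.normal(size=(16, 5))          # pre-activations of a Dense layer

before = np.maximum(batchnorm(x), 0)  # BN -> ReLU (the paper's ordering)
after = batchnorm(np.maximum(x, 0))   # ReLU -> BN (the snippet's ordering)

# The outputs differ: ReLU -> BN can produce negative values, because
# normalization re-centers the clipped activations, while BN -> ReLU
# can only output non-negative values.
```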
Best Answer
I also used it before the activation, and that is indeed how it was designed; other libraries do the same, for example Lasagne's batch_norm: http://lasagne.readthedocs.io/en/latest/modules/layers/normalization.html#lasagne.layers.batch_norm
In practice, however, placing it after the activation seems to work a bit better:
https://github.com/ducha-aiki/caffenet-benchmark/blob/master/batchnorm.md
(though that is only one benchmark)
Regarding "tensorflow - Add batch normalization in Keras before or after the nonlinearity?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42359860/