Note:
I am new to MXNet. It seems that the Gluon module is meant to replace (?) the Symbol module as the high-level neural network (nn) interface, so this question specifically seeks answers that use the Gluon module.
Context
Residual neural networks (res-NNs) are a fairly popular architecture (the link provides a review of res-NNs). In short, res-NNs are an architecture in which the input passes through a (series of) transformation(s) (e.g. standard nn layers) and is finally combined with its unaltered self just before the activation function:
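In symbols (the original post showed a figure here; this is my own summary of it), writing the chain of transformations as F and the final activation as ramp, a residual block computes:

y = ramp(x + F(x))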
So the main question here is "How to implement a res-NN structure with a custom gluon.Block?" What follows:
Usually subquestions are treated as concurrent main questions and get a post flagged as too broad. In this case they are legitimate subquestions, because my inability to solve the main question stems from them, and the partial / first-draft documentation of the Gluon module is insufficient to answer them.
Main question
"How to implement a res-NN structure with a custom gluon.Block?"
First, let's do some imports:
import mxnet as mx
import numpy as np
import math
import random
gpu_device=mx.gpu()
ctx = gpu_device
Before defining our res-NN structure, let's first define a generic convolutional NN (cnn) architecture; namely convolution → batch norm → ramp.
class CNN1D(mx.gluon.Block):
    def __init__(self, channels, kernel, stride=1, padding=0, **kwargs):
        super(CNN1D, self).__init__(**kwargs)
        with self.name_scope():
            # convolution -> batch norm -> ramp (relu)
            self.conv = mx.gluon.nn.Conv1D(channels=channels, kernel_size=kernel,
                                           strides=stride, padding=padding)
            self.bn = mx.gluon.nn.BatchNorm()
            self.ramp = mx.gluon.nn.Activation(activation='relu')

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        x = self.ramp(x)
        return x
Subquestion: mx.gluon.nn.Activation vs the NDArray module's nd.relu? When to use which, and why? In all the MXNet tutorials / demos I saw in their documentation, custom gluon.Blocks use nd.relu(x) in the forward function.

Subquestion: self.ramp(self.conv(x)) vs mx.gluon.nn.Conv1D(activation='relu')(x)? i.e. what is the consequence of adding the activation argument to a layer? Does that mean the activation is automatically applied in the forward function when that layer is called?
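For reference, here is a minimal sketch of the three interchangeable ways a ReLU can be attached (my own illustration, not from the original post; the input shape is arbitrary):

import mxnet as mx
from mxnet import nd

x = nd.random.uniform(shape=(1, 4, 10))  # (batch, channels, width)

# 1) a standalone Activation block, usable like any other layer
ramp = mx.gluon.nn.Activation(activation='relu')
y1 = ramp(x)

# 2) the functional NDArray op, commonly seen inside forward()
y2 = nd.relu(x)  # same values as y1: both compute max(0, x)

# 3) the activation fused into the layer via its activation argument,
#    applied automatically to that layer's output
conv = mx.gluon.nn.Conv1D(channels=4, kernel_size=3, activation='relu')
conv.initialize()
y3 = conv(x)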
Now that we have a reusable cnn chunk, let's define a res-NN where:
chain_length = the number of cnn chunks
So here is my attempt:
class RES_CNN1D(mx.gluon.Block):
    def __init__(self, channels, kernel, initial_stride, chain_length=1, stride=1, padding=0, **kwargs):
        super(RES_CNN1D, self).__init__(**kwargs)
        with self.name_scope():
            num_rest = chain_length - 1
            self.ramp = mx.gluon.nn.Activation(activation='relu')
            self.init_cnn = CNN1D(channels, kernel, initial_stride, padding)
            # I am guessing this is how to correctly add an arbitrary number of chunks
            self.rest_cnn = mx.gluon.nn.Sequential()
            for i in range(num_rest):
                self.rest_cnn.add(CNN1D(channels, kernel, stride, padding))

    def forward(self, x):
        # make a copy of the untouched input to send through the chunks
        y = x.copy()
        y = self.init_cnn(y)
        # I am guessing that if I call a mx.gluon.nn.Sequential object, all nets
        # inside are called / the input gets passed along all of them?
        y = self.rest_cnn(y)
        y += x
        y = self.ramp(y)
        return y
Subquestion: when adding a variable number of layers, should one use the hacky
eval("self.layer" + str(i) + " = mx.gluon.nn.Conv1D()")
or is this what mx.gluon.nn.Sequential is meant for?

Subquestion: when defining the forward function in a custom gluon.Block which has an instance of mx.gluon.nn.Sequential (let us refer to it as self.seq), does self.seq(x) just pass the argument x down the line? e.g. if this is self.seq:

self.seq = mx.gluon.nn.Sequential()
self.conv1 = mx.gluon.nn.Conv1D()
self.conv2 = mx.gluon.nn.Conv1D()
self.seq.add(self.conv1)
self.seq.add(self.conv2)

is self.seq(x) equivalent to self.conv2(self.conv1(x))?
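A quick self-contained way to check that Sequential behavior (my own sketch; the layer sizes are arbitrary):

import mxnet as mx
from mxnet import nd

seq = mx.gluon.nn.Sequential()
conv1 = mx.gluon.nn.Conv1D(channels=4, kernel_size=3)
conv2 = mx.gluon.nn.Conv1D(channels=4, kernel_size=3)
seq.add(conv1)
seq.add(conv2)
seq.initialize()

x = nd.random.uniform(shape=(1, 4, 10))
# Sequential feeds the output of each child block into the next,
# so these two results should be identical
out_seq = seq(x)
out_manual = conv2(conv1(x))
print((out_seq - out_manual).abs().sum())  # expect 0.0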
Is this correct?
Desired outcome
RES_CNN1D(10, 3, 2, chain_length=3)
should look like this:
Conv1D(10, 3, stride=2) -----
BatchNorm |
Ramp |
Conv1D(10, 3) |
BatchNorm |
Ramp |
Conv1D(10, 3) |
BatchNorm |
Ramp |
| |
(+)<-------------------------
v
Ramp
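One caveat worth noting when sanity-checking this structure: with initial_stride=2 (as in the diagram), the skip connection y += x will fail with a shape mismatch, since the conv chain halves the width while x keeps its original size; standard res-NNs downsample the skip path too in that case. A minimal forward-pass sketch that does run (my own addition; stride 1 and padding 1 keep all shapes equal, and the input channel count must match `channels` for the addition):

net = RES_CNN1D(10, 3, 1, chain_length=3, padding=1)
net.initialize()
x = mx.nd.random.uniform(shape=(1, 10, 20))  # (batch, channels=10, width)
y = net(x)
print(y.shape)  # (1, 10, 20): shapes preserved, so the residual addition is valid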
Original question on Stack Overflow: https://stackoverflow.com/questions/46306782/