I have implemented back-propagation in MATLAB, but I am having trouble training it. Early in the training phase, all of the outputs go to 1. I have normalized the input data (apart from the desired class, which is used to generate a binary target vector) to the interval [0, 1]. I have been referring to the implementation in Russell and Norvig's Artificial Intelligence: A Modern Approach.
Having checked my code against the pseudocode (and studied the algorithm for a while), I cannot spot the error. I have not been using MATLAB for long, so I have been consulting the documentation where needed.
I have also tried different numbers of nodes in the hidden layer and different learning rates (ALPHA).
The target data are encoded as follows: when the target is to be classified as class 2, the target vector is [0, 1, 0]; for class 1 it is [1, 0, 0], and so on. I have also tried using different values for the targets, e.g. [0.5, 0, 0] for class 1.
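For reference, the one-hot target encoding described above can be sketched as follows. This is a Python sketch of what the `targetVec` helper used in the MATLAB code below presumably does; its actual implementation is not shown in the question, so this is an assumed equivalent:

```python
def target_vec(cls, num_out):
    """Binary target vector: 1 at the (1-based) index of the class, 0 elsewhere."""
    v = [0.0] * num_out
    v[cls - 1] = 1.0
    return v

print(target_vec(2, 3))  # [0.0, 1.0, 0.0]
print(target_vec(1, 3))  # [1.0, 0.0, 0.0]
```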
I have also noticed that some of my weights exceed 1, resulting in large net values.
%Topological constants
NUM_HIDDEN = 8+1;%written as n+1 so is clear bias is used
NUM_OUT = 3;
%Training constants
ALPHA = 0.01;
TARG_ERR = 0.01;
MAX_EPOCH = 50000;
%Read and normalize data file.
X = normdata(dlmread('iris.data'));
X = shuffle(X);
%X_test = normdata(dlmread('iris2.data'));
%epocherrors = fopen('epocherrors.txt', 'w');
%Weight matrices.
%Features constitute size(X, 2)-1, however size is (X, 2) to allow for
%appending bias.
w_IH = rand(size(X, 2), NUM_HIDDEN)-(0.5*rand(size(X, 2), NUM_HIDDEN));
w_HO = rand(NUM_HIDDEN+1, NUM_OUT)-(0.5*rand(NUM_HIDDEN+1, NUM_OUT));%+1 for bias
%Layer nets
net_H = zeros(NUM_HIDDEN, 1);
net_O = zeros(NUM_OUT, 1);
%Layer outputs
out_H = zeros(NUM_HIDDEN, 1);
out_O = zeros(NUM_OUT, 1);
%Layer deltas
d_H = zeros(NUM_HIDDEN, 1);
d_O = zeros(NUM_OUT, 1);
%Control variables
error = inf;
epoch = 0;
%Run the algorithm.
while error > TARG_ERR && epoch < MAX_EPOCH
    for n=1:size(X, 1)
        x = [X(n, 1:size(X, 2)-1) 1]'; %Add bias for hiddens & transpose to column vector.
        o = X(n, size(X, 2));
        %Forward propagate.
        net_H = w_IH'*x; %Transposed w.
        out_H = [sigmoid(net_H); 1]; %Append 1 for bias to outputs.
        net_O = w_HO'*out_H; %Again, transposed w.
        out_O = sigmoid(net_O);
        %Calculate output deltas.
        d_O = ((targetVec(o, NUM_OUT)-out_O) .* (out_O .* (1-out_O)));
        %Calculate hidden deltas.
        for i=1:size(w_HO, 1)
            delta_weight = 0;
            for j=1:size(w_HO, 2)
                delta_weight = delta_weight + d_O(j)*w_HO(i, j);
            end
            d_H(i) = (out_H(i)*(1-out_H(i)))*delta_weight;
        end
        %Update hidden-output weights.
        for i=1:size(w_HO, 1)
            for j=1:size(w_HO, 2)
                w_HO(i, j) = w_HO(i, j) + (ALPHA*out_H(i)*d_O(j));
            end
        end
        %Update input-hidden weights.
        for i=1:size(w_IH, 1)
            for j=1:size(w_IH, 2)
                w_IH(i, j) = w_IH(i, j) + (ALPHA*x(i)*d_H(j));
            end
        end
        %Debug output.
        out_O
        o
        %out_H
        %w_IH
        %w_HO
        %d_O
        %d_H
    end
end
function outs = sigmoid(nets)
    outs = zeros(size(nets, 1), 1);
    for i=1:size(nets, 1)
        if nets(i) < -45
            outs(i) = 0;
        elseif nets(i) > 45
            outs(i) = 1;
        else
            outs(i) = 1/1+exp(-nets(i));
        end
    end
end
Best answer
From what we established in the comments, the only thing that comes to mind is that all the relevant recipes are written down together in this great NN FAQ:
ftp://ftp.sas.com/pub/neural/FAQ2.html#questions
The first things you could try are:
1) How to avoid overflow in the logistic function? This may well be the problem: many times, the issues with neural networks I have implemented turned out to be overflows like this.
2) How should categories be encoded?
More generally:
Regarding machine-learning - back-propagation, all outputs tend to 1, there is a similar question on Stack Overflow: https://stackoverflow.com/questions/18048561/
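On point 1) above, a common way to evaluate the logistic function without overflow is to branch on the sign of the argument so that exp() is only ever called on a non-positive value. A minimal Python sketch (the hard clamping at ±45 in the question's MATLAB sigmoid serves a similar purpose):

```python
import math

def stable_sigmoid(x):
    # For x >= 0, exp(-x) <= 1, so 1/(1 + exp(-x)) cannot overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For x < 0, rewrite as exp(x)/(1 + exp(x)) so exp() is never
    # called on a large positive argument.
    z = math.exp(x)
    return z / (1.0 + z)

print(stable_sigmoid(0.0))     # 0.5
print(stable_sigmoid(-1000.0)) # 0.0 (saturates instead of raising OverflowError)
```

Here exp() underflows harmlessly to 0 for very negative arguments, whereas the naive form 1/(1 + exp(-x)) would overflow for x around -1000.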