c++ - Neural network backpropagation implementation issues

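I have read a lot about neural networks and training them with backpropagation, primarily this Coursera course, with additional reading here and here. I thought I had a fairly solid grasp of the core algorithm, but my attempt to build a backpropagation-trained neural network has not quite worked, and I am not sure why.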

The code is written in C++ and is not yet vectorized.

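I wanted to build a simple network with 2 input neurons, 1 hidden neuron and 1 output neuron to model the AND function, just to understand how the concepts work before moving on to a more complex example. My forward propagation code worked when I hand-coded the values of the weights and biases.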

float NeuralNetwork::ForwardPropagte(const float *dataInput)
{
    // Write the input data into the input layer
    int number = 0;
    for (auto & node : m_Network[0])
    {
        node->input = dataInput[number++];
    }

    // For each layer in the network
    // (layerIndex is not declared in the posted snippet; a local counter is assumed here)
    std::size_t layerIndex = 0;
    for (auto & layer : m_Network)
    {
        // For each neuron in the layer
        for (auto & neuron : layer)
        {
            float activation;
            if (layerIndex != 0)
            {
                // Hidden and output neurons: add the bias, then apply the sigmoid
                neuron->input += neuron->bias;
                activation = Sigmoid(neuron->input);
            } else {
                // Input neurons pass their value through unchanged
                activation = neuron->input;
            }

            // Accumulate the weighted activation into each downstream neuron
            for (auto & pair : neuron->outputNeuron)
            {
                pair.first->input += static_cast<float>(pair.second)*activation;
            }
        }
        ++layerIndex;
    }

    return Sigmoid(m_Network[m_Network.size()-1][0]->input);
}

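Some of the variables are rather poorly named, but basically neuron->outputNeuron is a vector of pairs: the first element is a pointer to the next neuron and the second is the weight value. neuron->input is the "z" value in the neural network equations, the sum of all weights*activation + bias. Sigmoid is given by: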

float NeuralNetwork::Sigmoid(float value) const
{
    return 1.0f/(1.0f + exp(-value));
}

These two seem to work as expected. After a pass through the network, all the "z" (neuron->input) values are reset to zero (or after backpropagation).

I then train the network following the pseudocode below. The training code is run multiple times.

for trainingExample=0 to m // m = number of training examples
   perform forward propagation to calculate hyp(x)
   calculate cost delta of last layer
         delta = y - hyp(x)
   use the delta of the output to calculate delta for all layers
   move over the network adjusting the weights based on this value
   reset network
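
The loop above can be sketched for the 2-1-1 AND network as a small self-contained program. This is only an illustration under stated assumptions, not the poster's implementation: the names TinyNet, Example, AndData, TrainAnd and HalfMeanSquaredError are hypothetical, the starting weights are fixed rather than random, the cost is assumed to be the squared error 0.5*(out - y)^2, a learning rate is introduced, and the delta is taken as out - y so that each step subtracts the gradient.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// One training example: two inputs and the target output.
struct Example {
    std::array<double, 2> x;
    double y;
};

// Hypothetical minimal 2-1-1 network (names and starting values are illustrative).
struct TinyNet {
    std::array<double, 2> w1{0.5, 0.5}; // input -> hidden weights
    double b1 = 0.0;                    // bias feeding the hidden neuron
    double w2 = 1.0;                    // hidden -> output weight
    double b2 = 0.0;                    // bias feeding the output neuron

    static double Sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

    // Forward pass; optionally reports the hidden activation.
    double Forward(const std::array<double, 2>& x, double* hiddenOut = nullptr) const {
        const double h = Sigmoid(w1[0] * x[0] + w1[1] * x[1] + b1);
        if (hiddenOut) *hiddenOut = h;
        return Sigmoid(w2 * h + b2);
    }

    // One online gradient-descent step on the cost 0.5 * (out - y)^2.
    void TrainStep(const Example& e, double learningRate) {
        double h = 0.0;
        const double out = Forward(e.x, &h);
        // Deltas are computed from the current weights, before any update.
        const double deltaOut = (out - e.y) * out * (1.0 - out);
        const double deltaHidden = deltaOut * w2 * h * (1.0 - h);

        w2 -= learningRate * deltaOut * h;
        b2 -= learningRate * deltaOut;
        for (std::size_t i = 0; i < w1.size(); ++i)
            w1[i] -= learningRate * deltaHidden * e.x[i]; // input activation is the raw input
        b1 -= learningRate * deltaHidden;
    }
};

// The four AND examples from the question.
std::vector<Example> AndData() {
    return {{{0.0, 0.0}, 0.0}, {{1.0, 0.0}, 0.0}, {{0.0, 1.0}, 0.0}, {{1.0, 1.0}, 1.0}};
}

// Half of the mean squared error over a data set.
double HalfMeanSquaredError(const TinyNet& net, const std::vector<Example>& data) {
    assert(!data.empty());
    double sum = 0.0;
    for (const Example& e : data) {
        const double diff = net.Forward(e.x) - e.y;
        sum += diff * diff;
    }
    return 0.5 * sum / static_cast<double>(data.size());
}

// Runs the training loop for a fixed number of passes over the data.
TinyNet TrainAnd(int epochs, double learningRate) {
    assert(epochs > 0 && learningRate > 0.0);
    const std::vector<Example> data = AndData();
    TinyNet net;
    for (int epoch = 0; epoch < epochs; ++epoch)
        for (const Example& e : data)
            net.TrainStep(e, learningRate);
    return net;
}
```

A call such as TrainAnd(20000, 0.5) returns a network whose Forward can be compared against the four targets. Both arguments are arbitrary choices for this sketch.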

The actual code is here:

void NeuralNetwork::TrainNetwork(const std::vector<std::pair<std::pair<float,float>,float>> & trainingData)
{
    for (int i = 0; i < 100; ++i)
    {
        for (auto & trainingSet : trainingData)
        {
            float x[2] = {trainingSet.first.first,trainingSet.first.second};
            float y      = trainingSet.second;
            float estimatedY = ForwardPropagte(x);

            m_Network[m_Network.size()-1][0]->error = estimatedY - y;
            CalculateError();
            RunBackpropagation();
            ResetActivations();
        }
    }
}

With the backpropagation function given by:

void NeuralNetwork::RunBackpropagation()
{
    for (int index = m_Network.size()-1; index >= 0; --index)
    {
        for(auto &node : m_Network[index])
        {
            // Again where the "outputNeuron" is a list of the next layer of neurons and associated weights
            for (auto &weight : node->outputNeuron)
            {
                weight.second += weight.first->error*Sigmoid(node->input);
            }
            node->bias = node->error; // I'm not sure how to adjust the bias, some of the formulas seemed to point to this. Is it correct?
        }
    }
}
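
On the bias question raised in the comment above, the usual gradient-descent updates can be written as follows. The notation is an assumption, not something from the post: \eta is a small learning rate, a_j is the activation of the source neuron (the raw input value for an input-layer neuron), and \delta_k is the error term of the neuron receiving the connection.

```latex
% Assumed notation: \eta = learning rate, a_j = activation of source neuron j,
% \delta_k = error term of target neuron k, b_k = bias of target neuron k.
\begin{aligned}
w_{jk} &\leftarrow w_{jk} - \eta \, a_j \, \delta_k \\
b_k    &\leftarrow b_k - \eta \, \delta_k
\end{aligned}
```

Read this way, the bias is nudged by its error term rather than overwritten with it. Also, with the error defined as estimatedY - y, as in TrainNetwork, the product has to be subtracted. Adding it, as the weight update above does, moves the weights uphill on the cost, which may be why they grow so large.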

And the cost calculated by:

void NeuralNetwork::CalculateError()
{
    for (int index = m_Network.size()-2; index > 0; --index)
    {
        for(auto &node : m_Network[index])
        {
            node->error = 0.0f;

            float sigmoidPrime = Sigmoid(node->input)*(1 - Sigmoid(node->input));

            for (auto &weight : node->outputNeuron)
            {
                node->error += (weight.first->error*weight.second)*sigmoidPrime;
            }
        }
    }   
}
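
For reference, the recursion that CalculateError appears to implement can be written out as below. The notation is assumed rather than taken from the post: z_j^{(l)} is the "z" value of neuron j in layer l, a is an activation, L is the output layer, and w_{jk} is the weight from neuron j to neuron k in the next layer.

```latex
\begin{aligned}
\sigma'(z) &= \sigma(z)\,\bigl(1 - \sigma(z)\bigr) \\
\delta_j^{(l)} &= \Bigl(\sum_k w_{jk}\,\delta_k^{(l+1)}\Bigr)\,\sigma'\!\bigl(z_j^{(l)}\bigr) \\
\delta^{(L)} &= \bigl(a^{(L)} - y\bigr)\,\sigma'\!\bigl(z^{(L)}\bigr)
  \quad \text{for the squared-error cost } \tfrac{1}{2}\bigl(a^{(L)} - y\bigr)^2
\end{aligned}
```

TrainNetwork sets the output error to estimatedY - y without the \sigma' factor. That form matches a cross-entropy cost rather than the squared error, so it is worth checking which cost the reported error figures are meant to track.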

I randomize the weights and run on the data set:

    x = {0.0f,0.0f} y =0.0f
    x = {1.0f,0.0f} y =0.0f
    x = {0.0f,1.0f} y =0.0f
    x = {1.0f,1.0f} y =1.0f

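Of course I should not train and test with the same data set, but I just wanted to get the basic backpropagation algorithm up and running. When I run this code, I see the weights/biases as follows: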

Layer 0
    Bias 0.111129
    NeuronWeight 0.058659
    Bias -0.037814
    NeuronWeight -0.018420
Layer 1
    Bias 0.016230
    NeuronWeight -0.104935
Layer 2
    Bias 0.080982

As the training set runs, the mean squared error of delta[outputLayer] looks something like this:

Error: 0.156954
Error: 0.152529
Error: 0.213887
Error: 0.305257
Error: 0.359612
Error: 0.373494
Error: 0.374910
Error: 0.374995
Error: 0.375000

... remains at this value for ever...

The final weights look like this (they always end up at roughly these values):

Layer 0
    Bias 0.000000
    NeuronWeight 15.385233
    Bias 0.000000
    NeuronWeight 16.492933
Layer 1
    Bias 0.000000
    NeuronWeight 293.518585
Layer 2
    Bias 0.000000

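I accept that this may seem like a rather roundabout way of learning neural networks, and the implementation is (at the moment) far from optimal. But can anyone spot a place where I have made an invalid assumption, or where the implementation or a formula is wrong?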

EDIT

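Thanks for the feedback on the bias values. I have stopped applying them to the input layer and stopped passing the input layer through the sigmoid function. In addition, my sigmoid prime function was invalid. But the network still isn't working. I have updated the error and output above to show what happens now.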

Best answer

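As lejilot said, you have too many biases. The last layer does not need a bias: it is an output layer, and a bias must be connected to its input, not to its output. Take a look at the following image: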

In this image you can see that there is only one bias per layer, except for the last layer, where no bias is needed.

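Here you can read a very intuitive approach to neural networks. It is written in Python, but it can help you better understand some of the concepts behind neural networks.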

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/33762864/
