python - Basic neural network, weights too high

Tags: python python-3.x math neural-network

I am trying to write a very basic neural network in Python, with 3 input nodes whose values are 0 or 1 and a single output node whose value is 0 or 1. The output should be almost equal to the second input, but after training the weights are far too high and the network almost always guesses 1.
I am using Python 3.7 with numpy and scipy. I have tried changing the training set, the new instance, and the random seed.

import numpy as np
from scipy.special import expit as ex
import random as rand   # needed for rand.seed and rand.uniform below

rand.seed(10)
training_set=[[0,1,0],[1,0,1],[0,0,0],[1,1,1]] #The training sets and their outputs
training_outputs=[0,1,0,1]
weightlst=[rand.uniform(-1,1),rand.uniform(-1,1),rand.uniform(-1,1)]  #Weights are randomly set with a value between -1 and 1

print('Random weights\n'+str(weightlst))

def calcout(inputs,weights):    #Calculate the expected output with given inputs and weights
    output=0.5

    for i in range(len(inputs)):
        output=output+(inputs[i]*weights[i])
    #print('\nmy output is ' + str(ex(output)))
    return ex(output)                 #Return the output on a sigmoid curve between 0 and 1

def adj(expected_output,training_output,weights,inputs):   #Adjust the weights based on the expected output, true (training) output and the weights
    adjweights=[]
    error=expected_output-training_output

    for i in weights:
        adjweights.append(i+(error*(expected_output*(1-expected_output))))
    return adjweights

                                                       #Train the network, adjusting weights each time
training_iterations=10000
for k in range(training_iterations):
    for l in range(len(training_set)):

        expected=calcout(training_set[l],weightlst)
        weightlst=adj(expected,training_outputs[l],weightlst,training_set[l])

new_instance=[1,0,0]           #Calculate and return the expected output of a new instance

print('Adjusted weights\n'+str(weightlst))
print('\nExpected output of new instance = ' + str(calcout(new_instance,weightlst)))

The expected output should be 0, or a value very close to it, but no matter what I set new_instance to, the output is still:
Random weights
[-0.7312715117751976, 0.6948674738744653, 0.5275492379532281]
Adjusted weights
[1999.6135460307303, 2001.03968501638, 2000.8723667804588]

Expected output of new instance = 1.0

What is wrong with my code?

Best Answer

Bugs:
The neuron has no bias term.
error = training_output - expected_output (not the other way around) for gradient descent.
The weight update rule for the i-th weight is w_i = w_i + learning_rate * delta_w_i, where delta_w_i is the gradient of the loss with respect to w_i (a single-step sketch follows this list).
For squared loss, delta_w_i = error * sample[i], where sample[i] is the i-th value of the input vector.
Because there is only one neuron (a single layer of size 1), the model can only learn linearly separable data (it is just a linear classifier). Examples of linearly separable data are data generated by boolean functions such as AND and OR; note that boolean XOR is not linearly separable.
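For illustration only, here is a minimal sketch of a single update step with a bias term, the corrected error direction, and a learning rate; the weights, bias, input and target below are made-up example values, not taken from the answer:

from scipy.special import expit as ex   # sigmoid

lr = 0.1                      # learning rate (assumed value)
weights = [0.2, -0.4, 0.1]    # example weights
bias = 0.05                   # example bias
sample = [0, 1, 0]            # one example input vector
target = 0                    # its training output

prediction = ex(bias + sum(w * x for w, x in zip(weights, sample)))   # forward pass
error = target - prediction                                           # training output minus expected output
weights = [w + lr * error * x for w, x in zip(weights, sample)]       # w_i += lr * error * sample[i]
bias = bias + lr * error                                              # the bias is updated with the error alone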
Code with the bugs fixed:

import numpy as np
from scipy.special import expit as ex
import random as rand   # needed for rand.seed and rand.uniform below

rand.seed(10)
training_set=[[0,1,0],[1,0,1],[0,0,0],[1,1,1]] #The training sets and their outputs
training_outputs=[1,1,0,1] # Boolean OR of input vector
#training_outputs=[0,0,0,1] # Boolean AND of input vector

weightlst=[rand.uniform(-1,1),rand.uniform(-1,1),rand.uniform(-1,1)]  #Weights are randomly set with a value between -1 and 1
bias = rand.uniform(-1,1)

print('Random weights\n'+str(weightlst))

def calcout(inputs,weights, bias):    #Calculate the expected output with given inputs and weights
    output=bias
    for i in range(len(inputs)):
        output=output+(inputs[i]*weights[i])
    #print('\nmy output is ' + str(ex(output)))
    return ex(output)                 #Return the output on a sigmoid curve between 0 and 1

def adj(expected_output,training_output,weights,bias,inputs):   #Adjust the weights based on the expected output, true (training) output and the weights
    adjweights=[]
    error=training_output-expected_output
    lr = 0.1
    for j, i in enumerate(weights):
        adjweights.append(i+error*inputs[j]*lr)
    adjbias = bias+error*lr
    return adjweights, adjbias

#Train the network, adjusting weights each time
training_iterations=10000
for k in range(training_iterations):
    for l in range(len(training_set)):
        expected=calcout(training_set[l],weightlst, bias)
        weightlst, bias =adj(expected,training_outputs[l],weightlst,bias,training_set[l])

new_instance=[1,0,0]           #Calculate and return the expected output of a new instance

print('Adjusted weights\n'+str(weightlst))
print('\nExpected output of new instance = ' + str(calcout(new_instance,weightlst, bias)))

Output:
Random weights
[0.142805189379827, -0.14222189064977075, 0.15618260226894076]
Adjusted weights
[6.196759842119063, 11.71208191137411, 6.210137255008176]
Expected output of new instance = 0.6655563851223694

As can be seen above, for the input [1,0,0] the model predicts a probability of about 0.66 for class 1 (since 0.66 > 0.5). This is correct, because the output class is the boolean OR of the input vector.
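For completeness, a tiny illustration (not part of the original answer) of turning that probability into a class label with a 0.5 threshold:

probability = calcout(new_instance, weightlst, bias)
predicted_class = 1 if probability > 0.5 else 0   # 0.66... > 0.5, so class 1
print(predicted_class)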
Note:
To learn and understand how each individual weight is updated, it is fine to write the code as above, but in practice all of these operations are vectorized. Check the link for a vectorized implementation; a rough sketch of what such a version could look like is given below.
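As an illustration only (the linked implementation may differ), a vectorized version of the same training loop using NumPy arrays could look like this; the learning rate, iteration count and OR targets are the same assumed values as above, and batch updates replace the per-sample loop:

import numpy as np
from scipy.special import expit as ex

rng = np.random.default_rng(10)
X = np.array([[0, 1, 0], [1, 0, 1], [0, 0, 0], [1, 1, 1]])   # training inputs
y = np.array([1, 1, 0, 1])                                   # boolean OR targets
w = rng.uniform(-1, 1, size=3)                               # weight vector
b = rng.uniform(-1, 1)                                       # bias
lr = 0.1

for _ in range(10000):
    pred = ex(X @ w + b)        # forward pass for all samples at once
    err = y - pred              # error vector
    w += lr * (X.T @ err)       # accumulate per-sample weight updates
    b += lr * err.sum()         # bias update

print(ex(np.array([1, 0, 0]) @ w + b))   # probability that the new instance is class 1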

Regarding python - Basic neural network, weights too high, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55320283/
