python - Cannot get multivariate linear regression to converge

Tags: python numpy machine-learning

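Below is a simple linear regression / machine learning script that I adapted. For every initial weight and bias I try (for example weight = np.array([0.03, 0.04, 0.02]), bias = 0.01), training blows up (it does not converge).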

I would like to know whether there is a bug in the code, or how to choose good initial values (weight and bias) so that it converges.

# Adapted from http://ml-cheatsheet.readthedocs.io/en/latest/linear_regression.html
import numpy as np
from numpy import genfromtxt


def predict(X, weight, bias):
    return np.dot(X, weight) + bias

def cost_function(X, Y, weight, bias):
    companies = X.shape[0]
    return np.sum((predict(X, weight, bias) - Y) **2) / companies



def update_weights(X, Y, weight, bias, learning_rate):
    companies = X.shape[0]

    dW = 2 * np.dot(X.T,  predict(X, weight, bias) - Y)
    db = 2 * np.sum(predict(X, weight, bias) - Y)
    # dW and db above are the vectorized form of the per-sample loop in the
    # cheatsheet, which accumulates
    #   dW += -2 * x_i * (y_i - (w . x_i + b))
    #   db += -2 * (y_i - (w . x_i + b))
    # We subtract below because the derivatives point in the direction of
    # steepest ascent.

    return weight - (dW / companies) * learning_rate, bias - (db / companies) * learning_rate

def train(X, Y, weight, bias, learning_rate, iters):
    cost_history = []

    for i in range(iters):
        weight,bias = update_weights(X, Y, weight, bias, learning_rate)

        #Calculate cost for auditing purposes
        cost = cost_function(X, Y, weight, bias)
        cost_history.append(cost)

        # Log Progress
        if i % 100 == 0:
            print ("iter: "+str(i) + " cost: "+str(cost) + "\n")

    return weight, bias, cost_history

#the Advertising.csv is from http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv
if __name__ == "__main__":
    my_data = genfromtxt('Advertising.csv', delimiter=',')
    X = my_data[1:, 1:4:1]
    # Row 0 is the CSV header (parsed as nan); column 0 is the row index.
    Y = my_data[1:, 4]  # the sales
    a,b, _ = train(X, Y, np.array([0.03, 0.04, 0.02]), 0.01, 0.001, 1000)

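The problem is that no matter what values I use for the initial weight and bias (for example weight = np.array([0.03, 0.04, 0.02]), bias = 0.01), the values explode. It simply does not converge.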
train(X, Y, weight, bias, 0.001, 1000)

Update 1

When I run the snippet above, I get:

$ python linearRegression_multi.py 
iter: 0 cost: 212337.75728564826

/Users/joe/anaconda3/lib/python3.6/site-packages/numpy/core/_methods.py:32: RuntimeWarning: overflow encountered in reduce
  return umr_sum(a, axis, dtype, out, keepdims)
linearRegression_multi.py:11: RuntimeWarning: overflow encountered in square
  return np.sum((predict(X, weight, bias) - Y) **2) / companies
iter: 100 cost: inf

linearRegression_multi.py:34: RuntimeWarning: invalid value encountered in subtract
  return weight - dW * learning_rate / companies , bias - db * learning_rate / companies
iter: 200 cost: nan

iter: 300 cost: nan

iter: 400 cost: nan

iter: 500 cost: nan

iter: 600 cost: nan

iter: 700 cost: nan

iter: 800 cost: nan

iter: 900 cost: nan

Best answer

Found the cause: the learning rate of 0.001 used in this example is too high for these unscaled features.
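
One way to see why 0.001 is too large: the cost here is quadratic in the weights and bias, and gradient descent on a quadratic is stable only when the learning rate is below 2 divided by the largest eigenvalue of the Hessian. The sketch below estimates that bound. It does not read Advertising.csv; the feature matrix is synthetic, with column ranges chosen (as an assumption) to loosely resemble unscaled TV, radio, and newspaper budgets, so the exact numbers are illustrative only.

```python
import numpy as np

# Synthetic stand-in for the advertising features (an assumption, not the
# real CSV): three unscaled columns with very different ranges.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(0, 300, n),
    rng.uniform(0, 50, n),
    rng.uniform(0, 100, n),
])

# Append a column of ones so the bias is treated as one more weight.
X_aug = np.column_stack([X, np.ones(n)])

# Hessian of cost = sum((X_aug @ theta - Y) ** 2) / n with respect to theta.
# It does not depend on Y or on the initial weights.
hessian = 2.0 * X_aug.T @ X_aug / n

# eigvalsh returns eigenvalues in ascending order; the last one is the largest.
lambda_max = np.linalg.eigvalsh(hessian)[-1]
max_stable_lr = 2.0 / lambda_max

print("largest Hessian eigenvalue:", lambda_max)
print("largest stable learning rate:", max_stable_lr)
print("0.001 stable?  ", 0.001 < max_stable_lr)
print("0.00001 stable?", 0.00001 < max_stable_lr)
```

Because the bound depends only on the features, changing the initial weight and bias cannot fix a divergence caused by a learning rate above it.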

Changing the learning rate to 0.00001 works. That is, replace the last line of the original snippet with the following:

a,b, _ = train(X, Y, np.array([0.03, 0.04, 0.02]), 0.01, 0.00001, 1000)

Here is the output:

python te.py 
iter: 0 cost: 23.07411798374272

iter: 100 cost: 6.479930413738248

iter: 200 cost: 5.097751463999494

iter: 300 cost: 4.528064099014893

iter: 400 cost: 4.263917598438141

iter: 500 cost: 4.1398851132621655

iter: 600 cost: 4.081383875535448

iter: 700 cost: 4.053584811192947

iter: 800 cost: 4.040172367398533

iter: 900 cost: 4.033501506011401
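
An alternative to shrinking the learning rate, not part of the original answer, is to standardize the features before training, which usually allows a much larger learning rate. The sketch below is a minimal illustration under stated assumptions: the helper train_standardized is hypothetical, and the data is synthetic (the generating weights, intercept, and noise level are made up) rather than loaded from Advertising.csv.

```python
import numpy as np

def train_standardized(X, Y, learning_rate, iters):
    """Gradient descent on z-scored features (hypothetical helper).

    Returns the weights and bias in standardized units plus the cost history.
    """
    mean, std = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mean) / std
    n = Xs.shape[0]
    weight = np.zeros(Xs.shape[1])
    bias = 0.0
    cost_history = []
    for _ in range(iters):
        residual = Xs @ weight + bias - Y
        # Same gradients as update_weights in the question, on scaled inputs.
        weight = weight - learning_rate * 2.0 * (Xs.T @ residual) / n
        bias = bias - learning_rate * 2.0 * residual.sum() / n
        cost_history.append(np.mean((Xs @ weight + bias - Y) ** 2))
    return weight, bias, cost_history

# Synthetic data (an assumption): unscaled features and a made-up linear target.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(0, 300, n),
    rng.uniform(0, 50, n),
    rng.uniform(0, 100, n),
])
Y = 3.0 + X @ np.array([0.05, 0.2, 0.0]) + rng.normal(0.0, 1.0, n)

weight, bias, cost_history = train_standardized(X, Y, learning_rate=0.1, iters=1000)
print("first cost:", cost_history[0])
print("final cost:", cost_history[-1])
```

The returned weights apply to the standardized features. To predict on raw inputs, either scale them with the same mean and std, or convert back with weight / std and bias - np.sum(weight * mean / std).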

Regarding "python - Cannot get multivariate linear regression to converge", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48981687/
