python - Huge loss in a regression exercise! Testing with batch inputs?

Tags: python tensorflow neural-network batch-processing linear-regression

I am currently running the following code to predict house prices from 6 parameters:

import pandas as pd
import tensorflow as tf
import numpy as np

housing = pd.read_csv('cal_housing_clean.csv')

X = housing.iloc[:,0:6]
y = housing.iloc[:,6:]

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3)

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaler.fit(X_train)
X_train = pd.DataFrame(data=scaler.transform(X_train),columns = X_train.columns,index=X_train.index)
X_test = pd.DataFrame(data=scaler.transform(X_test),columns = X_test.columns,index=X_test.index)

X_data = tf.placeholder(dtype = "float", shape=[None,6])
y_target = tf.placeholder(dtype = "float", shape=[None,1])

hidden_layer_nodes = 10

w1 = tf.Variable(tf.random_normal(shape=[6,hidden_layer_nodes]))
b1 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes]))
w2 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes,1]))
b2 = tf.Variable(tf.random_normal(shape=[1]))

hidden_output = tf.nn.relu(tf.add(tf.matmul(X_data,w1),b1))
y_output = tf.add(tf.matmul(hidden_output,w2),b2)

loss = tf.reduce_mean(tf.square(y_target-y_output))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.00001)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()

steps = 100000

with tf.Session() as sess:

    sess.run(init)

    for i in range(steps):

        sess.run(train, feed_dict={X_data:X_train,y_target:y_train})

        if i%500 == 0:

            print('Currently on step {}'.format(i))

            training_cost = sess.run(loss, feed_dict={X_data:X_test,y_target:y_test})
            print("Training cost=", training_cost/6192)

    training_cost = sess.run(loss, feed_dict={X_data:X_test,y_target:y_test})
    print("Training cost=", training_cost/6192)

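I assumed that, since the test set contains 6192 rows, dividing the total loss (error) by that number would do the trick, but unfortunately I get the following output: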

Currently on step 0
Training cost= 9190063.95866
Currently on step 500
Training cost= 9062077.85013
Currently on step 1000
Training cost= 8927415.89664
Currently on step 1500
Training cost= 8795428.38243
Currently on step 2000
Training cost= 8666037.25065
Currently on step 2500
Training cost= 8539182.30491
Currently on step 3000
Training cost= 8414841.71576

The error eventually drops to about 2 million, whereas I was hoping for values closer to 100 or 200 thousand.

Maybe there is a mistake in my code that makes the approximation this bad. I also tried different values of learning_rate, but got the same results.
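A note on reading these numbers: tf.reduce_mean already averages the squared errors over the rows that were fed in, so dividing by 6192 again normalizes twice. The result is also in squared price units, so it cannot be compared directly with a price. If the printed figure is the mean squared error divided by 6192, then 9190063.96 corresponds to an MSE of about 5.7e10 and a root mean squared error of roughly 240,000. A printed value of 2 million would correspond to an RMSE of roughly 110,000. The sketch below shows the two quantities in plain NumPy. The helper names `mse` and `rmse` and the three sample prices are made up for illustration.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error: average of squared differences (squared units)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.square(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error: same units as the target (here, a price)."""
    return float(np.sqrt(mse(y_true, y_pred)))


# Hypothetical prices, only to show the difference in scale.
y_true = np.array([[100000.0], [200000.0], [300000.0]])
y_pred = np.array([[110000.0], [190000.0], [330000.0]])

print("MSE :", mse(y_true, y_pred))   # squared units, hundreds of millions
print("RMSE:", rmse(y_true, y_pred))  # price units, tens of thousands
```

In the graph from the question, the equivalent would be to evaluate tf.sqrt(loss) on the test feed instead of dividing the loss by the row count.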

I would also like to test the model by feeding the test data in batches. I tried this:

if i%500 == 0:

    rand_ind = np.random.randint(len(X_test),size=8)

    feed = {X_data:X_test[rand_ind],y_target:y_test[rand_ind]}

    loss = tf.reduce_sum(tf.square(y_target-y_output)) / 8

    print(sess.run(loss,feed_dict=feed))

Unfortunately, I am always told that the indices I select with rand_ind are "not in index".
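A likely cause of the "not in index" message: X_test and y_test are pandas DataFrames, and square brackets on a DataFrame with a list or array of integers look up column labels, not row positions. Selecting rows by position is done with .iloc. The following is a minimal sketch under that assumption. The function name `sample_batch` and the small stand-in frames are hypothetical and do not come from the original post.

```python
import numpy as np
import pandas as pd


def sample_batch(X, y, batch_size, rng=None):
    """Draw random rows by position from two row-aligned DataFrames."""
    rng = np.random.default_rng() if rng is None else rng
    rand_ind = rng.integers(0, len(X), size=batch_size)
    # .iloc selects by row position, so shuffled index labels do not matter.
    return X.iloc[rand_ind].to_numpy(), y.iloc[rand_ind].to_numpy()


# Stand-in data with a non-default index, as left behind by train_test_split.
X_demo = pd.DataFrame(np.arange(60, dtype=float).reshape(10, 6),
                      index=range(100, 110))
y_demo = pd.DataFrame({"price": np.arange(10, dtype=float)},
                      index=range(100, 110))

X_batch, y_batch = sample_batch(X_demo, y_demo, batch_size=8,
                                rng=np.random.default_rng(0))
print(X_batch.shape, y_batch.shape)
```

The two arrays can then be passed as feed_dict={X_data: X_batch, y_target: y_batch}. It is probably also better to evaluate the loss tensor that already exists than to define a new one inside the loop, since each new definition adds nodes to the graph.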

Best answer

You could try tf.train.AdamOptimizer with a higher learning rate (perhaps around 0.1). That should speed up convergence.
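As a rough sketch of that suggestion, the network from the question can be rebuilt with only the optimizer line changed. Several things here are assumptions and not part of the original post: the import of tensorflow.compat.v1 (available in TensorFlow 1.14 and later, including 2.x, to keep the placeholder/session style), the wrapper function `train_with_adam`, the step count, and the synthetic data standing in for cal_housing_clean.csv.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TensorFlow >= 1.14 or 2.x

tf.disable_v2_behavior()  # keep the graph/session style used in the question


def train_with_adam(X, y, steps=300, learning_rate=0.1, hidden=10, seed=0):
    """Train the one-hidden-layer network with Adam; return the loss per step."""
    tf.reset_default_graph()
    tf.set_random_seed(seed)

    X_data = tf.placeholder(tf.float32, shape=[None, X.shape[1]])
    y_target = tf.placeholder(tf.float32, shape=[None, 1])

    w1 = tf.Variable(tf.random_normal([X.shape[1], hidden]))
    b1 = tf.Variable(tf.random_normal([hidden]))
    w2 = tf.Variable(tf.random_normal([hidden, 1]))
    b2 = tf.Variable(tf.random_normal([1]))

    hidden_output = tf.nn.relu(tf.matmul(X_data, w1) + b1)
    y_output = tf.matmul(hidden_output, w2) + b2
    loss = tf.reduce_mean(tf.square(y_target - y_output))

    # The only change from the question: Adam with a larger learning rate.
    train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

    feed = {X_data: X, y_target: y}
    losses = []
    with tf.Session() as sess:
        # Initialize after minimize() so Adam's own variables are included.
        sess.run(tf.global_variables_initializer())
        for _ in range(steps):
            _, current = sess.run([train, loss], feed_dict=feed)
            losses.append(float(current))
    return losses


# Synthetic stand-in data: 6 scaled features, targets on a house-price scale.
rng = np.random.RandomState(0)
X_demo = rng.rand(200, 6).astype(np.float32)
y_demo = (200000.0 * X_demo.sum(axis=1, keepdims=True)).astype(np.float32)

losses = train_with_adam(X_demo, y_demo)
print("first loss:", losses[0], "last loss:", losses[-1])
```

Whether 0.1 is a good learning rate for the real data is not something this sketch can show. It only illustrates where the optimizer is swapped and how the loss can be tracked while training.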

Source: the corresponding question on Stack Overflow: https://stackoverflow.com/questions/47543589/
