python - Why can't a TensorFlow estimator make this simple linear regression prediction?

Tags: python tensorflow machine-learning

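I am currently learning TensorFlow, but I cannot understand why it fails to make correct predictions for the following simple regression problem.

X is a random integer between 1000 and 8000, and Y is X + 250.

So if X is 2000, Y is 2250. That looks like a linear regression problem to me. However, when I try to predict, the result is far from what I expect: an X of 1000 gives a prediction of 1048 instead of 1250.

The loss and average loss are also very large: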

{'average_loss': 10269.81, 'loss': 82158.48, 'global_step': 1000}
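To put these numbers in perspective, here is a small arithmetic sketch. It assumes that `average_loss` is the mean squared error per example and that `loss` is the squared error summed over one batch; that reading is an assumption about the estimator's reporting, though it is consistent with the batch size of 8 used in the code below.

```python
import math

# Values copied from the metrics printed above.
average_loss = 10269.81
loss = 82158.48

# Assumption: average_loss is a mean squared error, so its square root
# is a typical prediction error in the same units as Y.
rmse = math.sqrt(average_loss)

# Assumption: loss is the per-batch sum, so this ratio should be the batch size.
ratio = loss / average_loss

print(f"typical error: {rmse:.1f}")
print(f"loss / average_loss: {ratio:.2f}")
```

Under those assumptions the model is off by roughly 100 units on a typical example, and the ratio of the two metrics matches the batch size.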

The full code is below:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.model_selection import train_test_split

x_data = np.random.randint(1000, 8000, 1000000)
y_true = x_data + 250

feat_cols = [tf.feature_column.numeric_column('x', shape=[1])]
estimator = tf.estimator.LinearRegressor(feature_columns=feat_cols)

x_train, x_eval, y_train, y_eval = train_test_split(x_data, y_true, test_size=0.3, random_state=101)

input_func = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=8, num_epochs=None, shuffle=True)
train_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=8, num_epochs=1000, shuffle=False)
eval_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_eval}, y_eval, batch_size=8, num_epochs=1000, shuffle=False)

estimator.train(input_fn=input_func, steps=1000)

train_metrics = estimator.evaluate(input_fn=train_input_func, steps=1000)
eval_metrics = estimator.evaluate(input_fn=eval_input_func, steps=1000)

print(train_metrics)
print(eval_metrics)

brand_new_data = np.array([1000, 2000, 7000])
input_fn_predict = tf.estimator.inputs.numpy_input_fn({'x': brand_new_data}, shuffle=False)

prediction_result = estimator.predict(input_fn=input_fn_predict)

print(list(prediction_result))

Am I doing something wrong, or have I misunderstood what linear regression means?
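As a sanity check that the data really is a linear regression problem, the relationship can be fitted in closed form without TensorFlow. The sketch below is not part of the original code; it uses NumPy's `np.polyfit` on a smaller sample (the sample size and seed are arbitrary choices) and should recover a slope close to 1 and an intercept close to 250.

```python
import numpy as np

# Same data-generating rule as in the question, on a smaller sample.
rng = np.random.default_rng(0)
x = rng.integers(1000, 8000, size=1000).astype(np.float64)
y = x + 250

# Ordinary least squares fit of a degree-1 polynomial: y ≈ slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)

print(slope, intercept)
```

If the closed-form fit recovers the rule, the problem lies in how the estimator is trained rather than in the choice of a linear model.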

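Best answer

I think the poor fit comes down to the hyperparameters, and it goes away once a few of them are tuned. I also changed the optimizer to AdamOptimizer.

The main changes are a batch size of 1 and num_epochs set to None: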

train_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=1, num_epochs=None, shuffle=True)
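One plausible reason the training setup matters so much here is the scale of the inputs: with X in the thousands, a gradient step that is safe for the weight barely moves the bias. The sketch below illustrates this with plain full-batch gradient descent in NumPy. It is not the estimator's optimizer and not the setup used in this answer; the helper `fit_gd`, the learning rates, and the standardization step are all assumptions chosen for illustration.

```python
import numpy as np

def fit_gd(x, y, lr, steps):
    """Full-batch gradient descent on mean squared error for y ≈ w * x + b."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        w -= lr * 2.0 * np.mean(err * x)  # gradient of MSE with respect to w
        b -= lr * 2.0 * np.mean(err)      # gradient of MSE with respect to b
    return w, b

rng = np.random.default_rng(0)
x = rng.integers(1000, 8000, size=10000).astype(np.float64)
y = x + 250.0

# Raw inputs: the learning rate has to be tiny to stay stable,
# so the bias barely moves within 1000 steps.
w_raw, b_raw = fit_gd(x, y, lr=1e-8, steps=1000)

# Standardized inputs (zero mean, unit variance): a much larger
# learning rate is stable and both parameters converge.
mu, sigma = x.mean(), x.std()
w_std, b_std = fit_gd((x - mu) / sigma, y, lr=0.1, steps=1000)

pred_raw = w_raw * 1000 + b_raw
pred_std = w_std * (1000 - mu) / sigma + b_std
print(pred_raw, pred_std)
```

In this sketch the raw-input model compensates for the missing bias with a slightly larger weight, so it stays well short of 1250 for an input of 1000, while the standardized version lands on the target. The answer reaches a similar result differently, by switching to Adam and training for many more steps.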

Code:

import tensorflow as tf
import numpy as np
from sklearn.model_selection import train_test_split

x_data = np.random.randint(1000, 8000, 10000)
y_true = x_data + 250


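# Note: tf.train.AdamOptimizer and tf.estimator.inputs are TensorFlow 1.x APIs.
feat_cols = tf.feature_column.numeric_column('x')
# Adam with a learning rate of 1e-3, passed in place of the default optimizer.
optimizer = tf.train.AdamOptimizer(1e-3)

estimator = tf.estimator.LinearRegressor(feature_columns=[feat_cols], optimizer=optimizer)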

x_train, x_eval, y_train, y_eval = train_test_split(x_data, y_true, test_size=0.3, random_state=101)


train_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_train}, y_train, batch_size=1, num_epochs=None,
                                                      shuffle=True)

eval_input_func = tf.estimator.inputs.numpy_input_fn({'x': x_eval}, y_eval, batch_size=1, num_epochs=None,
                                                     shuffle=True)

estimator.train(input_fn=train_input_func, steps=1005555)

train_metrics = estimator.evaluate(input_fn=train_input_func, steps=10000)
eval_metrics = estimator.evaluate(input_fn=eval_input_func, steps=10000)

print(train_metrics)
print(eval_metrics)

brand_new_data = np.array([1000, 2000, 7000])
input_fn_predict = tf.estimator.inputs.numpy_input_fn({'x': brand_new_data}, num_epochs=1,shuffle=False)

prediction_result = estimator.predict(input_fn=input_fn_predict)

for prediction in prediction_result:
    print(prediction['predictions'])

Metrics:

{'average_loss': 3.9024353e-06, 'loss': 3.9024353e-06, 'global_step': 1005555}

{'average_loss': 3.9721594e-06, 'loss': 3.9721594e-06, 'global_step': 1005555}

[1250.003]

[2250.002]

[7249.997]
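These predictions can be compared against the rule Y = X + 250 directly. The short check below copies the three predicted values from the output above; the variable names are illustrative only.

```python
import numpy as np

# Inputs passed to predict() and the predictions printed above.
inputs = np.array([1000, 2000, 7000])
predicted = np.array([1250.003, 2250.002, 7249.997])

# Ground truth under the rule from the question.
expected = inputs + 250

abs_err = np.abs(predicted - expected)
max_err = abs_err.max()
mse = np.mean((predicted - expected) ** 2)

print(max_err, mse)
```

The largest error is a few thousandths, and the mean squared error over these three points is of the same order of magnitude as the reported average_loss.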

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51573068/
