python - How to predict from an array of data - python scikit learn pandas

Tags: python pandas scikit-learn

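I found some code that predicts the next value using linear regression from python scikit-learn.

I am able to predict a single value, but I actually need to predict 6 values and print the prediction for each of them.

Here is the code: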

def linear_model_main(x_parameters, y_parameters, predict_value):
    # Create linear regression object
    regr = linear_model.LinearRegression()
    regr.fit(x_parameters, y_parameters)
    # noinspection PyArgumentList
    predict_outcome = regr.predict(predict_value)
    score = regr.score(X, Y)
    predictions = {'intercept': regr.intercept_, 'coefficient': regr.coef_,   'predicted_value': predict_outcome, 'accuracy' : score}
    return predictions

predicted_value = 9 #I NEED TO PREDICT 9,10,11,12,13,14,15

result = linear_model_main(X, Y, predicted_value)
print('Constant Value: {0}'.format(result['intercept']))
print('Coefficient: {0}'.format(result['coefficient']))
print('Predicted Value: {0}'.format(result['predicted_value']))
print('Accuracy: {0}'.format(result['accuracy']))

I tried doing this:

predicted_value = {9,10,11,12,13,14,15}

result = linear_model_main(X, Y, predicted_value)
print('Constant Value: '.format(result['intercept']))
print('Coefficient: '.format(result['coefficient']))
print('Predicted Value: '.format(result['predicted_value']))
print('Accuracy: '.format(result['accuracy']))

The error message is:

Traceback (most recent call last):
  File "C:Python34\data\cp.py", line 28, in <module>
    result = linear_model_main(X, Y, predicted_value)
  File "C:Python34\data\cp.py", line 22, in linear_model_main
    predict_outcome = regr.predict(predict_value)
  File "C:\Python34\lib\site-packages\sklearn\linear_model\base.py", line 200, in predict
    return self._decision_function(X)
  File "C:\Python34\lib\site-packages\sklearn\linear_model\base.py", line 183, in _decision_function
    X = check_array(X, accept_sparse=['csr', 'csc', 'coo'])
  File "C:\Python34\lib\site-packages\sklearn\utils\validation.py", line 393, in check_array
    array = array.astype(np.float64)
TypeError: float() argument must be a string or a number, not 'set'

C:\>

predicted_value = 9,10,11,12,13,14,15

result = linear_model_main(X, Y, predicted_value)
print('Constant Value: '.format(result['intercept']))
print('Coefficient: '.format(result['coefficient']))
print('Predicted Value: '.format(result['predicted_value']))
print('Accuracy: '.format(result['accuracy']))

I get these errors:

C:\Python34\lib\site-packages\sklearn\utils\validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
Traceback (most recent call last):
  File "C:Python34\data\cp.py", line 28, in <module>
    result = linear_model_main(X, Y, predicted_value)
  File "C:Python34\data\cp.py", line 22, in linear_model_main
    predict_outcome = regr.predict(predict_value)
  File "C:\Python34\lib\site-packages\sklearn\linear_model\base.py", line 200, in predict
    return self._decision_function(X)
  File "C:\Python34\lib\site-packages\sklearn\linear_model\base.py", line 185, in _decision_function
    dense_output=True) + self.intercept_
  File "C:\Python34\lib\site-packages\sklearn\utils\extmath.py", line 184, in safe_sparse_dot
    return fast_dot(a, b)
ValueError: shapes (1,3) and (1,1) not aligned: 3 (dim 1) != 1 (dim 0)

C:\>

If I make this change:

predicted_value = 9
result = linear_model_main(X, Y, predicted_value)
print('Constant Value: {1}'.format(result['intercept']))
print('Coefficient: {1}'.format(result['coefficient']))
print('Predicted Value: {}'.format(result['predicted_value']))
print('Accuracy: {1}'.format(result['accuracy']))

it gives me an error again, saying something is out of range. What do I need to do?

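Best answer

Here is a working example. I have not reconstructed your function, only shown you the correct syntax. It looks like you are not passing the data into fit correctly.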

import numpy as np
from sklearn import linear_model

x = np.random.uniform(-2,2,101)
y = 2*x+1 + np.random.normal(0,1, len(x))

# Note: x must be 2-D with shape (n_samples, n_features); y is reshaped to a column here as well.

x = x.reshape(-1,1)
y = y.reshape(-1,1)


LM  = linear_model.LinearRegression().fit(x,y) #Note I am passing in x and y in column shape

predict_me = np.array([ 9,10,11,12,13,14,15])

predict_me = predict_me.reshape(-1,1)

score = LM.score(x,y)


predicted_values = LM.predict(predict_me)

predictions = {'intercept': LM.intercept_, 'coefficient': LM.coef_,   'predicted_value': predicted_values, 'accuracy' : score}
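
As a rough sketch, the same reshaping can be folded back into the function from the question. The training data here (X and Y following y = 2x + 1) is made up for illustration and stands in for your real data, and the parameter name predict_values is just my choice. The values to predict are passed as a list, since a set is not array-like and a bare tuple or list still has to be reshaped into a column before predict sees it.

```python
import numpy as np
from sklearn import linear_model


def linear_model_main(x_parameters, y_parameters, predict_values):
    """Fit a one-feature linear regression and predict several new inputs."""
    # scikit-learn expects X with shape (n_samples, n_features),
    # so a flat sequence of numbers is turned into a single column.
    x = np.asarray(x_parameters, dtype=float).reshape(-1, 1)
    y = np.asarray(y_parameters, dtype=float)
    # Pass a list, tuple, or array here (not a set); it is reshaped the same way.
    to_predict = np.asarray(predict_values, dtype=float).reshape(-1, 1)

    regr = linear_model.LinearRegression()
    regr.fit(x, y)
    return {
        'intercept': regr.intercept_,
        'coefficient': regr.coef_,
        'predicted_value': regr.predict(to_predict),
        'accuracy': regr.score(x, y),  # R^2 on the training data
    }


# Hypothetical training data (y = 2x + 1), standing in for the real X and Y.
X = [1, 2, 3, 4, 5, 6, 7, 8]
Y = [3, 5, 7, 9, 11, 13, 15, 17]

result = linear_model_main(X, Y, [9, 10, 11, 12, 13, 14, 15])
print('Constant Value: {0}'.format(result['intercept']))
print('Coefficient: {0}'.format(result['coefficient']))
print('Predicted Value: {0}'.format(result['predicted_value']))
print('Accuracy: {0}'.format(result['accuracy']))
```

On the printing side, {0} refers to the first argument of format. With only one argument, {1} raises an IndexError, which is most likely the "out of range" error from your last attempt.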

Regarding python - How to predict from an array of data - python scikit learn pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41269038/
