python - dask_xgboost.predict works but the result cannot be displayed - Data must be 1-dimensional

Tags: python machine-learning dask xgboost dask-ml

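I am trying to create a model with XGBoost.
Training seems to work, but when I predict on the test data and try to look at the actual predictions, I get the following error: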

ValueError: Data must be 1-dimensional

This is how I am trying to predict:

from dask_ml.model_selection import train_test_split
import dask
import xgboost as xgb  # the code below refers to the package as xgb
import dask_xgboost
from dask.distributed import Client
import dask_ml.model_selection as dcv

#split the data
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.33,random_state=42)

client = Client(n_workers=10, threads_per_worker=1)

# hyperparameter tuning
model_xgb = xgb.XGBRegressor(seed=42,verbose=True)


params={
    'learning_rate':[0.1,0.01,0.05],
    'max_depth':[1,5,8],
    'gamma':[0,0.5,1],
    'scale_pos_weight':[1,3,5]
}

grid_search = GridSearchCV(model_xgb, params, cv=3, scoring='neg_mean_squared_error')

grid_search.fit(x_train, y_train)

# train with the best parameters
bst = dask_xgboost.train(client, grid_search.best_params_, x_train, y_train, num_boost_round=10)

#predict data
dask_xgboost.predict(client, bst, x_test).persist()

The last line, the prediction, works, but when I add .compute() to the end to see the actual array, I get the dimension error:

dask_xgboost.predict(client, bst, x_test).persist().compute()
>>>ValueError: Data must be 1-dimensional

How can I get the predictions using .predict?

Best answer

As stated on the dask-xgboost PyPI page:

Dask-XGBoost has been deprecated and is no longer maintained.
The functionality of this project has been included directly
in XGBoost. To use Dask and XGBoost together, please use
xgboost.dask instead
https://xgboost.readthedocs.io/en/latest/tutorials/dask.html.

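The code you provided is missing some assignments and expressions (for example, how x is defined and where GridSearchCV is imported from). A few things that should probably change: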

# note the .dask
model_xgb = xgb.dask.DaskXGBRegressor(seed=42, verbose=True)

grid_search = GridSearchCV(model_xgb, params, cv=3, scoring='neg_mean_squared_error')

grid_search.fit(x_train, y_train)

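# train with the best params
model_xgb.client = client
# set_params takes keyword arguments, so unpack the dict
model_xgb.set_params(**grid_search.best_params_)
# variable names match the lowercase x_train/x_test from the question
model_xgb.fit(x_train, y_train, eval_set=[(x_test, y_test)])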

Original question on Stack Overflow: https://stackoverflow.com/questions/69911409/
