python - Divide-by-zero error in MAPE for random forest regression after min-max scaling

Tags: python machine-learning scikit-learn regression random-forest

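To improve the accuracy of my random forest regression model, I switched from scikit-learn's StandardScaler to MinMaxScaler. With StandardScaler I got no error. The results improved, but when I compute MAPE I now get the following output and warning.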

Mean Absolute Error: 0.03
Accuracy: -inf %.

__main__:5: RuntimeWarning: divide by zero encountered in true_divide

The code is:

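import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Scale the features to [0, 1]; fit on the training set only
sc_X = MinMaxScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)

# Scale the target to [0, 1] as well (MinMaxScaler expects a 2-D array)
sc_y = MinMaxScaler()
y_train = sc_y.fit_transform(y_train)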


#MAE
errors = abs(y_pred - y_test)
print('Mean Absolute Error:', round(np.mean(errors), 2))

# Calculate mean absolute percentage error (MAPE)
mape = 100 * (errors / y_test)
# Calculate and display accuracy
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Best answer

Unfortunately, this is one of the known problems with MAPE. From Wikipedia:

Percentage forecast accuracy measures such as the Mean absolute percentage error (MAPE) rely on division of y_t, skewing the distribution of the MAPE for values of y_t near or equal to 0. This is especially problematic for datasets whose scales do not have a meaningful 0 or for intermittent demand datasets, where y_t=0 occurs frequently.
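To see why the switch to min-max scaling triggers this, note that MinMaxScaler maps the smallest value it was fitted on to exactly 0, so a target scaled this way contains a zero by construction. The following is a minimal sketch with made-up target values and a pretend constant prediction error of 0.03 (both are assumptions for illustration, not the asker's data):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical target values, for illustration only
y = np.array([[10.0], [20.0], [30.0], [40.0]])

# Min-max scaling maps the minimum to 0 and the maximum to 1
y_scaled = MinMaxScaler().fit_transform(y).ravel()

# Pretend predictions that are off by a constant 0.03
y_pred = y_scaled + 0.03
errors = np.abs(y_pred - y_scaled)

# Same MAPE formula as in the question; silence the warning for the demo
with np.errstate(divide="ignore"):
    mape = 100 * errors / y_scaled

accuracy = 100 - np.mean(mape)
print("Smallest scaled target:", y_scaled.min())
print("Accuracy:", accuracy)
```

A single zero in the denominator makes that element infinite, which in turn makes the mean infinite. StandardScaler centers values around 0 instead, so an exact zero is less likely there, although values close to 0 still inflate MAPE.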

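As an alternative, I suggest using MASE (mean absolute scaled error) instead. MASE divides the model's mean absolute error by the mean absolute error of a naive baseline forecast rather than by the individual actual values, so zeros in the target do not cause a division by zero. It should handle your problem well.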
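A rough sketch of MASE with NumPy is shown here. The function name `mase` and the sample arrays are hypothetical, and the sketch assumes `y_train` is in time order, so that the naive baseline is the one-step "previous value" forecast. For data with no time order, a different baseline (such as predicting the training mean) is sometimes used instead.

```python
import numpy as np

def mase(y_true, y_pred, y_train):
    """Mean absolute scaled error (hypothetical helper, not a scikit-learn API).

    Scales the model's MAE by the in-sample MAE of a one-step naive
    forecast, i.e. predicting each training value from the previous one.
    """
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    y_train = np.asarray(y_train, dtype=float).ravel()

    mae_model = np.mean(np.abs(y_true - y_pred))
    mae_naive = np.mean(np.abs(np.diff(y_train)))
    if mae_naive == 0:
        raise ValueError("Training targets are constant; MASE is undefined.")
    return mae_model / mae_naive

# Made-up, already min-max scaled values; note the zero in y_test
y_train = np.array([0.0, 0.2, 0.5, 0.4, 1.0])
y_test = np.array([0.0, 0.5, 1.0])
y_pred = np.array([0.1, 0.4, 0.9])

score = mase(y_test, y_pred, y_train)
print("MASE:", round(score, 3))
```

A value below 1 means the model's average error is smaller than that of the naive baseline on the training data; a value above 1 means it is larger.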

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/53128495/
