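To improve the accuracy of my random forest regression model, I switched from scikit-learn's StandardScaler to MinMaxScaler. With StandardScaler I did not get any error. Accuracy improved, but when computing MAPE I now get the output below.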
Mean Absolute Error: 0.03
Accuracy: -inf %.
__main__:5: RuntimeWarning: divide by zero encountered in true_divide
The code is:
from sklearn.preprocessing import MinMaxScaler
sc_X = MinMaxScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)
sc_y = MinMaxScaler()
y_train = sc_y.fit_transform(y_train)
#MAE
errors = abs(y_pred - y_test)
print('Mean Absolute Error:', round(np.mean(errors), 2))
# Calculate mean absolute percentage error (MAPE)
mape = 100 * (errors / y_test)
# Calculate and display accuracy
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')
Best answer
Unfortunately, this is one of the known problems with MAPE. From Wikipedia:
Percentage forecast accuracy measures such as the Mean absolute percentage error (MAPE) rely on division of y_t, skewing the distribution of the MAPE for values of y_t near or equal to 0. This is especially problematic for datasets whose scales do not have a meaningful 0 or for intermittent demand datasets, where y_t=0 occurs frequently.
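To make the connection to the scaler change concrete: MinMaxScaler maps the smallest target value to exactly 0, so a MAPE computed on the scaled target divides by zero whenever such a value appears in the denominator. The following is a minimal sketch with made-up numbers (the arrays are hypothetical, not the asker's data). It also shows one possible workaround, which is an assumption on my part rather than part of the original answer: compute MAPE in the original units after inverse_transform.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical target values, purely for illustration
y = np.array([[10.0], [20.0], [30.0], [40.0]])

sc_y = MinMaxScaler()
y_scaled = sc_y.fit_transform(y)  # the minimum (10.0) is mapped to 0.0
print(y_scaled.ravel())

# Hypothetical predictions expressed on the scaled target
y_pred_scaled = np.array([[0.05], [0.30], [0.70], [0.95]])

errors = np.abs(y_pred_scaled - y_scaled)
with np.errstate(divide="ignore"):  # silence the RuntimeWarning for the demo
    mape_scaled = 100 * np.mean(errors / y_scaled)
print(mape_scaled)  # infinite, because one denominator is 0

# Possible workaround: go back to the original units before computing MAPE
y_pred = sc_y.inverse_transform(y_pred_scaled)
mape_original = 100 * np.mean(np.abs(y_pred - y) / np.abs(y))
print(mape_original)
```

This only helps when the unscaled target itself contains no zeros; if it does, MAPE remains undefined for those rows regardless of scaling.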
As an alternative, I suggest using MASE (mean absolute scaled error) instead. MASE should handle your problem well.
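As a rough illustration of the suggestion above, here is a minimal sketch of the non-seasonal MASE: the model's mean absolute error divided by the in-sample mean absolute error of a naive one-step forecast. The function name `mase` and the sample numbers are hypothetical. The sketch assumes `y_train` is ordered in time; for data without a natural ordering, a different baseline (for example, predicting the training mean) is commonly substituted.

```python
import numpy as np

def mase(y_true, y_pred, y_train):
    """Mean absolute scaled error, non-seasonal form (hypothetical helper).

    Assumes y_train is ordered in time, so the naive baseline is
    "predict the previous value".
    """
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    y_train = np.asarray(y_train, dtype=float).ravel()

    # MAE of the model on the evaluation data
    mae_model = np.mean(np.abs(y_true - y_pred))
    # In-sample MAE of the naive one-step forecast
    mae_naive = np.mean(np.abs(np.diff(y_train)))
    return mae_model / mae_naive

# Made-up numbers; note that a true value of 0.0 causes no problem here
y_train = [0.0, 0.2, 0.5, 0.4, 1.0]
y_test = [0.0, 0.6]
y_pred = [0.1, 0.5]
print(mase(y_test, y_pred, y_train))
```

Values below 1 mean the model beats the naive baseline on average. The denominator is zero only if the training target is constant, which is a far rarer case than a single zero in `y_test`.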