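python - ValueError: Classification metrics can't handle a mix of unknown and binary targets?

Tags: python pandas machine-learning scikit-learn

I'm fairly sure my random forest model is working. When I compare the predictions it makes with the actual classes in the test set, they match very closely. The first part is where I encode the categorical data: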

Y_train[Y_train == 'Blue'] = 0.0
Y_train[Y_train == 'Green'] = 1.0
Y_test[Y_test == 'Blue'] = 0.0
Y_test[Y_test == 'Green'] = 1.0

rf = RandomForestRegressor(n_estimators=50)
rf.fit(X_train, Y_train)
predictions = rf.predict(X_test)

for i in range(len(predictions)):
    predictions[i] = predictions[i].round()

print(predictions)
print(Y_test)

print(confusion_matrix(Y_test, predictions))

When I run this code, predictions and Y_test print successfully:

[1. 1. 1. 0. 1. 0. 0. 1. 1. 1. 1. 0. 0. 0. 1. 0. 1. 0. 1. 0. 0. 1. 0. 1.
 1. 0. 1. 1. 1. 0. 1. 0. 1. 1. 0. 0. 0. 0. 1. 1. 0. 1. 0. 1. 1. 0. 1. 0.
 0. 0. 0. 0. 1. 1. 0. 1. 1. 1. 1. 1. 1. 0. 0. 1. 0. 0. 1. 0. 1. 1. 1. 0.
 0. 1. 0. 1. 1. 1. 1. 0. 0. 0. 1. 1. 1. 1. 1. 1. 0. 0. 0. 0. 1. 1. 0. 1.
 0. 0. 0. 0.]
615    1
821    1
874    1
403    0
956    1
      ..
932    1
449    0
339    0
191    0
361    0
Name: Colour, Length: 100, dtype: object

As you can see, they match perfectly, so the model is working. The problem comes in the last part, when I try to use the confusion_matrix() function from scikit-learn and get this error:

Traceback (most recent call last):
  File "G:\Work\Colours.py", line 101, in <module>
    Main()
  File "G:\Work\Colours.py", line 34, in Main
    RandForest(X_train, Y_train, X_test, Y_test)
  File "G:\Work\Colours.py", line 97, in RandForest
    print(confusion_matrix(Y_test, predictions))
  File "C:\Users\Me\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\metrics\classification.py", line 253, in confusion_matrix
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Users\Me\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\metrics\classification.py", line 81, in _check_targets
    "and {1} targets".format(type_true, type_pred))
ValueError: Classification metrics can't handle a mix of unknown and binary targets

What can I do to these two datasets so that the confusion_matrix() function does not raise any type errors?

Edit: predictions and Y_test both have the same shape, (100,).

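Best answer

You have to compare matrices with the same dimensions, so if predictions is a matrix with 1 column and 850 rows (for example), then Y_test must also be a matrix with 1 column and 850 rows.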

print(confusion_matrix(Y_test[1], predictions))
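
The printed Y_test in the question ends with `dtype: object`, which is a likely source of the "unknown" in the error message: scikit-learn treats an object-dtype array of non-string values as an unknown target type, even when the shapes match. Below is a minimal sketch of checking and fixing that, assuming the labels are already 0/1 as in the question. The small `Y_test` and `predictions` here are made-up stand-ins, not the question's real data.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-ins for the question's data: 0/1 labels stored in an
# object-dtype Series (as in the printed Y_test) and rounded float predictions.
Y_test = pd.Series([1, 0, 1, 0], dtype=object, name="Colour")
predictions = np.array([1.0, 0.0, 1.0, 1.0])

# Inspect how scikit-learn classifies each target before computing metrics.
print(type_of_target(Y_test))       # object dtype with non-string values
print(type_of_target(predictions))  # float array holding only 0.0 and 1.0

# Cast both sides to plain integers so they share one target type.
y_true = Y_test.astype(int)
y_pred = predictions.astype(int)

cm = confusion_matrix(y_true, y_pred)
print(cm)
```

Since the task is classification, switching from RandomForestRegressor to RandomForestClassifier would also remove the manual rounding step, though Y_train would likely need the same integer conversion before fitting.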

Regarding "python - ValueError: Classification metrics can't handle a mix of unknown and binary targets?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59269464/