python - ValueError: Found arrays with inconsistent numbers of samples

Here is my code:

import pandas as pa
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score

def get_accuracy(X_train, y_train, y_test):
    perceptron = Perceptron(random_state=241)
    perceptron.fit(X_train, y_train)
    result = accuracy_score(y_train, y_test)
    return result

test_data = pa.read_csv("C:/Users/Roman/Downloads/perceptron-test.csv")
test_data.columns = ["class", "f1", "f2"]
train_data = pa.read_csv("C:/Users/Roman/Downloads/perceptron-train.csv")
train_data.columns = ["class", "f1", "f2"]

accuracy = get_accuracy(train_data[train_data.columns[1:]], train_data[train_data.columns[0]], test_data[test_data.columns[0]])
print(accuracy)

I don't understand why I get this error:

Traceback (most recent call last):
  File "C:/Users/Roman/PycharmProjects/data_project-1/lecture_2_perceptron.py", line 35, in <module>
    accuracy = get_accuracy(train_data[train_data.columns[1:]], train_data[train_data.columns[0]], test_data[test_data.columns[0]])
  File "C:/Users/Roman/PycharmProjects/data_project-1/lecture_2_perceptron.py", line 22, in get_accuracy
    result = accuracy_score(y_train, y_test)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\metrics\classification.py", line 172, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\metrics\classification.py", line 72, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\utils\validation.py", line 176, in check_consistent_length
    "%s" % str(uniques))
ValueError: Found arrays with inconsistent numbers of samples: [199 299]

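I want to compute the accuracy with accuracy_score, but the call fails with the error above. I searched Google and could not find anything that helped. Can anyone explain what is going on?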

Best answer

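sklearn.metrics.accuracy_score() takes y_true and y_pred arguments. That is, for the same data set (presumably the test set), it wants both the ground truth and the values your model predicted. That lets it measure how your model performs compared with a hypothetical perfect model.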
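As a minimal sketch of that contract, the snippet below scores a handful of made-up labels. The values in y_true and y_pred are illustrative assumptions, not data from the question.

```python
from sklearn.metrics import accuracy_score

# Illustrative labels only: ground truth and predictions for the SAME five samples.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Fraction of positions where the prediction equals the ground truth (4 of 5 here).
score = accuracy_score(y_true, y_pred)
print(score)
```

Both sequences describe the same samples in the same order, so they must have the same length.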

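In your code, you pass the true outcome variables of two different data sets: the training labels and the test labels. The two sets have different numbers of rows (199 and 299), which is what triggers the ValueError. Even if the lengths happened to match, both arguments are ground truth, so the score would say nothing about your model's ability to classify observations correctly!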
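To see the failure in isolation, the sketch below passes two label arrays of different lengths to accuracy_score. The arrays are placeholders (all zeros) sized 199 and 299 to match the traceback, and the exact wording of the message may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Placeholder label arrays with mismatched lengths (assumed sizes from the traceback).
labels_a = np.zeros(199, dtype=int)
labels_b = np.zeros(299, dtype=int)

try:
    accuracy_score(labels_a, labels_b)
    raised = False
except ValueError as exc:
    # scikit-learn checks that both inputs have the same number of samples.
    raised = True
    print("ValueError:", exc)
```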

Update your get_accuracy() function to take X_test as an argument as well, which I think is closer to what you intended:

def get_accuracy(X_train, y_train, X_test, y_test):
    perceptron = Perceptron(random_state=241)
    perceptron.fit(X_train, y_train)
    pred_test = perceptron.predict(X_test)
    result = accuracy_score(y_test, pred_test)
    return result
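A possible way to call the updated function is sketched below. Because the original CSV files are not available here, the make_frame helper and the synthetic data it produces are hypothetical stand-ins; only the column names "class", "f1" and "f2" come from the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score


def get_accuracy(X_train, y_train, X_test, y_test):
    # Fit on the training set, then score predictions made on the test set.
    perceptron = Perceptron(random_state=241)
    perceptron.fit(X_train, y_train)
    pred_test = perceptron.predict(X_test)
    return accuracy_score(y_test, pred_test)


def make_frame(n_rows, seed):
    # Hypothetical helper: builds synthetic data in place of the CSV files.
    rng = np.random.default_rng(seed)
    f1 = rng.normal(size=n_rows)
    f2 = rng.normal(size=n_rows)
    label = np.where(f1 + f2 > 0, 1, -1)
    return pd.DataFrame({"class": label, "f1": f1, "f2": f2})


train_data = make_frame(299, seed=0)
test_data = make_frame(199, seed=1)

# Features and labels are passed for BOTH sets; the sets may differ in size.
accuracy = get_accuracy(
    train_data[["f1", "f2"]], train_data["class"],
    test_data[["f1", "f2"]], test_data["class"],
)
print(accuracy)
```

The training and test sets can have different numbers of rows here, because accuracy_score only ever compares the test labels with the predictions made for those same test rows.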

A similar question about "python - ValueError: Found arrays with inconsistent numbers of samples" can be found on Stack Overflow: https://stackoverflow.com/questions/35247687/
