python - ValueError: Found arrays with inconsistent numbers of samples

Tags: python pandas machine-learning scikit-learn perceptron

Here is my code:

import pandas as pa
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score

def get_accuracy(X_train, y_train, y_test):
    perceptron = Perceptron(random_state=241)
    perceptron.fit(X_train, y_train)
    result = accuracy_score(y_train, y_test)
    return result

test_data = pa.read_csv("C:/Users/Roman/Downloads/perceptron-test.csv")
test_data.columns = ["class", "f1", "f2"]
train_data = pa.read_csv("C:/Users/Roman/Downloads/perceptron-train.csv")
train_data.columns = ["class", "f1", "f2"]

accuracy = get_accuracy(train_data[train_data.columns[1:]], train_data[train_data.columns[0]], test_data[test_data.columns[0]])
print(accuracy)

I don't understand why I get this error:

Traceback (most recent call last):
  File "C:/Users/Roman/PycharmProjects/data_project-1/lecture_2_perceptron.py", line 35, in <module>
    accuracy = get_accuracy(train_data[train_data.columns[1:]], train_data[train_data.columns[0]], test_data[test_data.columns[0]])
  File "C:/Users/Roman/PycharmProjects/data_project-1/lecture_2_perceptron.py", line 22, in get_accuracy
    result = accuracy_score(y_train, y_test)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\metrics\classification.py", line 172, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\metrics\classification.py", line 72, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\utils\validation.py", line 176, in check_consistent_length
    "%s" % str(uniques))
ValueError: Found arrays with inconsistent numbers of samples: [199 299]

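I want to compute the accuracy with accuracy_score, but I get the error above instead. I searched on Google and could not find anything that helped. Can anyone explain what is going on?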

Best answer

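sklearn.metrics.accuracy_score() takes y_true and y_pred arguments. That is, for one and the same dataset (presumably the test set), it wants the ground truth and the values your model predicted. This lets it evaluate how your model performs compared with a hypothetical perfect model.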
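To make the two arguments concrete, here is a minimal sketch. The label values are made up for illustration and do not come from the CSV files in the question; the point is only that both lists describe the same four samples.

```python
from sklearn.metrics import accuracy_score

# Hypothetical labels for one four-sample dataset (not from the question's CSVs).
y_true = [1, -1, 1, 1]    # ground truth for the four samples
y_pred = [1, -1, -1, 1]   # what some model predicted for the same four samples

# accuracy_score returns the fraction of positions where the two agree: 3 of 4.
score = accuracy_score(y_true, y_pred)
print(score)  # 0.75
```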

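In your code, you pass the true outcome variables of two different datasets. Both are ground truth, so comparing them says nothing about your model's ability to classify observations correctly! And because the two datasets have different numbers of rows (199 and 299, as the error message shows), the length check inside accuracy_score fails with the ValueError.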
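The length mismatch can be reproduced in isolation. This is a sketch under assumptions: the arrays below are stand-ins filled with ones, and only their lengths (199 and 299, taken from the traceback) matter. The exact wording of the error message may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Stand-in label vectors with the lengths reported in the traceback.
# The values are made up; only the differing lengths matter here.
labels_a = np.ones(199, dtype=int)
labels_b = np.ones(299, dtype=int)

try:
    accuracy_score(labels_a, labels_b)
    raised = False
except ValueError as exc:
    # scikit-learn rejects inputs whose sample counts differ.
    raised = True
    print(exc)
```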

Update your get_accuracy() function to also take X_test as a parameter, which I think is closer to what you intended:

def get_accuracy(X_train, y_train, X_test, y_test):
    perceptron = Perceptron(random_state=241)
    perceptron.fit(X_train, y_train)
    pred_test = perceptron.predict(X_test)
    result = accuracy_score(y_test, pred_test)
    return result
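A possible way to call the updated function is sketched below. Since the original CSV files are not available here, the two small DataFrames are invented stand-ins that only mimic the question's column layout ("class", "f1", "f2"); with the real data you would pass the corresponding columns of train_data and test_data instead.

```python
import pandas as pd
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score

def get_accuracy(X_train, y_train, X_test, y_test):
    perceptron = Perceptron(random_state=241)
    perceptron.fit(X_train, y_train)
    # Predict on the test features, then compare with the test labels.
    pred_test = perceptron.predict(X_test)
    return accuracy_score(y_test, pred_test)

# Made-up data in the question's column layout; not the real CSV contents.
train_data = pd.DataFrame({
    "class": [1, 1, 1, -1, -1, -1],
    "f1": [2.0, 3.0, 2.5, -2.0, -3.0, -2.5],
    "f2": [1.0, 2.0, 1.5, -1.0, -2.0, -1.5],
})
test_data = pd.DataFrame({
    "class": [1, -1, 1, -1],
    "f1": [2.2, -2.2, 3.1, -3.1],
    "f2": [1.1, -1.1, 2.1, -2.1],
})

# Train and test sets may differ in size; y_test and the predictions
# always have the same length because both come from the test set.
accuracy = get_accuracy(
    train_data[["f1", "f2"]], train_data["class"],
    test_data[["f1", "f2"]], test_data["class"],
)
print(accuracy)
```

Note that the training and test sets have different numbers of rows here as well, which is fine: accuracy_score only ever sees two arrays derived from the test set.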

This question about "python - ValueError: Found arrays with inconsistent numbers of samples" comes from Stack Overflow: https://stackoverflow.com/questions/35247687/
