python - How to fix "ValueError: Found input variables with inconsistent numbers of samples: [10000, 60000]"?

Tags: python machine-learning scikit-learn mnist

I'm running into a problem when training a classifier with stochastic gradient descent on the MNIST dataset.

    from sklearn.datasets import fetch_mldata
    from sklearn.linear_model import SGDClassifier

    mnist = fetch_mldata('MNIST original')
    X, y = mnist["data"], mnist["target"]

    some_digit = X[36000]
    some_digit_image = some_digit.reshape(28, 28)

    X_train, X_train, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]


    y_train_5 = (y_train == 5)
    y_test_5 = (y_test == 5)

    sgd_clf = SGDClassifier(random_state=42)
    sgd_clf.fit(X_train, y_train_5)

The process ends with an error (I think the last piece of code is the problem):

  ValueError: Found input variables with inconsistent numbers of samples: [10000, 60000]

Best answer

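This is a typo on your side: you assign to X_train twice:

    X_train, X_train, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]

The second assignment overwrites the first, so X_train ends up holding the 10000 test samples while y_train_5 still has 60000 labels. That is exactly the [10000, 60000] mismatch reported in the error. The correct line is:

    X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]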
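To see why the duplicated name produces that mismatch, here is a minimal sketch using plain Python lists as a stand-in for the MNIST arrays (70 items instead of 70000). The helper name `split_checked` is hypothetical, not part of scikit-learn; it just shows one way to fail early when a features/labels pair does not line up.

```python
def split_checked(features, labels, n_train):
    """Split features and labels at n_train and verify the pairs line up.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    X_train, X_test = features[:n_train], features[n_train:]
    y_train, y_test = labels[:n_train], labels[n_train:]
    # Fail early, with the offending lengths, if a pair is mismatched.
    assert len(X_train) == len(y_train), (len(X_train), len(y_train))
    assert len(X_test) == len(y_test), (len(X_test), len(y_test))
    return X_train, X_test, y_train, y_test


# Small stand-in data: 70 samples split 60/10, mirroring MNIST's 60000/10000.
features = list(range(70))
labels = list(range(70))

# The buggy unpacking: targets are assigned left to right,
# so the second X_train overwrites the first.
X_train, X_train, y_train, y_test = features[:60], features[60:], labels[:60], labels[60:]
buggy_lengths = (len(X_train), len(y_train))
print(buggy_lengths)  # (10, 60)

# The corrected split, with the length check applied.
X_train, X_test, y_train, y_test = split_checked(features, labels, 60)
fixed_lengths = (len(X_train), len(X_test), len(y_train), len(y_test))
print(fixed_lengths)  # (60, 10, 60, 10)
```

Python does not complain about a repeated name on the left-hand side of an unpacking assignment, which is why the typo only surfaces later, inside `fit`.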

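By the way, fetch_mldata is deprecated (and has been removed from newer scikit-learn releases), so it is a better idea to use fetch_openml instead:

    from sklearn.datasets import fetch_openml
    X, y = fetch_openml("mnist_784", version=1, return_X_y=True)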
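One thing to watch for when switching loaders: as far as I know, fetch_openml returns the MNIST labels as strings ("5") rather than numbers, so a comparison like `y_train == 5` from the question would match nothing. The sketch below uses a short hypothetical list of labels as a stand-in for the real target and shows the conversion in plain Python; it does not download the dataset.

```python
# Hypothetical stand-in for a few labels in the string form described above.
raw_labels = ["5", "0", "4", "5"]

# Comparing a string with the integer 5 never matches.
naive_mask = [label == 5 for label in raw_labels]

# Convert to integers first, then build the binary "is it a 5?" target.
int_labels = [int(label) for label in raw_labels]
is_five = [label == 5 for label in int_labels]

print(naive_mask)  # [False, False, False, False]
print(is_five)     # [True, False, False, True]
```

With a NumPy array or pandas Series the equivalent conversion would be something like `y = y.astype(int)` before computing `y_train_5` and `y_test_5`.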

Regarding python - How to fix "ValueError: Found input variables with inconsistent numbers of samples: [10000, 60000]"?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54259778/
