machine-learning - CNN trained on the MNIST dataset performs poorly at digit recognition

Tags: machine-learning tensorflow conv-neural-network mnist

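I trained a CNN (in TensorFlow) for digit recognition using the MNIST dataset. Accuracy on the test set is close to 98%.

I wanted to predict digits using data I created myself, but the results are bad.

What did I do to the images I wrote myself?

I segmented out each digit, converted it to grayscale, resized the image to 28x28, and fed it to the model.

Why is accuracy so low on my own data set when it is so high on the test set?

Are there other modifications I should make to the images?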

Edit:

Here is a link to the images, with some examples:

[Images: digit 7, digit 5, digit 9, digit 6]

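Best answer

Excluding bugs and obvious mistakes, my guess is that you capture your handwritten digits in a way that differs too much from your training set.

When capturing your data, you should try to mimic as closely as possible the process used to create the MNIST dataset.

From the official MNIST dataset website: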

The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. the images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.

If your data is processed differently in the training and test phases, your model cannot generalize from the training data to the test data.
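The normalization described in the quote can be sketched roughly as follows. This is a minimal illustration rather than the original NIST procedure: the function names `resize_bilinear` and `to_mnist_style` are hypothetical, the input is assumed to be a 2-D grayscale array with values 0..255, and `invert=True` assumes dark ink on light paper (MNIST digits are bright strokes on a dark background). The resize here is plain linear interpolation with no anti-aliasing filter, so a library resize with anti-aliasing would likely match MNIST more closely.

```python
import numpy as np


def resize_bilinear(img, new_h, new_w):
    """Resize a 2-D array with separable linear interpolation (no anti-aliasing)."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, new_h)  # sample rows in source coordinates
    xs = np.linspace(0, w - 1, new_w)  # sample columns in source coordinates
    # Interpolate along x for every row, then along y for every column.
    tmp = np.stack([np.interp(xs, np.arange(w), row) for row in img])
    return np.stack([np.interp(ys, np.arange(h), col) for col in tmp.T], axis=1)


def to_mnist_style(gray, invert=True, threshold=0.1):
    """Return a 28x28 float32 image in [0, 1], laid out like an MNIST digit.

    gray: 2-D array with values 0..255 (assumption).
    invert: set True for dark ink on light paper (assumption).
    threshold: minimum intensity treated as ink when cropping (assumption).
    """
    img = gray.astype(np.float32) / 255.0
    if invert:
        img = 1.0 - img  # make the digit bright and the background dark

    # Crop to the bounding box of the ink pixels.
    mask = img > threshold
    if not mask.any():
        return np.zeros((28, 28), dtype=np.float32)
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    img = img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Scale so the longer side is 20 px, preserving the aspect ratio.
    h, w = img.shape
    scale = 20.0 / max(h, w)
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    digit = resize_bilinear(img, new_h, new_w)

    total = digit.sum()
    if total <= 0:
        return np.zeros((28, 28), dtype=np.float32)

    # Center of mass of the scaled digit, in its own coordinates.
    cy = (digit.sum(axis=1) * np.arange(new_h)).sum() / total
    cx = (digit.sum(axis=0) * np.arange(new_w)).sum() / total

    # Translate so the center of mass lands near the center of the 28x28 field.
    top = min(max(int(round(13.5 - cy)), 0), 28 - new_h)
    left = min(max(int(round(13.5 - cx)), 0), 28 - new_w)
    canvas = np.zeros((28, 28), dtype=np.float32)
    canvas[top:top + new_h, left:left + new_w] = digit
    return canvas
```

Each segmented digit would go through `to_mnist_style` before prediction, then be reshaped to whatever input shape the model expects. This assumes the model was trained on inputs scaled to [0, 1]; if training used raw 0..255 values, the output would need to be rescaled accordingly.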

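So I have two suggestions for you:

Try to capture and process your digit images so that they look as similar as possible to the MNIST dataset;
Add some of your own examples to the training data, so that your model can train on images similar to the ones you want to classify.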
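For the second suggestion, one possible way to fold your own labeled digits into the training arrays is sketched below. The helper name `mix_in_own_samples` and the `repeat` oversampling factor are assumptions made for illustration; the answer itself only says to add examples, not how many or how to weight them. The images are assumed to be preprocessed already to the same shape and value range as the MNIST arrays.

```python
import numpy as np


def mix_in_own_samples(x_mnist, y_mnist, x_own, y_own, repeat=10, seed=0):
    """Append own samples to the MNIST arrays and shuffle the result.

    repeat: how many copies of each own sample to add (assumption), so that
    a handful of custom digits is not drowned out by the MNIST training images.
    """
    x_extra = np.repeat(x_own, repeat, axis=0)
    y_extra = np.repeat(y_own, repeat, axis=0)
    x_all = np.concatenate([x_mnist, x_extra], axis=0)
    y_all = np.concatenate([y_mnist, y_extra], axis=0)
    # Shuffle images and labels with the same permutation.
    order = np.random.default_rng(seed).permutation(len(x_all))
    return x_all[order], y_all[order]
```

Some of your own digits should be held out and not added to the training arrays, so that accuracy on your own data can still be measured on images the model has not seen.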

For more on "machine-learning - CNN trained on the MNIST dataset performs poorly at digit recognition", see the similar question on Stack Overflow: https://stackoverflow.com/questions/40627099/
