python - 使用 python 和 tensorflow 从图像中识别数字

详细信息:Ubuntu 14.04(LTS)、OpenCV 2.4.13、Spyder 2.3.9(Python 2.7)、Tensorflow r0.10

我想识别号码来自 the image使用 Python 和 Tensorflow(可选 OpenCV)。

此外，我想使用 MNIST 数据训练和 tensorflow

像这样(代码引用了this page的视频)，

代码:

import tensorflow as tf
import random

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder("float", [None, 784])
y = tf.placeholder("float", [None, 10])

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

### modeling ###

activation = tf.nn.softmax(tf.matmul(x, W) + b)

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(activation), reduction_indices=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)

init = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)

### training ###

for epoch in range(training_epochs) :

    avg_cost = 0
    total_batch = int(mnist.train.num_examples/batch_size)

    for i in range(total_batch) :

        batch_xs, batch_ys =mnist.train.next_batch(batch_size)
        sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
        avg_cost += sess.run(cross_entropy, feed_dict = {x: batch_xs, y: batch_ys}) / total_batch

    if epoch % display_step == 0 :
        print "Epoch : ", "%04d" % (epoch+1), "cost=", "{:.9f}".format(avg_cost)

print "Optimization Finished"

### predict number ###

r = random.randint(0, mnist.test.num_examples - 1)
print "Prediction: ", sess.run(tf.argmax(activation,1), {x: mnist.test.images[r:r+1]})
print "Correct Answer: ", sess.run(tf.argmax(mnist.test.labels[r:r+1], 1))

但是，问题是我怎样才能使 numpy 数组像

代码补充:

mnist.test.images[r:r+1]

[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 0.50196081 0.50196081 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 1. 1. 1. 1. 0.50196081 0.25098041 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.25098041 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 1. 1. 0.50196081 0.50196081 0.50196081 0.74901962 1. 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 0.74901962 0.25098041 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0.25098041 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 1. 0.25098041 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 0.74901962 1. 1. 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0.74901962 0. 0. 0. 0. 0. 0.25098041 0.50196081 1. 1. 1. 1. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 1. 1. 0.50196081 0.50196081 0.74901962 1. 1. 1. 1. 1. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 1. 1. 1. 1. 1. 0.50196081 0.25098041 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 0.50196081 0.50196081 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]]

当我使用 OpenCV 解决问题时，我可以制作关于图像的 numpy 数组，但有点奇怪。 (我想把array做成一个28x28的vector)

代码补充:

image = cv2.imread("img_easy.jpg")
resized_image = cv2.resize(image, (28, 28))

[[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

...,

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]]

然后，我将 value('resized_image') 放入 Tensorflow 代码中。像这样，

代码修改:

### predict number ###

print "Prediction: ", sess.run(tf.argmax(activation,1), {x: resized_image})
print "Correct Answer: 9"

因此，错误发生在这一行。

ValueError: Cannot feed value of shape (28, 28, 3) for Tensor u'Placeholder_2:0', which has shape '(?, 784)'

最后，

1)我想知道如何制作一个可以输入tensorflow代码的数据(可能是numpy数组[784])

2)你知道使用tensorflow的数字识别例子吗？

我是机器学习的初学者。

请详细告诉我我该怎么做。

最佳答案

您使用的图像似乎是 RGB，因此是第 3 维 (28,28,3)。

因为原始 MNIST 图像是灰度，宽度和高度为 28。这就是为什么 x 占位符的形状是 [None, 784]，因为 28*28= 784。

CV2 以 RGB 格式读取图像，而您希望它是灰度图像，即 (28,28) 在执行 imread 时，您可能会发现使用它很有帮助。

image = cv2.imread("img_easy.jpg", cv2.CV_LOAD_IMAGE_GRAYSCALE)

通过这样做，您的图像应该具有正确的形状 (28, 28)。

此外，CV2 图像值与您的问题中显示的 MNIST 图像不在同一范围内。您可能需要对图像中的值进行归一化，使它们在 0-1 范围内。

此外，您可能还想为此使用 CNN(稍微高级一些，但应该会提供更好的结果)。看本页教程https://www.tensorflow.org/tutorials/了解更多详情。

关于python - 使用 python 和 tensorflow 从图像中识别数字，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/39032277/

python - 使用 python 和 tensorflow 从图像中识别数字

上一篇：scala - sbt 下载组件失败

下一篇：.net - Ubuntu 14.04 上的 "Could not resolve coreclr"路径