python - Classifying non-MNIST images after training on MNIST

Tags: python image machine-learning classification mnist

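My machine learning algorithm has been trained on the 70000 images in the MNIST database. I want to test it on an image that is not part of the MNIST dataset. However, my predict call cannot read the array representation of my test image.

How do I test my algorithm on an external image? Why does my code fail?

PS: I am using python3.

Error received: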

Traceback (most recent call last):
  File "hello_world2.py", line 28, in <module>
    print(sgd_clf.predict(arr))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 336, in predict
    scores = self.decision_function(X)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 317, in decision_function
    % (X.shape[1], n_features))
ValueError: X has 15 features per sample; expecting 784

Code:

# Common Imports
import numpy as np
from sklearn.datasets import fetch_mldata
from sklearn.linear_model import SGDClassifier
from PIL import Image
from resizeimage import resizeimage   

# loading and learning MNIST data
mnist = fetch_mldata('MNIST original')
x, y = mnist["data"], mnist["target"]
sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(x, y)

# loading and converting to array a non-MNIST image of a "5", which is in the same folder
img = Image.open("5.png")
arr = np.array(img)

# trying to predict that the image is a "5"
img = Image.open("5.png")   
img = img.convert('L') #makes it greyscale
img = resizeimage.resize_thumbnail(img, [28,28])
arr = np.array(img)

print(sgd_clf.predict(arr)) # ERROR... why????????? How do you fix it?????

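Best answer

Resizing alone is not enough: the digit also needs to be centered, white on a black background, and so on. I have been working on a function for this job. Here is the current version, which uses opencv. It could be improved further, including by using PIL instead of opencv, but it should give an idea of the steps required.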

import cv2
import numpy as np


def open_as_mnist(image_path):
    """
    Assume this is a color or greyscale image of a digit which has not been preprocessed so far

    Open as greyscale
    Crop to the digit
    Resize to 20 x 20 (digit in center ideally)
    Blur, then threshold to a white digit on black
    Add a black border to make it 28 x 28
    """
    # open as greyscale
    image = cv2.imread(image_path, 0)

    # crop to contour with largest area
    cropped = do_cropping(image)

    # resizing the image to 20 x 20
    resized20 = cv2.resize(cropped, (20, 20), interpolation=cv2.INTER_CUBIC)

    cv2.imwrite('1_resized.jpg', resized20)

    # gaussian filtering
    blurred = cv2.GaussianBlur(resized20, (3, 3), 0)

    # white digit on black background
    ret, thresh = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY_INV)

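    # to20by20, like do_cropping above, is a helper that is not shown in this answer
    padded = to20by20(thresh)

    # pad with a black border to 28 x 28
    resized28 = padded_image(padded, 28)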

    # normalize the image values to fit in the range [0,1]
    norm_image = np.asarray(resized28, dtype=np.float32) / 255.

    # cv2.imshow('image', norm_image)
    # cv2.waitKey(0)

    # flatten the image to a single row of 784 features, the shape predict() expects
    flat = norm_image.reshape(1, 28 * 28)

    # thresh is already a white digit on black and norm_image is already in [0, 1],
    # so no further inversion or scaling is needed here
    return flat



def padded_image(image, tosize):
    """
    This method adds padding to the image and makes it to a tosize x tosize array,
    without losing the aspect ratio.
    Assumes desired image is square

    :param image: the input image as numpy array
    :param tosize: the final dimensions
    """

    # image dimensions
    image_height, image_width = image.shape


    # if not already square then pad to square
    if image_height != image_width:

        # Add padding
        # The aim is to turn an image with different width and height into a square image
        # For that, first determine which of height and width is larger.
        max_index = np.argmax([image_height, image_width])


        # if height is the biggest one, then add padding to width until width becomes
        # equal to height
        if max_index == 0:
            #order of padding is: top, bottom, left, right
            left = int((image_height - image_width) / 2)
            right = image_height - image_width - left
            padded_img = cv2.copyMakeBorder(image, 0, 0,
                                            left,
                                            right,
                                            cv2.BORDER_CONSTANT)

        # else if width is the biggest one, then add padding to height until height becomes
        # equal to width
        else:
            top = int((image_width - image_height) / 2)
            bottom = image_width - image_height - top
            padded_img = cv2.copyMakeBorder(image, top, bottom, 0, 0,  cv2.BORDER_CONSTANT)
    else:
        padded_img = image


    # now that it's a square, add any additional padding required
    image_height, image_width = padded_img.shape
    padding = tosize - image_height

    # need to handle the case where padding is not divisible by 2
    left = top = int(padding/2)
    right = bottom = padding - left
    # pad the squared image, not the original input
    resized = cv2.copyMakeBorder(padded_img, top, bottom, left, right, cv2.BORDER_CONSTANT)


    return resized
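
The ValueError in the question comes from the array shape: predict() reads a 2-D array as (n_samples, n_features), so a 28 x 15 image (presumably what resize_thumbnail produced, since it preserves the aspect ratio) is seen as 28 samples with 15 features each. Below is a minimal numpy-only sketch of the crop, center and flatten steps. The names crop_to_digit, center_on_canvas and as_sample are hypothetical and are not the do_cropping or to20by20 helpers from the answer. It assumes a white digit on a black background and skips the resize to 20 x 20.

```python
import numpy as np


def crop_to_digit(image, background=0):
    """Crop a 2-D array to the bounding box of its non-background pixels."""
    rows = np.any(image != background, axis=1)
    cols = np.any(image != background, axis=0)
    if not rows.any():
        return image  # blank image: nothing to crop
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]
    return image[top:bottom + 1, left:right + 1]


def center_on_canvas(image, size=28):
    """Paste a 2-D array into the middle of a black size x size canvas."""
    height, width = image.shape
    if height > size or width > size:
        raise ValueError("image must fit inside the canvas")
    canvas = np.zeros((size, size), dtype=image.dtype)
    top = (size - height) // 2
    left = (size - width) // 2
    canvas[top:top + height, left:left + width] = image
    return canvas


def as_sample(image28):
    """Flatten a 28 x 28 array into one row of 784 features."""
    return image28.reshape(1, -1)


# Toy white-on-black "digit": a 6 x 4 block of ink inside a 28 x 15 image
digit = np.zeros((28, 15), dtype=np.uint8)
digit[3:9, 2:6] = 255

print(digit.shape)   # (28, 15): predict() would see 28 samples of 15 features
sample = as_sample(center_on_canvas(crop_to_digit(digit)))
print(sample.shape)  # (1, 784): one sample with 784 features
```

An array shaped like sample can be passed to sgd_clf.predict. If the classifier was fit on the raw 0 to 255 pixel values, as in the question, the sample should probably stay on that scale rather than being divided by 255.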

Regarding python - Classifying non-MNIST images after training on MNIST, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44816314/
