python - Recognizing and extracting handwritten digits in order with Python OpenCV

Tags: python image opencv image-processing computer-vision

I want to extract the digits inside the boxes, in order.

Original image: [image]

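I used the watershed algorithm to separate the digits that touch the box borders, but it does not outline the digits correctly; it only picks up parts of each digit.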

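# Imports needed by the snippet below
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Find the big box that contains the smaller boxes
img = cv2.imread('1_6.png',0)
img = cv2.GaussianBlur(img,(3,3),1)
_,img  = cv2.threshold(img,240,255,cv2.THRESH_BINARY)
img = cv2.GaussianBlur(img,(11,11),1)
edges = cv2.Canny(img,100,200)
# findContours returns 2 values in OpenCV 2.4/4.x and 3 values in OpenCV 3.x
cnts = cv2.findContours(edges.copy(),cv2.RETR_CCOMP,cv2.CHAIN_APPROX_NONE)
c = cnts[0] if len(cnts) == 2 else cnts[1]
img = cv2.imread('1_6.png')
temp_c = sorted(c,key=cv2.contourArea,reverse=True)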

#Select the big box
epsilon = 0.0001*cv2.arcLength(temp_c[0],True)
approx = cv2.approxPolyDP(temp_c[0],epsilon,True)

#Crop big box
pts = approx.copy()
rect = cv2.boundingRect(pts)
x,y,w,h = rect
croped = img[y:y+h, x:x+w].copy()

## (2) make mask
pts = pts - pts.min(axis=0)

mask = np.ones(croped.shape[:2], np.uint8)
cv2.drawContours(mask, [pts], -1, (255, 255, 255), -1, cv2.LINE_AA)

## (3) do bit-op
dst = cv2.bitwise_and(croped, croped, mask=mask)


gray = cv2.cvtColor(dst,cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
kernel = np.ones((1,1),np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_CLOSE,kernel, iterations = 2)

sure_bg = cv2.dilate(opening,kernel,iterations=1)

dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)

ret, sure_fg = cv2.threshold(dist_transform,0.3*dist_transform.max(),255,0)

sure_fg = np.uint8(sure_fg)

unknown = cv2.subtract(sure_bg,sure_fg)

ret, markers = cv2.connectedComponents(sure_fg)

# Add one to all labels so that sure background is not 0, but 1
markers = markers+1

# Now, mark the region of unknown with zero
markers[unknown==255] = 0

plt.imshow(markers,cmap="gray")

img = dst.copy()
markers = cv2.watershed(dst,markers)
img[markers == -1] = [0,0,255]

Current result: [image]

Best answer

Here is my approach. I will try to be as detailed as possible:

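  • Convert the image to grayscale
  • Perform Canny edge detection
  • Remove the horizontal and vertical lines to isolate the characters
  • Perform morphological operations to enhance the characters
  • Find contours
  • Filter the contours by contour area and aspect ratio
  • Sort the contours from left to right so the digits are extracted in order
  • Iterate over the sorted contours and extract each ROI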

First, we perform Canny edge detection using cv2.Canny().

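The next goal is to remove the vertical and horizontal lines so that we can isolate the digits. We start by creating several kernels, one each for the vertical direction, the horizontal direction, and general-purpose use: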

vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,2))
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,1))
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3,3))

We first remove the horizontal lines using cv2.erode().
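To see why eroding with the 1x2 vertical kernel wipes out thin horizontal lines, here is a small pure-Python sketch of binary erosion on a toy grid. It is an illustration only, not OpenCV's implementation: the helper name `erode_vertical` is made up, and the sketch assumes the kernel covers each pixel plus the pixel directly above it, with pixels outside the grid treated as set. As far as I can tell this mirrors how cv2.erode() handles such a kernel by default, but treat that as an assumption.

```python
def erode_vertical(grid):
    """Binary erosion with a 1-wide, 2-tall kernel (hypothetical helper).

    A pixel stays 1 only if it and the pixel directly above it are both 1.
    Pixels outside the grid count as 1, so the image border itself does
    not erode anything (an assumption of this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            above = grid[r - 1][c] if r > 0 else 1
            out[r][c] = 1 if grid[r][c] == 1 and above == 1 else 0
    return out


# Toy edge map: a 1-pixel-thick horizontal line (row 1) and a
# 3-pixel-tall vertical stroke (column 2, rows 3 to 5).
edge_map = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

eroded = erode_vertical(edge_map)
for row in eroded:
    print(row)
```

The horizontal line is only one pixel tall, so no pixel on it has a set neighbour above and the whole line disappears. The vertical stroke loses its top pixel but otherwise survives, which is why the digits (made mostly of taller strokes) are thinned rather than erased.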

Now we dilate the vertical lines using cv2.dilate().

Next we remove the vertical lines.

Notice that almost nothing is left at this point, so we have to recover the digits by dilating.
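The recovery step can be pictured with a pure-Python sketch of binary dilation using a 3x3 cross-shaped neighbourhood. This is a simplified illustration rather than OpenCV's code; `dilate_cross` is a hypothetical helper, and pixels outside the grid are simply ignored.

```python
def dilate_cross(grid, iterations=1):
    """Binary dilation with a 3x3 cross (hypothetical helper).

    On each pass, every set pixel also sets its four direct neighbours
    (up, down, left, right) when they lie inside the grid.
    """
    rows, cols = len(grid), len(grid[0])
    for _ in range(iterations):
        out = [row[:] for row in grid]
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1:
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols:
                            out[rr][cc] = 1
        grid = out
    return grid


# A single surviving pixel in the middle of a 7x7 grid.
fragment = [[0] * 7 for _ in range(7)]
fragment[3][3] = 1

once = dilate_cross(fragment, iterations=1)
twice = dilate_cross(fragment, iterations=2)

print(sum(sum(row) for row in once))   # 5 pixels: a plus shape
print(sum(sum(row) for row in twice))  # 13 pixels: a diamond of radius 2
```

Each pass grows every fragment by one pixel in the four main directions, so nearby fragments of the same digit merge into one blob that cv2.findContours() can pick up as a single contour. Too many iterations would also merge neighbouring digits, so the iteration count is something to tune per image.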

From here we find contours using cv2.findContours(). We then filter them by cv2.contourArea() and by aspect ratio to obtain the bounding boxes of the digits.
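The filtering rule is easier to reason about when separated from OpenCV. The sketch below applies the same thresholds as the full code (aspect ratio between 0.4 and 1.3, area above 150) to plain tuples. The helper name `keep_digit_boxes` and the sample values are hypothetical, chosen only to show one digit-like box, one leftover line, and one speck of noise.

```python
def keep_digit_boxes(candidates, min_area=150, min_ratio=0.4, max_ratio=1.3):
    """Keep candidates whose shape and size look like a digit.

    Each candidate is (x, y, w, h, area), where x, y, w, h stand in for a
    cv2.boundingRect() result and area for a cv2.contourArea() result.
    """
    kept = []
    for x, y, w, h, area in candidates:
        aspect_ratio = w / float(h)
        if min_ratio <= aspect_ratio <= max_ratio and area > min_area:
            kept.append((x, y, w, h, area))
    return kept


candidates = [
    (10, 12, 20, 30, 400),  # ratio 0.67, large enough: digit-like
    (0, 50, 300, 4, 900),   # ratio 75: remnant of a long horizontal line
    (60, 15, 5, 6, 20),     # ratio 0.83 but tiny area: noise
    (90, 10, 24, 28, 450),  # ratio 0.86, large enough: digit-like
]

kept = keep_digit_boxes(candidates)
print(kept)
```

The thresholds depend on the image resolution and handwriting size, so they will likely need adjusting for other inputs.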

Now, to extract the digits in order, we use imutils.contours.sort_contours().
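cv2.findContours() does not promise to return contours in reading order, which is why the sort is needed. As I understand it, the 'left-to-right' method orders contours by the x coordinate of their bounding boxes. A minimal sketch of that idea with plain tuples follows; `sort_left_to_right` is a hypothetical stand-in, not the imutils function itself.

```python
def sort_left_to_right(items, boxes):
    """Order items by the x coordinate of their (x, y, w, h) boxes."""
    paired = sorted(zip(items, boxes), key=lambda pair: pair[1][0])
    return [item for item, _ in paired]


# Boxes in the arbitrary order a contour search might return them.
boxes = [(120, 8, 20, 30), (15, 10, 22, 31), (70, 9, 18, 29)]
labels = ["third", "first", "second"]

ordered = sort_left_to_right(labels, boxes)
print(ordered)  # ['first', 'second', 'third']
```

Sorting by x alone is enough when the digits sit on a single row. For several rows, the boxes would first have to be grouped by row (for example by their y coordinate) before sorting within each row.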

Finally, we extract the ROI for each digit and save it as an image. [Image: screenshot of the ROIs saved in order]

import cv2
import numpy as np
from imutils import contours

image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
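# Thresholds 130 and 255; the optional fourth positional argument is the
# output array, so it is left out here
canny = cv2.Canny(gray, 130, 255)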

vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,2))
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,1))
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3,3))
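# Erode with the vertical kernel: thin horizontal lines disappear
erode = cv2.erode(canny, vertical_kernel)
cv2.imshow('remove horizontal', erode)
# Dilate vertically to strengthen what is left of the vertical strokes
dilate = cv2.dilate(erode, vertical_kernel, iterations=5)
cv2.imshow('dilate vertical', dilate)
# Erode with the horizontal kernel: thin vertical lines disappear
erode = cv2.erode(dilate, horizontal_kernel, iterations=1)
cv2.imshow('remove vertical', erode)
# Dilate with the cross kernel to grow the remaining digit fragments back
dilate = cv2.dilate(erode, kernel, iterations=4)
cv2.imshow('dilate horizontal', dilate)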

cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

digit_contours = []
for c in cnts:
    area = cv2.contourArea(c)
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.01 * peri, True)
    x,y,w,h = cv2.boundingRect(approx)
    aspect_ratio = w / float(h)

    if (aspect_ratio >= 0.4 and aspect_ratio <= 1.3):
        if area > 150:
            ROI = original[y:y+h, x:x+w]
            cv2.rectangle(image,(x,y),(x+w,y+h),(0,255,0),2)
            digit_contours.append(c)

sorted_digit_contours = contours.sort_contours(digit_contours, method='left-to-right')[0]
contour_number = 0
for c in sorted_digit_contours:
    x,y,w,h = cv2.boundingRect(c)
    ROI = original[y:y+h, x:x+w]
    cv2.imwrite('ROI_{}.png'.format(contour_number), ROI)
    contour_number += 1

cv2.imshow('canny', canny)
cv2.imshow('image', image)
cv2.waitKey(0)
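One line in the code above deserves a note: `cnts[0] if len(cnts) == 2 else cnts[1]`. cv2.findContours() returns (contours, hierarchy) in OpenCV 2.4 and 4.x, but (image, contours, hierarchy) in OpenCV 3.x, so the contour list sits at a different index depending on the version. The sketch below wraps that check in a hypothetical helper, `pick_contours`, and exercises it with stand-in tuples instead of real OpenCV output. imutils ships a similar helper, imutils.grab_contours().

```python
def pick_contours(result):
    """Return the contour list from a cv2.findContours() result tuple."""
    if len(result) == 2:
        # OpenCV 2.4 and 4.x: (contours, hierarchy)
        return result[0]
    if len(result) == 3:
        # OpenCV 3.x: (image, contours, hierarchy)
        return result[1]
    raise ValueError("unexpected findContours result of length %d" % len(result))


# Stand-in values; real results would hold NumPy arrays.
opencv4_style = (["contour_a", "contour_b"], "hierarchy")
opencv3_style = ("image", ["contour_a", "contour_b"], "hierarchy")

print(pick_contours(opencv4_style))
print(pick_contours(opencv3_style))
```

Writing the unpacking this way keeps the script working across OpenCV versions, whereas `_, c, h = cv2.findContours(...)` only works on 3.x.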

About "python - Recognizing and extracting handwritten digits in order with Python OpenCV": we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56735873/
