c++ - Pattern extraction layer for documents in OpenCV

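First we generate the binary image of the given image by thresholding it at 80% of its intensity and inverting the resulting image. In the binary image, white pixels represent text, graphics, lines etc. The first step of pattern extraction is to localize the rectangular areas known as "rects". A rect is a loosely rectangular area of connected white pixels which encloses a certain logical part of the document. We considered simple 8-neighborhood connectivity and performed connected component (contour) analysis of the binary image leading to the segmentation of the textual components. For next part of algorithm we use the minimum bounding rectangle of contours. These rectangles were then sorted top-to-bottom and left-to-right order, using 2D point information of leftmost-topmost corner. Smaller connected patterns were discarded based on the assumption that they may have originated due to noise dependent on image acquisition system and does not in any way contribute to the final layout. Also punctuation marks were neglected using smaller size criterion e.g. comma, full-stop etc. At this level we also segregate the fonts based on the height of the bounding rect using avgh (average height) as threshold. Two thresholds are used to classify fonts into three categories - small, normal and large.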

equation http://a1.sphotos.ak.fbcdn.net/hphotos-ak-snc7/401374_144585198985889_100003032296249_180106_343820769_n.jpg

Can you help me translate this theory into OpenCV source code, or give me any related link for it? I'm currently working on document image analysis for my thesis.

Best answer

I know this is a late reply, but I think future readers can get help from it.

Below is the answer based on what I understood from the paragraph above (all code is in OpenCV-Python v2.4-beta):

I took this as the input image. It is a simple image, to make things easy to follow.

input image

First we generate the binary image of the given image by thresholding it at 80% of its intensity and inverting the resulting image.

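import cv2
import numpy as np

img = cv2.imread('doc4.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Threshold at 80% of the maximum intensity and invert, so dark text becomes white
ret, thresh = cv2.threshold(gray, 0.8*gray.max(), 255, cv2.THRESH_BINARY_INV)
# findContours modifies its input image in OpenCV 2.4, so pass a copy
contours, hier = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)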

Thresholded image:

threshold image
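To make the "threshold at 80% and invert" step concrete, here is a minimal pure-Python sketch of what the inverse binary threshold does to each pixel. The function name `threshold_binary_inv` and the tiny 3x3 "page" are made up for illustration; in the answer itself the work is done by `cv2.threshold`.

```python
def threshold_binary_inv(gray, fraction=0.8, maxval=255):
    """Inverse binary threshold on a 2D list of grey levels.

    Pixels brighter than fraction * max become 0; all others become maxval,
    which mirrors the behaviour of cv2.THRESH_BINARY_INV.
    """
    peak = max(max(row) for row in gray)
    cutoff = fraction * peak
    return [[0 if px > cutoff else maxval for px in row] for row in gray]


# Tiny 3x3 "page": bright background (250) with a dark vertical stroke (20).
page = [
    [250, 250, 250],
    [250,  20, 250],
    [250,  20, 250],
]
binary = threshold_binary_inv(page)
# The dark stroke is now white (255) and the background is black (0).
```

After this step the text pixels are the white ones, which is what the contour search expects.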

We considered simple 8-neighborhood connectivity and performed connected component (contour) analysis of the binary image leading to the segmentation of the textual components.

This is just contour finding in OpenCV, which here plays the role of connected-component labelling. It selects all the white blobs (components) in the image.

contours, hier = cv2.findContours(thresh.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)

Contours:

contours
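For readers who want to see what 8-neighborhood connectivity means in practice, the following is a rough pure-Python sketch of connected-component labelling on a small binary grid, followed by a bounding box per component. It is not how OpenCV implements `findContours`; the helper names `label_components` and `bounding_rect` are hypothetical and exist only for this illustration.

```python
from collections import deque


def label_components(binary):
    """Return a list of components, each a list of (row, col) white pixels.

    Two white (non-zero) pixels belong to the same component when they touch
    horizontally, vertically, or diagonally (8-neighbourhood).
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited white pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_rect(pixels):
    """Axis-aligned bounding box as (x, y, w, h), like cv2.boundingRect."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


grid = [
    [255,   0,   0,   0, 0],
    [  0, 255,   0,   0, 0],
    [  0,   0,   0, 255, 0],
    [  0,   0,   0, 255, 0],
]
rects = [bounding_rect(comp) for comp in label_components(grid)]
```

The two diagonal pixels in the top-left corner end up in one component only because diagonal neighbours count; with 4-connectivity they would be two separate blobs.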

For next part of algorithm we use the minimum bounding rectangle of contours.

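Now we find the bounding rectangle around each detected contour, then drop the contours with a small area to get rid of commas and the like. See the statement: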

Smaller connected patterns were discarded based on the assumption that they may have originated due to noise dependent on image acquisition system and does not in any way contribute to the final layout. Also punctuation marks were neglected using smaller size criterion e.g. comma, full-stop etc.

We also find the average height, avgh.

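height = 0
num = 0
letters = []
ht = []

# thresh2: colour (BGR) copy of the binary image, used for drawing
thresh2 = cv2.cvtColor(thresh, cv2.COLOR_GRAY2BGR)

for cnt in contours:
    (x,y,w,h) = cv2.boundingRect(cnt)
    if w*h < 200:
        # Small bounding box: treat as noise or punctuation and paint it black
        cv2.drawContours(thresh2,[cnt],0,(0,0,0),-1)
    else:
        # Keep this component: outline it in green and record its height
        cv2.rectangle(thresh2,(x,y),(x+w,y+h),(0,255,0),1)
        height = height + h
        num = num + 1
        letters.append(cnt)
        ht.append(h)

# Average height of the kept components (float division)
avgh = float(height)/num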

So after this, all the commas etc. are removed, and green rectangles are drawn around the selected components:

bounding rect

At this level we also segregate the fonts based on the height of the bounding rect using avgh (average height) as threshold. Two thresholds are used to classify fonts into three categories - small, normal and large (according to the equation given in the paragraph).

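The average height, avgh, obtained here is 40. So a letter is small if its height is less than 26.66 (i.e. 40x2/3), normal if its height is between 26.66 and 60, and large if its height is greater than 60. But in the given image all heights fall within (28,58), so every letter is normal and you can't see any difference.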

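So I just made a small modification to make it easy to visualize: small if height <= 30, normal if 30 < height <= 50, and large if height > 50.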

for (cnt,h) in zip(letters,ht):
    print(h)
    if h<=30:
        # small: filled in blue (colours are in BGR order)
        cv2.drawContours(thresh2,[cnt],0,(255,0,0),-1)
    elif 30 < h <= 50:
        # normal: filled in green
        cv2.drawContours(thresh2,[cnt],0,(0,255,0),-1)
    else:
        # large: filled in red
        cv2.drawContours(thresh2,[cnt],0,(0,0,255),-1)
cv2.imshow('img',thresh2)
cv2.waitKey(0)
cv2.destroyAllWindows()

Now you get the result with the letters classified as small, normal and large:

result
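For reference, the avgh-based rule (as opposed to the fixed 30/50 cut-offs used for the visualization) could be sketched as below. The lower multiplier 2/3 comes from the 40x2/3 figure quoted above; the upper multiplier 1.5 is an assumption inferred from the value 60 quoted for avgh = 40, since the equation image itself is not reproduced here. The function name `classify_height` is made up for this example.

```python
def classify_height(h, avgh, low=2.0 / 3.0, high=1.5):
    """Classify a bounding-rect height relative to the average height.

    low and high are multipliers of avgh; high=1.5 is an assumption
    inferred from the value 60 quoted for avgh = 40.
    """
    if h < low * avgh:
        return 'small'
    if h > high * avgh:
        return 'large'
    return 'normal'


avgh = 40.0
labels = [classify_height(h, avgh) for h in (20, 28, 58, 75)]
```

With avgh = 40, heights 28 and 58 both come out as normal, which matches the observation that the sample image shows no difference under the original thresholds.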

These rectangles were then sorted top-to-bottom and left-to-right order, using 2D point information of leftmost-topmost corner.

I have omitted this part. It just sorts all the bounding rects by their leftmost-topmost corner.
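One possible way to do that sort is sketched below, working on the `(x, y, w, h)` tuples that `cv2.boundingRect` returns. A plain sort on `(y, x)` would scatter letters of the same text line whose tops differ by a pixel or two, so this sketch first groups rects into rows. The `row_tol` parameter and the function name `sort_reading_order` are hypothetical choices for this example, not something taken from the quoted paragraph.

```python
def sort_reading_order(rects, row_tol=10):
    """Sort (x, y, w, h) rects top-to-bottom, then left-to-right.

    Rects whose top edge lies within row_tol pixels of the first rect in
    the current row are treated as one text line (row_tol is a
    hypothetical tuning parameter).
    """
    rows = []
    for rect in sorted(rects, key=lambda r: r[1]):  # by top edge y
        if rows and rect[1] - rows[-1][0][1] <= row_tol:
            rows[-1].append(rect)
        else:
            rows.append([rect])
    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda r: r[0]))  # by left edge x
    return ordered


rects = [(120, 52, 20, 30), (10, 50, 20, 32), (60, 8, 20, 30), (10, 10, 20, 28)]
ordered = sort_reading_order(rects)
```

In the answer's code the input would be something like `[cv2.boundingRect(cnt) for cnt in letters]`, and a sensible `row_tol` would probably be some fraction of avgh.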

Regarding c++ - Pattern extraction layer for documents in OpenCV, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/8655085/
