python - Detecting spaces between text (OpenCV, Python)

I have the following code (it is actually just 1 of the 4 parts needed to run the whole project I am working on):

# Usage: python classify.py --model models/svm.cpickle --image images/image.png

from __future__ import print_function
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
from hog import HOG
import dataset
import argparse
import mahotas
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-m", "--model", required = True,
    help = "path to the trained model file to load")
ap.add_argument("-i", "--image", required = True,
    help = "path to the image file")
args = vars(ap.parse_args())

model = joblib.load(args["model"])

hog = HOG(orientations = 18, pixelsPerCell = (10, 10),
    cellsPerBlock = (1, 1), transform = True)

image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 30, 150)
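# findContours returns 3 values in OpenCV 3 and 2 values in OpenCV 2 and 4;
# taking [-2] picks the contour list in every case
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]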

cnts = sorted([(c, cv2.boundingRect(c)[0]) for c in cnts], key =
    lambda x: x[1])

for (c, _) in cnts:
    (x, y, w, h) = cv2.boundingRect(c)

    if w >= 7 and h >= 20:
        roi = gray[y:y + h, x:x + w]
        thresh = roi.copy()
        T = mahotas.thresholding.otsu(roi)
        thresh[thresh > T] = 255
        thresh = cv2.bitwise_not(thresh)

        thresh = dataset.deskew(thresh, 20)
        thresh = dataset.center_extent(thresh, (20, 20))

        cv2.imshow("thresh", thresh)

        hist = hog.describe(thresh)
        digit = model.predict([hist])[0]
        print("I think that number is: {}".format(digit))

        cv2.rectangle(image, (x, y), (x + w, y + h),
            (0, 255, 0), 1)
        cv2.putText(image, str(digit), (x - 10, y - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
        cv2.imshow("image", image)
        cv2.waitKey(0)

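This code detects and recognizes handwritten digits in an image. Here is an example:

Assume I do not care about recognition accuracy.

My problem is the following: as you can see, the program picks up every digit it can see and prints it to the console. I can save them from the console to a text file if I want, but I cannot tell the program that there are spaces between the numbers.

What I want is that, if I print the digits to a text file, they are separated the same way as in the image (sorry, this is a bit hard to explain). The digits should not be printed all together (not even in the console); wherever there is a gap in the image, a blank space should be printed as well.

Look at the first image. After the first 10 digits there is a blank area in the image that does not appear in the console.

Anyway, here is a link to the full code. There are 4 .py files and 3 folders. To run it, open CMD in the folder and paste the command python classify.py --model models/svm.cpickle --image images/image.png, where image.png is the name of a file in the images folder.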

Full Code

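Thanks in advance. It seems to me that all of this would have to be done with a neural network, but I wanted to try it this way first. I am fairly new to this.

Best answer

Here is a starter solution.

I do not have anything in Python at the moment, but converting it should not be hard, and the OpenCV function calls are similar; I have linked them below.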

TL;DR

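Find the centres of the boundingRects, then find the distance between them. If one rect is further than some threshold from the next, you can assume there is a space between them.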
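One way to pick that threshold, rather than hard-coding it, is to compare each gap with the typical gap in the same image. The sketch below is only an illustration of that idea: the function name `estimate_space_threshold` and the `factor` parameter are hypothetical, and it assumes a single line of digits whose horizontal centres are already known.

```python
from statistics import median


def estimate_space_threshold(x_centres, factor=1.8):
    """Guess a distance above which two neighbouring digits are separated by a space.

    x_centres: horizontal centre of each digit, in any order.
    factor: hypothetical tuning knob; a gap this many times larger than the
            typical (median) gap is treated as a space.
    """
    xs = sorted(x_centres)
    # Distance from each centre to the next one on its right.
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    if not gaps:
        raise ValueError("need at least two centres to measure a gap")
    return factor * median(gaps)


# Digits roughly 20 px apart, with one wider gap between 60 and 120.
centres = [0, 20, 40, 60, 120, 140, 160]
threshold = estimate_space_threshold(centres)
print(threshold)
```

The median is used here because a few wide gaps (the actual spaces) barely move it, whereas they would inflate a mean. The value of `factor` would still need tuning against real handwriting.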

First, find the centres of the bounding rectangles

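// Centroid of each contour, computed from its spatial moments
vector<Point2f> centres;

for(size_t index = 0; index < contours.size(); ++index)
{
    Moments moment = moments(contours[index]);

    // Note: m00 is zero for degenerate contours, so filter those out beforehand
    centres.push_back(Point2f(static_cast<float>(moment.m10/moment.m00), static_cast<float>(moment.m01/moment.m00)));
}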
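In Python, `cv2.moments(contour)` returns a dictionary with the same `"m10"`, `"m01"` and `"m00"` entries, so the C++ above carries over almost directly. As a simpler variation (not what the C++ does), the sketch below takes the centre of each bounding box instead, which avoids the division by `m00`. It assumes the boxes are `(x, y, w, h)` tuples such as those returned by `cv2.boundingRect`; the helper name `box_centres` is made up for this example.

```python
def box_centres(boxes):
    """Return the (cx, cy) centre of each (x, y, w, h) box, ordered by left edge."""
    ordered = sorted(boxes, key=lambda b: b[0])
    return [(x + w / 2.0, y + h / 2.0) for (x, y, w, h) in ordered]


# Two 10x20 boxes given out of order; the result is ordered left to right.
print(box_centres([(30, 0, 10, 20), (0, 0, 10, 20)]))
# prints: [(5.0, 10.0), (35.0, 10.0)]
```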

(Optional but recommended)

You can draw the centres to get a visual sense of where they are.

for(size_t index = 0; index < centres.size(); ++index)
{
    Scalar colour = Scalar(255, 255, 0);
    circle(frame, centres[index], 2, colour, 2);
}

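With this, simply loop through them and check whether the distance to the next one is within a reasonable threshold

// centres must be ordered left to right; stop one short so that index + 1 stays in range
for(size_t index = 0; index + 1 < centres.size(); ++index)
{
    // this is just a sample value (in pixels). Tweak it around to see which value actually makes sense
    double distance = 0.5;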
    Point2f current = centres[index];
    Point2f nextPoint = centres[index + 1];

    // norm calculates the euclidean distance between two points
    if(norm(nextPoint - current) >= distance)
    {
        // TODO: This is a potential space??
    }
}
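A rough Python counterpart of this loop, which also fills in the TODO by building the output string, might look like the following. It is a sketch under a few assumptions: the digits sit on a single line, `digits` and `centres` are already in the same left-to-right order, and the function name `digits_with_spaces` and the threshold value are illustrative only.

```python
import math


def digits_with_spaces(digits, centres, threshold):
    """Join recognised digits, adding a space wherever neighbouring centres are far apart.

    digits and centres must be in the same left-to-right order.
    """
    if len(digits) != len(centres):
        raise ValueError("digits and centres must have the same length")
    pieces = []
    for i, digit in enumerate(digits):
        pieces.append(str(digit))
        # Only compare with a next centre when one exists.
        if i + 1 < len(centres):
            (x1, y1), (x2, y2) = centres[i], centres[i + 1]
            # Euclidean distance, the same quantity norm() gives in the C++ version.
            if math.hypot(x2 - x1, y2 - y1) >= threshold:
                pieces.append(" ")
    return "".join(pieces)


digits = [1, 2, 3, 4, 5]
centres = [(5, 10), (25, 10), (45, 10), (105, 10), (125, 10)]
print(digits_with_spaces(digits, centres, threshold=36))
# prints: 123 45
```

The returned string can be printed to the console or written to a text file as is. Text spread over several lines would need the digits grouped into rows first, which this sketch does not attempt.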

You can read more about the Python calls for moments, norm and circle drawing.

Happy coding, cheers mate :)

Regarding python - Detecting spaces between text (OpenCV, Python), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46001090/
