Python OpenCV skew correction for OCR

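I'm working on an OCR project where I need to read the text from a label (see the example images below). The images are skewed, and I need help correcting the skew so that the text is horizontal rather than at an angle. My current process scores different angles within a given range (code included below), but it is inconsistent: sometimes it overcorrects the skew, and sometimes it fails to detect and correct the skew at all. Note that before the skew correction I rotate every image by 270 degrees to get the text upright, and then pass the image through the code below. The image passed to the function is already a binary image.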

Code:


import cv2
import numpy as np
from scipy import ndimage as inter  # scipy.ndimage.interpolation is deprecated

def findScore(img, angle):
    """
    Generates a score for the binary image received, dependent on the given angle.
    Vars:
    - img <- numpy array of the label
    - angle <- candidate angle by which the image is rotated
    Returns:
    - histogram of the image
    - score of potential angle
    """
    data = inter.rotate(img, angle, reshape = False, order = 0)
    hist = np.sum(data, axis = 1)
    score = np.sum((hist[1:] - hist[:-1]) ** 2)
    return hist, score

def skewCorrect(img):
    """
    Takes in a numpy array, determines the skew angle of the text, then corrects the skew and returns the corrected image.
    Vars:
    - img <- numpy array of the label
    Returns:
    - Corrected image as a numpy array
    """
    # Scales the image down to 75% before determining the skew angle
    img = cv2.resize(img, (0, 0), fx = 0.75, fy = 0.75)

    delta = 1
    limit = 45
    angles = np.arange(-limit, limit+delta, delta)
    scores = []
    for angle in angles:
        hist, score = findScore(img, angle)
        scores.append(score)
    bestScore = max(scores)
    bestAngle = angles[scores.index(bestScore)]
    rotated = inter.rotate(img, bestAngle, reshape = False, order = 0)
    print("[INFO] angle: {:.3f}".format(bestAngle))
    #cv2.imshow("Original", img)
    #cv2.imshow("Rotated", rotated)
    #cv2.waitKey(0)

    # Return the rotated (and downscaled) image
    return rotated

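Example images of the label before and after correction

Before correction -> After correction

If anyone can help me figure out this problem, it would be much appreciated.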

Best answer

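Here's an implementation of the Projection Profile Method algorithm for skew angle estimation. Points are projected at various angles into an accumulator array, and the skew angle is defined as the projection angle within the search interval that maximizes alignment. The idea is to rotate the image by each candidate angle and compute a histogram of pixels per row for each iteration. To determine the skew angle, we score each histogram by the squared differences between adjacent rows (sharp peaks score highest), take the angle with the maximum score, and rotate the image by that angle to correct the skew.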
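To make the scoring idea concrete, here is a minimal sketch on synthetic data, using NumPy only. The helper names `make_lines` and `profile_score` are hypothetical and not part of the answer; the sheared stripes only stand in for skewed text lines, and the score is the same sum of squared differences between adjacent row sums.

```python
import numpy as np

def profile_score(binary):
    """Sum of squared differences between adjacent row sums."""
    hist = np.sum(binary, axis=1, dtype=float)
    return float(np.sum((hist[1:] - hist[:-1]) ** 2))

def make_lines(height=60, width=80, shear=0.0):
    """Hypothetical test image: 4-pixel bars every 12 rows,
    shifted by `shear` rows per column to imitate skew."""
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    on = ((rows - np.round(shear * cols)) % 12) < 4
    return (on * 255).astype(np.uint8)

aligned = profile_score(make_lines(shear=0.0))  # bars parallel to the rows
skewed = profile_score(make_lines(shear=0.2))   # bars slanted across the rows
print(aligned > skewed)
```

When the bars are parallel to the rows, the row sums jump between zero and full width, so the score is large. When they are slanted, every row contains a similar number of white pixels and the score collapses.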

Original -> Corrected

Skew angle: -2

import cv2
import numpy as np
from scipy import ndimage  # scipy.ndimage.interpolation is deprecated

def correct_skew(image, delta=1, limit=5):
    def determine_score(arr, angle):
        # Rotate, then sum each row to get the horizontal projection profile
        data = ndimage.rotate(arr, angle, reshape=False, order=0)
        histogram = np.sum(data, axis=1, dtype=float)
        # Large row-to-row changes indicate text lines aligned with the rows
        score = np.sum((histogram[1:] - histogram[:-1]) ** 2, dtype=float)
        return histogram, score

    # Binarize with Otsu; THRESH_BINARY_INV makes dark text nonzero
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Score every candidate angle in [-limit, limit] with step delta
    scores = []
    angles = np.arange(-limit, limit + delta, delta)
    for angle in angles:
        histogram, score = determine_score(thresh, angle)
        scores.append(score)

    best_angle = angles[scores.index(max(scores))]

    # Rotate the original image by the best-scoring angle about its center
    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, best_angle, 1.0)
    corrected = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                               borderMode=cv2.BORDER_REPLICATE)

    return best_angle, corrected

if __name__ == '__main__':
    image = cv2.imread('1.png')
    angle, corrected = correct_skew(image)
    print('Skew angle:', angle)
    cv2.imshow('corrected', corrected)
    cv2.waitKey()
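One detail in the scoring function is the `dtype=float` argument. The sketch below assumes the binary image is stored as `uint8`, which is what `cv2.threshold` returns for 8-bit input; the 3x2 array is made up for illustration. Without a float accumulator the row sums stay unsigned, so a negative difference between adjacent rows wraps around to a huge positive number and can distort the score.

```python
import numpy as np

# Tiny stand-in for a binary image: white row, black row, white row
binary = np.array([[255, 255],
                   [0, 0],
                   [255, 255]], dtype=np.uint8)

hist_int = np.sum(binary, axis=1)                 # unsigned integer row sums
hist_float = np.sum(binary, axis=1, dtype=float)  # float row sums

diff_int = hist_int[1:] - hist_int[:-1]        # 0 - 510 wraps around
diff_float = hist_float[1:] - hist_float[:-1]  # 0 - 510 gives -510.0

print(hist_int.dtype, diff_int)
print(hist_float.dtype, diff_float)
```

If a scoring function sums a `uint8` image without `dtype=float`, casting the image first (for example `img.astype(float)`) avoids the wraparound.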

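Note: You may have to adjust the delta or limit values depending on the image. The delta value controls the iteration step, and the search runs up to limit, which controls the maximum angle. This method is straightforward: it checks each angle in increments of delta, and with the default values it only corrects skew in the range of +/- 5 degrees. If you need to correct a larger angle, increase the limit value. For another approach to handling skew, take a look at this alternative method.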
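A larger limit combined with a small delta means many rotations. One way to balance the two, sketched here as an assumption rather than as part of the answer, is a coarse pass followed by a fine pass around the best coarse angle. The function `search_angle` and its parameters `coarse` and `fine` are hypothetical names, and `score_fn` stands for any function that maps an angle to a score, such as a wrapper around the projection profile score.

```python
import numpy as np

def search_angle(score_fn, limit=45.0, coarse=1.0, fine=0.1):
    """Two-pass grid search: step `coarse` over [-limit, limit],
    then step `fine` within one coarse step of the best angle."""
    def best_of(candidates):
        scores = [score_fn(float(a)) for a in candidates]
        return float(candidates[int(np.argmax(scores))])

    rough = best_of(np.arange(-limit, limit + coarse, coarse))
    return best_of(np.arange(rough - coarse, rough + coarse + fine, fine))

# Stand-in score with a single peak at 7.3 degrees (illustration only)
peak = 7.3
best = search_angle(lambda a: -(a - peak) ** 2)
print(round(best, 1))
```

With these defaults the search evaluates 91 coarse angles plus about 20 fine ones, compared with roughly 900 for a single pass at 0.1 degree steps. It relies on the score having one dominant peak, which may not hold for every image.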

Source: the corresponding question on Stack Overflow: https://stackoverflow.com/questions/57964634/
