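python - How to draw a larger bounding box and crop only the bounding-box text in Python OpenCV

Tags: python opencv image-processing computer-vision contour

I am using easyocr to detect text in an image. The method returns bounding boxes as output. The input images are shown below.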

[Image 1]

[Image 2]

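The output images were obtained using the code below.

However, I want to draw a single, larger bounding box that contains all the text, crop the image to that box, and remove the remaining unwanted regions or text.

[outputImage1]

[outputImage2]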

The code is attached below. Requirements:

pip install pytesseract

pip install easyocr

Run the code with python main.py -i image1.jpg

# USAGE
# python localize_text_tesseract.py --image apple_support.png
# python localize_text_tesseract.py --image apple_support.png --min-conf 50

# import the necessary packages
from pytesseract import Output
import pytesseract
import argparse
import cv2
from matplotlib import pyplot as plt
import numpy as np
import os
import easyocr
from PIL import ImageDraw, Image



def remove_lines(image):
    result = image.copy()
    gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # Remove horizontal lines
    horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40,1))
    remove_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
    cnts = cv2.findContours(remove_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        cv2.drawContours(result, [c], -1, (255,255,255), 5)


    # Remove vertical lines
    vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,40))
    remove_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
    cnts = cv2.findContours(remove_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        cv2.drawContours(result, [c], -1, (255,255,255), 5)

    plt.imshow(result)
    plt.show()

    return result



# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image to be OCR'd")
ap.add_argument("-c", "--min-conf", type=int, default=0,
    help="minimum confidence value to filter weak text detection")
args = vars(ap.parse_args())


reader = easyocr.Reader(['ch_sim','en']) # need to run only once to load model into memory



# load the input image (OpenCV reads it in BGR channel order), remove ruling
# lines, and use EasyOCR to localize each area of text in the input image
image = cv2.imread(args["image"])
image = remove_lines(image)

results = reader.readtext(image)
#print('originalresult',results)

low_precision = []
for text in results:
    if text[2]<0.45: # precision here
        low_precision.append(text)
for i in low_precision:
    results.remove(i) # remove low precision
print(results)

#import pdb; pdb.set_trace()


image2 = Image.fromarray(image)

draw = ImageDraw.Draw(image2)
for i in range(0, len(results)):
    p0, p1, p2, p3 = results[i][0]
    draw.line([*p0, *p1, *p2, *p3, *p0], fill='red', width=1)

plt.imshow(np.asarray(image2))
plt.show()




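Best answer

After removing the low-precision results, you can merge all the valid points into a single 2D array and use cv2.boundingRect to get the bounding box.

Code: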

points = []
for result in results:
    points.extend(result[0])

# cast to int32: cv2.boundingRect expects 32-bit integer or float points
rect = cv2.boundingRect(np.array(points, dtype=np.int32))

x, y, w, h = rect

image2 = image.copy()
cv2.rectangle(image2, (x, y), (x + w, y + h), (255, 0, 0), 1)

plt.imshow(image2)
plt.show()

To crop the text, use the following line:

image_cropped = image[y:y+h, x:x+w]
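
If the goal is a box somewhat larger than the tight rectangle, one option is to pad it before slicing. The helper below is a sketch rather than part of the original answer: crop_with_padding and its pad parameter are hypothetical names, and the padded rectangle is clamped to the image so the slice never uses negative indices (which NumPy would interpret as counting from the end).

```python
import numpy as np


def crop_with_padding(image, rect, pad=10):
    """Crop `image` to `rect` = (x, y, w, h), enlarged by `pad` pixels per side."""
    x, y, w, h = rect
    img_h, img_w = image.shape[:2]
    # Clamp the padded corners so they stay inside the image
    x0 = max(x - pad, 0)
    y0 = max(y - pad, 0)
    x1 = min(x + w + pad, img_w)
    y1 = min(y + h + pad, img_h)
    return image[y0:y1, x0:x1]


# Blank 100x200 test image and a rectangle close to the left edge
image = np.full((100, 200, 3), 255, dtype=np.uint8)
cropped = crop_with_padding(image, (5, 20, 50, 30), pad=10)
print(cropped.shape)
```

Because the result is a NumPy slice, it is a view into the original array; call .copy() on it if the crop will be modified independently.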

Or, if a more precise crop is needed:

mask = np.zeros_like(image)
# grayscale or color image
color = 255 if len(mask.shape) == 2 else mask.shape[2] * [255]
# create a mask
for result in results:
    # int32 points are what OpenCV's drawing functions expect
    cv2.fillConvexPoly(mask, np.array(result[0], dtype=np.int32), color)

# mask the text, and invert the mask to preserve white background
image_masked = cv2.bitwise_or(cv2.bitwise_and(image, mask), cv2.bitwise_not(mask))

image_cropped = image_masked[y:y+h, x:x+w]

Regarding "python - How to draw a larger bounding box and crop only the bounding-box text in Python OpenCV", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/69455733/