我正在尝试从视频的左上角获取文本“P1”和“P2”。
我拍摄一个框架并将其裁剪为以下图像,然后应用此处找到的图像处理:
use pytesseract to recognize text from image
虽然它适用于我使用图像编辑器手动编辑的裁剪静态图像,但在使用 cv2 从视频中获取帧时它不起作用。
我不确定为什么会这样,但我怀疑它与下图所示的黑白背景有关,但我不知道如何在不删除文本的情况下摆脱它。
这是我的代码
import cv2
import pytesseract
import re
from difflib import SequenceMatcher
def determineWinner(video):
winnerRect = [(70,95),(146,152)]
cap = cv2.VideoCapture(video)
if(cap.isOpened() == False):
print("No dice")
return
fps = cap.get(cv2.CAP_PROP_FPS)
frames = cap.get(cv2.CAP_PROP_FRAME_COUNT)
print(fps)
print(frames)
desiredSeek = frames - int(fps * 9)
print(desiredSeek)
seconds = desiredSeek/fps
print(seconds)
minutes = seconds/60
print(minutes)
partial = minutes - int(minutes)
print(partial)
seconds = partial * 60
print(seconds)
print(str(int(minutes)) +":"+ str(seconds))
cap.set(cv2.CAP_PROP_POS_FRAMES,(desiredSeek))
ret,img = cap.read()
winTxt = []
p1Count = 0
p2Count = 0
cv2.namedWindow("",cv2.WINDOW_NORMAL)
ret,img = cap.read()
while ret:
key = cv2.waitKey(1)
if key == ord('q'):
break
if key == ord('e'):
ret,img = cap.read()
if ret:
winROI = img[winnerRect[0][1]:winnerRect[1][1],winnerRect[0][0]:winnerRect[1][0]]
gray = cv2.cvtColor(winROI, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Morph open to remove noise and invert image
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
invert = 255-opening
invert=cv2.resize(invert,None,fx=2,fy=2)
wConfig='-l eng --oem 1 --psm 10 -c tessedit_char_whitelist=P12'
winTxt = pytesseract.image_to_string(invert,config=wConfig)
cv2.rectangle(img,winnerRect[0],winnerRect[1],(255,0,0),2)
cv2.imshow("winroi",invert)
cv2.imshow("",img)
cv2.resizeWindow("",800,600)
print(winTxt)
desiredSeek+=1
seconds = desiredSeek/fps
minutes = seconds/60
partial = minutes - int(minutes)
seconds = partial * 60
print(str(int(minutes)) +":"+ str(seconds))
else:
break
cap.release()
cv2.destroyAllWindows()
最佳答案
此代码用作测试脚本。我只提取了包含 P1
的图像的参数。要在新图像上应用滤镜,只需删除预定义的阈值,如下所示:
来自:
低蓝、低绿、低红、上蓝、上绿、上红 = (115, 0, 0, 255, 178, 255)
致:
low_blue、low_green、low_red、upper_blue、upper_green、upper_red = (0, 0, 0, 255, 255, 255)
并开始按如下所述修改参数。确定参数后,按esc
退出程序,获取控制台中显示的参数并将其粘贴到阈值元组中。
如何使用它:
非常重要。为了使其正常工作,您必须单击鼠标左键,从
cv2.imshow()
窗口中选择,在本例中为Original image
或Binary图片
q
增加,w
减少蓝色下限阈值a
增加,s
减少绿色阈值下限- ...对于较低和较高颜色 (
BGR
) 阈值依此类推
import numpy as np
import cv2
low_blue, low_green, low_red, upper_blue, upper_green, upper_red = (115, 0, 0, 255, 178, 255)
# Get picture
path = "C:\\Users\\asd\\asd\\P1.png"
frame = cv2.imread(path)
while 1:
lower_color = np.array((low_blue, low_green, low_red))
upper_color = np.array((upper_blue, upper_green, upper_red))
# extract binary image with active blue regions
binary_image = cv2.inRange(frame, lower_color, upper_color)
cv2.imshow('Original image', binary_image)
#erode for the little white contour to dissapear
binary_image = cv2.erode(binary_image, cv2.getStructuringElement(cv2.MORPH_RECT,(3,3)))
binary_image = cv2.dilate(binary_image, cv2.getStructuringElement(cv2.MORPH_RECT,(3,3)))
cv2.imshow('Binary image ', binary_image)
k = cv2.waitKey(5) & 0xFF
if k == 27:
break
if k == ord('q'):
low_blue += 1
if k == ord('w'):
low_blue -= 1
if k == ord('a'):
low_green += 1
if k == ord('s'):
low_green -= 1
if k == ord('z'):
low_red += 1
if k == ord('x'):
low_red -= 1
if k == ord('e'):
upper_blue += 1
if k == ord('r'):
upper_blue -= 1
if k == ord('d'):
upper_green += 1
if k == ord('f'):
upper_green -= 1
if k == ord('c'):
upper_red += 1
if k == ord('v'):
upper_red -= 1
print("low_blue=", low_blue, "low_green=", low_green, "low_red=",low_red, "upper_blue", upper_blue, "upper_green=",
upper_green, "upper_red=",upper_red)
cv2.destroyAllWindows()
结果
来自:
致:
关于python-3.x - 如何使用 PyTesseract 消除图像噪声以改善结果?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/61852559/