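I wrote a program to recognize musical notation (tablature, to be precise). There are digits on the staff lines.
After preprocessing (global binarization), I remove the lines.
My question is about the preprocessing. The program works fine when the lines and the digits have the same color, but the lines are often lighter than the digits. When I binarize with a lower threshold, the lines disappear; when I use a higher threshold, there is too much noise and the digits become thick.
Which kind of binarization (in OpenCV) would you recommend? What should I do? Is there a solution to this problem?
I will add some examples.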
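My preprocessing looks like this:
1) Read the image as grayscale:
2) Global binarization:
cv::threshold (e.g. 127)
The characters are not very pretty... :( But the main problem is that the lines disappear.
3) cv::threshold (230)
Now I can see the lines, but the characters are thick and ugly. For example, the hole in the middle of an "a" is sometimes filled in, and so on. There is also a lot of noise. :(
And one more problem... I have to set the threshold separately for each file.
Do you have any suggestions for the preprocessing?
I would like "pretty" lines and characters...
(I am not asking for code, just for some advice and suggestions.)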
Best answer
Try adaptive thresholding. From the OpenCV website:
In the previous section, we used a global value as threshold value. But it may not be good in all the conditions where image has different lighting conditions in different areas. In that case, we go for adaptive thresholding. In this, the algorithm calculate the threshold for a small regions of the image. So we get different thresholds for different regions of the same image and it gives us better results for images with varying illumination.
It has three ‘special’ input params and only one output argument.
Adaptive Method - It decides how thresholding value is calculated.
    cv2.ADAPTIVE_THRESH_MEAN_C : threshold value is the mean of neighbourhood area.
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C : threshold value is the weighted sum of neighbourhood values where weights are a gaussian window.
Block Size - It decides the size of neighbourhood area.
C - It is just a constant which is subtracted from the mean or weighted mean calculated.
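To make the roles of Block Size and C concrete, here is a minimal NumPy sketch of the mean variant as the description above states it: each pixel is compared against the mean of its neighbourhood minus C. The helper name adaptive_mean_threshold and the synthetic pixel values are assumptions made for illustration, and this is not OpenCV's actual implementation, so results may differ slightly from cv2.adaptiveThreshold.

```python
import numpy as np

def adaptive_mean_threshold(gray, block_size=11, c=2):
    """Hypothetical helper: per-pixel threshold = local mean - c."""
    pad = block_size // 2
    # Repeat edge pixels so that border pixels also get a full window
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    out = np.zeros(gray.shape, dtype=np.uint8)
    height, width = gray.shape
    for y in range(height):
        for x in range(width):
            window = padded[y:y + block_size, x:x + block_size]
            local_threshold = window.mean() - c
            out[y, x] = 255 if gray[y, x] > local_threshold else 0
    return out

# Assumed test image: bright paper (240), one faint line (200),
# and a small dark blob (40) standing in for a digit.
gray = np.full((40, 60), 240, dtype=np.uint8)
gray[20, :] = 200
gray[5:10, 5:10] = 40

global_result = np.where(gray > 127, 255, 0).astype(np.uint8)
adaptive_result = adaptive_mean_threshold(gray, block_size=11, c=2)

print("faint line, global 127:", global_result[20, 30])
print("faint line, adaptive:  ", adaptive_result[20, 30])
```

With a global threshold of 127 the faint line is brighter than the threshold and is lost, whereas the local comparison only asks whether a pixel is darker than its surroundings. One caveat of this scheme: a uniformly dark area larger than the block would be turned white in its interior, so the block should be wider than the thickest stroke.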
Below piece of code compares global thresholding and adaptive thresholding for an image with varying illumination:
import cv2
import numpy as np
from matplotlib import pyplot as plt

# Read the image as grayscale (flag 0) and blur it to suppress noise
img = cv2.imread('dave.jpg', 0)
img = cv2.medianBlur(img, 5)

# Global threshold with a fixed value of 127
ret, th1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
# Adaptive thresholds: 11x11 neighbourhood, constant C = 2
th2 = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY, 11, 2)
th3 = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                            cv2.THRESH_BINARY, 11, 2)

titles = ['Original Image', 'Global Thresholding (v = 127)',
          'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]

# Show the four results in a 2x2 grid (range replaces Python 2's xrange)
for i in range(4):
    plt.subplot(2, 2, i + 1)
    plt.imshow(images[i], 'gray')
    plt.title(titles[i])
    plt.xticks([])
    plt.yticks([])
plt.show()
http://docs.opencv.org/trunk/doc/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html
This gives you more advanced thresholding, with a different threshold value for each region of the image.
Source: this question about a program for recognizing musical notation comes from Stack Overflow: https://stackoverflow.com/questions/28552728/