python - Calculating OCR accuracy

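I need to calculate OCR character accuracy.
Sample ground truth value: Non sinking ship is friendship
Sample OCR output: non singing ship is finedship
The areas of concern are:

Missing characters

Extra characters

Misplaced characters

Character accuracy is defined as the number of actual characters in their correct positions divided by the total number of actual characters.
I need a Python script to compute this accuracy. My initial implementation is as follows: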

import re

ground_value = "Non sinking ship is friendship"
ocr_value = "non singing ship is finedship"

# Remove all whitespace from the ground-truth and OCR strings
ground_value_characters = re.sub(r'\s+', '', ground_value)
ocr_value_characters = re.sub(r'\s+', '', ocr_value)

total_characters = float(len(ground_value_characters))

def find_matching_characters(ground, ocr):
    # Counts ground characters that occur anywhere in ocr; position is ignored
    total = 0
    for char in ground:
        if char in ocr:
            total = total + 1
            ocr = ocr.replace(char, '', 1)  # consume the matched character
    return total

found_characters = find_matching_characters(ground_value_characters,
                                            ocr_value_characters)

accuracy = found_characters / total_characters

This does not give me the result I am hoping for. Any help would be appreciated.

Best answer

If you are not tied to that exact definition (or if you are and want to dig into the details of python-Levenshtein), this is how I would solve it:

pip install python-Levenshtein

from Levenshtein import distance

ground_value = "Non sinking ship is friendship"
ocr_value = "non singing ship is finedship"

print(distance(ground_value, ocr_value))

The same library will give you Hamming distance, opcodes, and similar functions in a relatively high-performance way.
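To tie opcodes back to the three areas of concern in the question, here is a minimal sketch that uses the standard library's difflib.SequenceMatcher in place of python-Levenshtein, so it runs without extra installs. The function name classify_errors and the way a "replace" span is split into substituted, missing, and extra characters are assumptions made for illustration. Note that difflib does not promise a minimal edit sequence, so the counts may differ from a true Levenshtein alignment.

```python
from difflib import SequenceMatcher


def classify_errors(ground, ocr):
    """Count matched, substituted, missing and extra characters (hypothetical helper)."""
    counts = {"matched": 0, "substituted": 0, "missing": 0, "extra": 0}
    # autojunk=False turns off difflib's popularity heuristic for long inputs
    matcher = SequenceMatcher(None, ground, ocr, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        g_len, o_len = i2 - i1, j2 - j1
        if tag == "equal":
            counts["matched"] += g_len
        elif tag == "delete":      # present in ground truth, absent from OCR
            counts["missing"] += g_len
        elif tag == "insert":      # present in OCR, absent from ground truth
            counts["extra"] += o_len
        else:                      # "replace": pair characters up, count the rest
            paired = min(g_len, o_len)
            counts["substituted"] += paired
            counts["missing"] += g_len - paired
            counts["extra"] += o_len - paired
    return counts


ground_value = "Non sinking ship is friendship"
ocr_value = "non singing ship is finedship"
print(classify_errors(ground_value, ocr_value))
```

By construction, matched + substituted + missing always equals the length of the ground truth string, which makes the counts easy to sanity-check.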
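If this is homework, or if your goal here is to learn how to implement string algorithms, then none of this is useful, but if you just need a good metric, this is what I would use.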
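For the learning case, a rough pure-Python sketch of the edit distance could look like the following. It is the textbook dynamic-programming approach, not the library's implementation. The character_accuracy helper and its normalization (1 minus distance divided by ground truth length, clamped at 0) are assumptions added here, not the question's exact definition.

```python
def levenshtein(a, b):
    """Edit distance via dynamic programming, keeping only two rows."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def character_accuracy(ground, ocr):
    """Assumed normalization: 1 - distance / len(ground), never below 0."""
    if not ground:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1.0 - levenshtein(ground, ocr) / len(ground))


ground_value = "Non sinking ship is friendship"
ocr_value = "non singing ship is finedship"

print(levenshtein(ground_value, ocr_value))         # 5
print(character_accuracy(ground_value, ocr_value))  # about 0.83
```

The comparison is case-sensitive and counts spaces, so "N" versus "n" is one error. Lowercasing or stripping whitespace first is a choice to make before calling either function.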

Regarding python - Calculating OCR accuracy, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/63531985/
