android-vision - How to force Android Mobile Vision to read a full line of text

Tags: android-vision text-recognition

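I have implemented Google Mobile Vision for Android by following the tutorial. I am trying to build an app that scans a receipt and finds the numeric total. However, when I scan receipts printed in different formats, the API detects TextBlocks in an arbitrary way. For example, on one receipt, several words separated by single spaces are grouped into one TextBlock. But if two words are separated by a large gap of whitespace, they are split into independent TextBlocks, even though they appear on the same "line". What I want is to force the API to recognize each full line of the receipt as a single entity. Is this possible?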

Best answer

public ArrayList<T> getAllGraphicsInRow(float rawY) {
    synchronized (mLock) {
        ArrayList<T> row = new ArrayList<>();
        // Get the position of this View so the raw location can be offset relative to the view.
        int[] location = new int[2];
        this.getLocationOnScreen(location);
        int width = this.getWidth();
        float y = rawY - location[1];
        for (T graphic : mGraphics) {
            // Probe every 10 px across the view at the tapped height.
            for (int i = 0; i < width; i += 10) {
                if (graphic.contains(i - location[0], y)) {
                    row.add(graphic);
                    break; // one hit is enough; avoids adding the same graphic twice
                }
            }
        }
        return row;
    }
}

This goes in GraphicOverlay.java and essentially collects every graphic that intersects the tapped row.
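// True when a and b differ by less than eps.
public static boolean almostEqual(double a, double b, double eps) {
    return Math.abs(a - b) < eps;
}

// Two points count as being on the same row when their y values differ by less than 10.
public static boolean pointAlmostEqual(Point a, Point b) {
    return almostEqual(a.y, b.y, 10);
}

// Two boxes are on the same row when every pair of corresponding corners is.
public static boolean cornerPointAlmostEqual(Point[] rect1, Point[] rect2) {
    if (rect1.length != rect2.length) {
        return false;
    }
    for (int i = 0; i < rect1.length; i++) {
        if (!pointAlmostEqual(rect1[i], rect2[i])) {
            return false;
        }
    }
    return true;
}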
private boolean onTap(float rawX, float rawY) {
    final Pattern pricePattern = Pattern.compile("(\\d+[,.]\\d\\d)");
    ArrayList<OcrGraphic> graphics = mGraphicOverlay.getAllGraphicsInRow(rawY);
    OcrGraphic currentGraphics = mGraphicOverlay.getGraphicAtLocation(rawX, rawY);
    if (graphics == null || currentGraphics == null) {
        return false;
    }
    List<? extends Text> currentComponents = currentGraphics.getTextBlock().getComponents();
    Log.i("text results", "This many in the row: " + graphics.size());

    // Collect the lines of every other TextBlock that intersects the tapped row.
    ArrayList<Text> combinedComponents = new ArrayList<>();
    for (OcrGraphic graphic : graphics) {
        if (!graphic.equals(currentGraphics)) {
            TextBlock text = graphic.getTextBlock();
            Log.i("text results", text.getValue());
            combinedComponents.addAll(text.getComponents());
        }
    }

    for (Text currentText : currentComponents) { // each line of the tapped block
        Point[] currentPoint = currentText.getCornerPoints();
        // Match once per line: calling find() again on the same Matcher would advance it.
        final Matcher matcher = pricePattern.matcher(currentText.getValue());
        final boolean currentIsPrice = matcher.find();

        for (Text otherCurrentText : combinedComponents) { // lines of the other blocks
            if (!cornerPointAlmostEqual(currentPoint, otherCurrentText.getCornerPoints())) {
                continue; // not on the same row
            }
            final Matcher otherMatcher = pricePattern.matcher(otherCurrentText.getValue());
            if (currentIsPrice) { // the tapped line is the price
                Log.i("oh yes", "Item: " + otherCurrentText.getValue());
                Log.i("oh yes", "Value: " + matcher.group(1));
                itemList.add(otherCurrentText.getValue());
                // Float.valueOf rejects a decimal comma, so normalize it first.
                priceList.add(Float.valueOf(matcher.group(1).replace(',', '.')));
            }
            if (otherMatcher.find()) { // the tapped line is the item
                Log.i("oh yes", "Item: " + currentText.getValue());
                Log.i("oh yes", "Value: " + otherMatcher.group(1));
                itemList.add(currentText.getValue());
                priceList.add(Float.valueOf(otherMatcher.group(1).replace(',', '.')));
            }
            Toast.makeText(this, "Text Captured!", Toast.LENGTH_SHORT).show();
        }
    }
    return true;
}

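This goes in OcrCaptureActivity.java. It breaks the tapped TextBlock into lines, finds the lines of the other blocks that sit on the same row, checks whether either component is a price, and logs the values accordingly.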
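
The price check can be tried outside Android. Below is a minimal sketch that reuses the same regex as onTap; the class name PriceParser and the method parsePrice are hypothetical names introduced only for this illustration. It normalizes a decimal comma before parsing, since Float.valueOf does not accept "12,50".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PriceParser {
    // Same pattern as in onTap: digits, a comma or dot, then exactly two digits.
    private static final Pattern PRICE = Pattern.compile("(\\d+[,.]\\d\\d)");

    /** Returns the first price found in the text, or null if there is none. */
    public static Float parsePrice(String text) {
        Matcher m = PRICE.matcher(text);
        if (!m.find()) {
            return null;
        }
        // Float.valueOf only understands a decimal point, so replace the comma.
        return Float.valueOf(m.group(1).replace(',', '.'));
    }

    public static void main(String[] args) {
        System.out.println(parsePrice("TOTAL   12,50"));
        System.out.println(parsePrice("Coffee 3.20 EUR"));
        System.out.println(parsePrice("Thank you"));
    }
}
```

Note that this simple pattern does not handle thousands separators: for a string such as "1,234.56" it would match "1,23" first, so receipts with large totals would likely need a stricter expression.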

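The eps value in almostEqual is the tolerance on the vertical position used when deciding whether two graphics are on the same row; pointAlmostEqual sets it to 10.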
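
To see how the tolerance merges separate detections into full lines, here is a minimal, Android-free sketch of the same idea. It is not the Mobile Vision API: Box is a hypothetical stand-in for a detected text element (its text plus the left and top coordinates of its bounding box), and RowGrouper and groupIntoLines are names assumed for this example. Boxes whose top edge lies within eps of the first box in a row are treated as one line and joined left to right.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RowGrouper {
    /** Hypothetical stand-in for a detected text element. */
    public static final class Box {
        final String text;
        final int left;
        final int top;

        public Box(String text, int left, int top) {
            this.text = text;
            this.left = left;
            this.top = top;
        }
    }

    /** Groups boxes into rows by top coordinate and returns one string per row. */
    public static List<String> groupIntoLines(List<Box> boxes, int eps) {
        // Sort top to bottom so that boxes of one row end up adjacent.
        List<Box> sorted = new ArrayList<>(boxes);
        sorted.sort(Comparator.comparingInt((Box b) -> b.top));

        List<List<Box>> rows = new ArrayList<>();
        for (Box b : sorted) {
            List<Box> last = rows.isEmpty() ? null : rows.get(rows.size() - 1);
            // Same row when the top edge is within eps of the row's first box.
            if (last != null && Math.abs(last.get(0).top - b.top) < eps) {
                last.add(b);
            } else {
                List<Box> row = new ArrayList<>();
                row.add(b);
                rows.add(row);
            }
        }

        List<String> lines = new ArrayList<>();
        for (List<Box> row : rows) {
            // Within a row, read left to right.
            row.sort(Comparator.comparingInt((Box b) -> b.left));
            StringBuilder sb = new StringBuilder();
            for (Box b : row) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(b.text);
            }
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

A larger eps tolerates skewed or slightly rotated receipts but risks merging adjacent lines when the print is small, so the value probably needs tuning against the line height of the receipts being scanned.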

Source: a similar question on Stack Overflow, "android-vision - How to force Android Mobile Vision to read a full line of text": https://stackoverflow.com/questions/42358661/
