Java Comparator using two different criteria

Tags: java comparator

I have the following comparator:

public static class WordComparator implements Comparator<Word> {
    @Override
    public int compare(Word word1, Word word2) {
        //TODO find a better way to determine threshold
        int threshold = 10; //allowed difference in height
        int word1y = (int)Math.round(word1.bbox.y1 * 1.0 / threshold);
        int word2y = (int)Math.round(word2.bbox.y1 * 1.0 / threshold);
        if (word1y == word2y) {
            return word1.bbox.x1 - word2.bbox.x1;
        }
        else {
            return word1y - word2y;
        }
    }
}

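You can use this comparator on any Collection<Word>. It should sort the words first by y1 (the y coordinate, here word1.bbox.y1) and second by x1 (the x coordinate, here word.bbox.x1). The current implementation also uses a mechanism that normalizes all words whose y values lie within 10 of each other.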
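As a rough illustration of how that normalization behaves, the sketch below applies the same rounding to a few y1 values. The class name BucketDemo and the helper bucket are hypothetical names introduced only for this example; 1455 and 1448 are the y1 values of word_50 and word_54 in the sample data, while 1464 is a made-up value. It suggests that the rounding creates fixed bands rather than a pairwise "within 10" test, which may be one reason the ordering is not what was expected.

```java
// Hypothetical helper class, not part of the original code.
class BucketDemo {
    static final int THRESHOLD = 10; // same threshold as in the comparator

    // Same expression the comparator uses to normalize a y coordinate.
    static int bucket(int y1) {
        return (int) Math.round(y1 * 1.0 / THRESHOLD);
    }

    public static void main(String[] args) {
        System.out.println(bucket(1455)); // word_50 "Totaal": prints 146
        System.out.println(bucket(1448)); // word_54 "70": prints 145
        System.out.println(bucket(1464)); // made-up value: prints 146
    }
}
```

Here 1455 and 1448 are only 7 apart yet land in different buckets, while 1455 and 1464 are 9 apart and share one.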

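However, I have come to the conclusion that my current code will not work. My question now is: how do I write a comparator that compares on two different fields? I already have the return values and so on; I just need to find the right way to do it.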

I hope you can help me with this.

Requested example output:

w = [word_50, [188, 1455, 280, 1482, 92, 27], false, Totaal]
w = [word_58, [1324, 1547, 1370, 1573, 46, 26], false, EU]
w = [word_59, [1465, 1546, 1568, 1577, 103, 31], false, 173,50]
w = [word_56, [300, 1558, 329, 1583, 29, 25], false, te]
w = [word_62, [381, 2082, 605, 2119, 224, 37], false, verkrijgbaar!]
w = [word_61, [305, 2093, 369, 2114, 64, 21], false, ons]
w = [word_65, [605, 2114, 650, 2166, 45, 52], false,  ]
w = [word_68, [184, 2258, 319, 2382, 135, 124], false,    ]
w = [word_72, [296, 2278, 349, 2319, 53, 41], false, J]
w = [word_73, [411, 2302, 470, 2322, 59, 20], false, ‚n.]
w = [word_74, [571, 2319, 602, 2320, 31, 1], false, ]
w = [word_76, [434, 2330, 635, 2357, 201, 27], false, Kerstkaarten]
w = [word_77, [338, 2367, 436, 2393, 98, 26], false, Bestel]
w = [word_69, [184, 2382, 338, 2409, 154, 27], false,  ]
w = [word_80, [1805, 2392, 1979, 2413, 174, 21], false, 37.45.08.070]
w = [word_82, [1745, 2430, 1881, 2458, 136, 28], false, Groningen]
w = [word_84, [1666, 2470, 1741, 2492, 75, 22], false, B.T.W.]
w = [word_86, [1795, 2469, 1981, 2492, 186, 23], false, 821.82.468.501]
w = [word_88, [1741, 2547, 1873, 2575, 132, 28], false, Algemene]
w = [word_108, [841, 2584, 1018, 2624, 177, 40], false, Betaling:]
w = [word_111, [1295, 2582, 1336, 2613, 41, 31], false, 14]
w = [word_102, [203, 2590, 261, 2630, 58, 40], false, Wij]
w = [word_107, [640, 2585, 825, 2627, 185, 42], false, opdracht.]
w = [word_90, [1666, 2593, 1695, 2609, 29, 16], false, en]
w = [word_104, [431, 2597, 454, 2620, 23, 23], false, u]
w = [word_106, [570, 2595, 628, 2619, 58, 24], false, uw]
w = [word_92, [1666, 2625, 1709, 2654, 43, 29], false, zijn]
w = [word_96, [1875, 2664, 1933, 2686, 58, 22], false, 1181]
w = [word_116, [561, 2683, 751, 2715, 190, 32], false, factuurnr.]
w = [word_119, [1108, 2678, 1321, 2710, 213, 32], false, vermelden.]
w = [word_114, [265, 2685, 423, 2724, 158, 39], false, betaling]
w = [word_117, [769, 2690, 815, 2713, 46, 23], false, en]
w = [word_98, [1708, 2703, 1739, 2726, 31, 23], false, de]
w = [word_101, [1863, 2703, 1999, 2730, 136, 27], false, Groningen]
w = [word_125, [828, 2772, 1359, 2813, 531, 41], false, administratie@biuemule.nl]
w = [word_123, [555, 2778, 646, 2809, 91, 31], false, deze]
w = [word_121, [309, 2787, 441, 2819, 132, 32], false, vragen]
w = [word_122, [455, 2787, 544, 2809, 89, 22], false, over]
w = [word_124, [660, 2777, 814, 2808, 154, 31], false, factuur:]
w = [word_120, [204, 2782, 298, 2812, 94, 30], false, Voor]
w = [word_100, [1829, 2705, 1853, 2725, 24, 20], false, te]
w = [word_99, [1750, 2704, 1816, 2725, 66, 21], false, K.v‚K.]
w = [word_97, [1668, 2704, 1696, 2733, 28, 29], false, bij]
w = [word_115, [435, 2692, 548, 2724, 113, 32], false, graag]
w = [word_113, [200, 2687, 254, 2727, 54, 40], false, Bij]
w = [word_118, [830, 2682, 1090, 2713, 260, 31], false, debiteurennr.]
w = [word_95, [1754, 2670, 1863, 2687, 109, 17], false, nummer]
w = [word_94, [1666, 2664, 1744, 2687, 78, 23], false, onder]
w = [word_93, [1721, 2624, 1893, 2654, 172, 30], false, gedeponeerd]
w = [word_105, [469, 2595, 559, 2620, 90, 25], false, voor]
w = [word_91, [1709, 2585, 1998, 2614, 289, 29], false, betalingsvoorwaarden]
w = [word_109, [1031, 2585, 1130, 2615, 99, 30], false, netto]
w = [word_103, [274, 2589, 416, 2622, 142, 33], false, danken]
w = [word_112, [1350, 2580, 1481, 2622, 131, 42], false, dagen.]
w = [word_110, [1144, 2583, 1278, 2614, 134, 31], false, binnen]
w = [word_89, [1883, 2547, 2006, 2575, 123, 28], false, leverings-]
w = [word_87, [1666, 2549, 1733, 2570, 67, 21], false, Onze]
w = [word_85, [1754, 2470, 1786, 2492, 32, 22], false, NL]
w = [word_83, [1894, 2430, 2020, 2452, 126, 22], false, 02045251]
w = [word_81, [1666, 2432, 1733, 2453, 67, 21], false, K.v.K.]
w = [word_79, [1666, 2391, 1794, 2414, 128, 23], false, Rabobank]
w = [word_78, [449, 2365, 528, 2398, 79, 33], false, tijdig]
w = [word_70, [528, 2339, 685, 2409, 157, 70], false,   ]
w = [word_75, [225, 2332, 420, 2359, 195, 27], false, INTERCARD]
w = [word_71, [224, 2323, 254, 2324, 30, 1], false, ]
w = [word_67, [635, 2290, 685, 2339, 50, 49], false,  ]
w = [word_66, [349, 2258, 650, 2290, 301, 32], false,  ]
w = [word_63, [425, 2123, 434, 2138, 9, 15], false, \I]
w = [word_64, [206, 2114, 650, 2258, 444, 144], false,   ]
w = [word_60, [248, 2085, 290, 2120, 42, 35], false, Bij]
w = [word_57, [341, 1557, 458, 1583, 117, 26], false, betalen]
w = [word_55, [188, 1558, 288, 1584, 100, 26], false, Totaal]
w = [word_51, [294, 1455, 368, 1480, 74, 25], false, BTW]
w = [word_54, [1536, 1448, 1571, 1473, 35, 25], false, 70]

The input is the same list, but in arbitrary order. The "encoding" used here is: w = [word.id, [word.bbox.x1, word.bbox.y1, word.bbox.x2, word.bbox.y2, word.bbox.width, word.bbox.height], word.isStrong, word.content].

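So you should only look at the word.bbox.y1 and word.bbox.x1 values. As you can see, the order is clearly not random: it currently forms a kind of parabola around the y values.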

Best answer

You should definitely take a look at Apache's CompareToBuilder (part of Apache Commons Lang).

Then you can do this:

public int compare(Word word1, Word word2) {
    int threshold = 10; //allowed difference in height
    int word1y = (int)Math.round(word1.bbox.y1 * 1.0 / threshold);
    int word2y = (int)Math.round(word2.bbox.y1 * 1.0 / threshold);
    return new CompareToBuilder()
       .append(word1y, word2y)
       .append(word1.bbox.x1, word2.bbox.x1)
       .toComparison();
}
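If adding a library is not an option, the same two-key comparison can be sketched with the JDK alone (Java 8 or later) using Comparator.comparingInt and thenComparingInt. This is an alternative to CompareToBuilder, not what the answer itself proposes. The Word and BBox classes below are minimal hypothetical stand-ins for the question's types, and WordOrdering, row and BY_ROW_THEN_X are names made up for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical minimal stand-ins for the question's Word and bbox types.
class BBox {
    final int x1;
    final int y1;
    BBox(int x1, int y1) { this.x1 = x1; this.y1 = y1; }
}

class Word {
    final String content;
    final BBox bbox;
    Word(String content, int x1, int y1) {
        this.content = content;
        this.bbox = new BBox(x1, y1);
    }
}

class WordOrdering {
    static final int THRESHOLD = 10; // same threshold as in the question

    // Same bucketing as the question: y1 / threshold, rounded to the nearest integer.
    static int row(Word w) {
        return (int) Math.round(w.bbox.y1 * 1.0 / THRESHOLD);
    }

    // Primary key: row bucket; secondary key: x1.
    static final Comparator<Word> BY_ROW_THEN_X =
            Comparator.comparingInt(WordOrdering::row)
                      .thenComparingInt(w -> w.bbox.x1);

    public static void main(String[] args) {
        List<Word> words = new ArrayList<>();
        words.add(new Word("betalen", 341, 1557));
        words.add(new Word("Totaal", 188, 1558));
        words.add(new Word("te", 300, 1558));
        words.sort(BY_ROW_THEN_X);
        for (Word w : words) {
            System.out.println(w.content); // prints Totaal, te, betalen
        }
    }
}
```

The int comparisons here go through Integer.compare internally, so they avoid the overflow that a plain subtraction such as word1.bbox.x1 - word2.bbox.x1 could hit with very large or negative coordinates.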

This post on a Java comparator using two different criteria is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/17837313/
