java - 为什么这个方法的时间复杂度是2*O(n log n) + O(m log m)？

下面的代码尝试检查 searchWords 中的所有单词是否都出现在 newsPaperWords 中。两个列表都可以包含重复项。如果某个单词在 searchWords 中出现 n 次，则它必须在 newsPaperWords 中至少出现 n 次，该方法才能返回 true。我以为时间复杂度是2*O(n) + O(m)但是面试官告诉我是2*O(n log n) + O(m log m )。

/**
 * @param searchWords The words we're looking for. Can contain duplicates
 * @param newsPaperWords  The list to look into
 */
public boolean wordMatch(List<String> searchWords, List<String> newsPaperWords) {
    Map<String, Integer> searchWordCount = getWordCountMap(searchWords);
    Map<String, Integer> newspaperWordCount = getWordCountMap(newsPaperWords);
    for (Map.Entry<String, Integer> searchEntry : searchWordCount.entrySet()) {
        Integer occurrencesInNewspaper = newspaperWordCount.get(searchEntry.getKey());
        if (occurrencesInNewspaper == null || occurrencesInNewspaper < searchEntry.getValue()) {
            return false;
        }
    }
    return true;
}

private Map<String, Integer> getWordCountMap(List<String> words) {
    Map<String, Integer> result = new HashMap<>();
    for (String word : words) {
        Integer occurrencesThisWord = result.get(word);
        if (occurrencesThisWord == null) {
            result.put(word, 1);
        } else {
            result.put(word, occurrencesThisWord + 1);
        }
    }
    return result;
}

据我所知，该方法的时间复杂度为 2*O(n) + O(m)(n 是 searchWords 中的元素数量， m newsPaperWords 中的元素数量):

方法 getWordCountMap() 的复杂度为 O(n)，其中 n 是给定列表中的元素数量。该方法循环列表一次，并假设对 result.get(word) 和 result.put() 的调用为 O(1) >.
然后，在最坏的情况下，对 searchWordCount.entrySet() 的迭代是 O(n)，再次假设调用 Hashmap。 get() 的时间复杂度为 O(1)。

因此，只需添加 O(n) + O(m) 即可构建两个 map ，再加上 O(n) 进行最后一次查看。

阅读后this answer ，将 O(n) 作为 HashMap.get() 最坏情况的复杂度，我可以理解 getWordCountMap() 的复杂度最高可达 O(n*2n)，最终循环可达 O(n*n)，总复杂度为 O(n*2n) + O(m*2m) + O(n*n)。

但是怎么2*O(n log n) + O(m log m)？

最佳答案

由于 JEP 180: Handle Frequent HashMap Collisions with Balanced Trees HashMap.get() 操作的最坏情况将是 O(log n)。引用 JEP 180:

The principal idea is that once the number of items in a hash bucket grows beyond a certain threshold, that bucket will switch from using a linked list of entries to a balanced tree. In the case of high hash collisions, this will improve worst-case performance from O(n) to O(log n).

这将使 getWordCountMap() 方法O(n log n)。

关于java - 为什么这个方法的时间复杂度是2*O(n log n) + O(m log m)？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/50027872/

java - 为什么这个方法的时间复杂度是2*O(n log n) + O(m log m)？

上一篇：java - 使用位操作验证 ascii 字符串是否包含唯一字符，而不使用其他数据结构

下一篇：algorithm - 快速计算 x ^ (1/y) mod m(模根)