algorithm - 在 O(n) 相交中排序

Let S1 and S2 be two sets of integers (they are not necessarily disjoint).

We know that |S1| = |S2| = n (i.e. each set has n integers).

Each set is stored in an array of length n, where its integers are sorted in ascending order.

Let k ≥ 1 be an integer.

Design an algorithm to find the k smallest integers in S1 ∩ S2 in O(n) time.

这是我目前所拥有的:

创建一个名为Intersection的新数组
对于 S1 中的每个 e 添加 e 到 hashset O(n) 时间内
对于 S2 中的每个 e 检查 e 是否存在于哈希集中 O(n) 时间<
如果 e 存在于哈希集中，将 e 添加到 Intersection
完成比较后，按计数排序 Intersection O(n) 时间
返回前 k 个整数

因此 O(n) + O(n) + O(n) = O(n)

我走在正确的轨道上吗？

最佳答案

是的，您绝对是在正确的轨道上，但实际上根本没有必要生成哈希表或额外的集合。由于您的两个集合已经排序，您可以简单地通过它们运行索引/指针，寻找共同的元素。

例如，要从两个集合中找到第一个公共(public)元素，请使用以下伪代码:

start at first index of both sets
while more elements in both sets, and current values are different:
    if set1 value is less than set2 value:
        advance set1 index
    else
        advance set2 index

最后，set1 index 将引用一个交点，前提是两个索引都没有超出各自列表中的最后一个元素。然后，您可以在循环中使用该方法来查找第一个 x 交集值。

这是 Python 3 中的概念证明，它为您提供了两个列表中的前三个数字(二的倍数和三的倍数)。完整的交集是 {0, 6, 12, 18, 24} 但您会看到它只会提取其中的前三个:

# Create the two lists to be used for intersection.

set1 = [i * 2 for i in range(15)] ; print(set1) # doubles
set2 = [i * 3 for i in range(15)] ; print(set2) # trebles

idx1 = 0 ; count1 = len(set1)
idx2 = 0 ; count2 = len(set2)

# Only want first three.

need = 3
while need > 0:
    # Continue until we find next intersect or end of a list.

    while idx1 < count1 and idx2 < count2 and set1[idx1] != set2[idx2]:
        # Advance pointer of list with lowest value.

        if set1[idx1] < set2[idx2]:
            idx1 += 1
        else:
            idx2 += 1

    # Break if reached end of a list with no intersect.

    if idx1 >= count1 or idx2 >= count2:
        break

    # Otherwise print intersect and advance to next list candidate.

    print(set1[idx1]) ; need -= 1
    idx1 += 1 ; idx2 += 1

输出如预期的那样:

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
[0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42]
0
6
12

如果您在最后需要一个列表而不是仅仅打印出相交点，您只需在循环之前初始化一个空容器并将值附加到它而不是打印它。这会变得有点像您提出的解决方案，但具有不需要哈希表或排序的优势。

关于algorithm - 在 O(n) 相交中排序，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/39444486/

algorithm - 在 O(n) 相交中排序

上一篇：algorithm - 如何找到二维数组中的最大圆

下一篇：c - C 中的 Dijkstra 算法