python - 使用 Python 联合查找实现

所以这就是我想要做的:我有一个包含几个等价关系的列表:

l = [[1, 2], [2, 3], [4, 5], [6, 7], [1, 7]]

我想合并共享一个元素的集合。这是一个示例实现:

def union(lis):
  lis = [set(e) for e in lis]
  res = []
  while True:
    for i in range(len(lis)):
      a = lis[i]
      if res == []:
        res.append(a)
      else:
        pointer = 0 
        while pointer < len(res):
          if a & res[pointer] != set([]) :
            res[pointer] = res[pointer].union(a)
            break
          pointer +=1
        if pointer == len(res):
          res.append(a)
     if res == lis:
      break
    lis,res = res,[]
  return res

然后打印

[set([1, 2, 3, 6, 7]), set([4, 5])]

这做对了，但是当等价关系太大时，速度太慢了。我查了联合查找算法的描述:http://en.wikipedia.org/wiki/Disjoint-set_data_structure 但我在编写 Python 实现代码时仍然遇到问题。

最佳答案

在 `O(n)` 中运行的解决方案时间

def indices_dict(lis):
    d = defaultdict(list)
    for i,(a,b) in enumerate(lis):
        d[a].append(i)
        d[b].append(i)
    return d

def disjoint_indices(lis):
    d = indices_dict(lis)
    sets = []
    while len(d):
        que = set(d.popitem()[1])
        ind = set()
        while len(que):
            ind |= que 
            que = set([y for i in que 
                         for x in lis[i] 
                         for y in d.pop(x, [])]) - ind
        sets += [ind]
    return sets

def disjoint_sets(lis):
    return [set([x for i in s for x in lis[i]]) for s in disjoint_indices(lis)]

工作原理:

>>> lis = [(1,2),(2,3),(4,5),(6,7),(1,7)]
>>> indices_dict(lis)
>>> {1: [0, 4], 2: [0, 1], 3: [1], 4: [2], 5: [2], 6: [3], 7: [3, 4]})

indices_dict给出从等价物 # 到 lis 中索引的映射.例如。 1映射到索引 0和 4在lis .

>>> disjoint_indices(lis)
>>> [set([0,1,3,4], set([2])]

disjoint_indices给出一组不相交的索引列表.每个集合对应于一个等价的索引。例如。 lis[0]和 lis[3]具有相同的等价性但不是 lis[2] .

>>> disjoint_set(lis)
>>> [set([1, 2, 3, 6, 7]), set([4, 5])]

disjoint_set将不相交的索引转换为适当的等价物。

时间复杂度

O(n) 时间复杂度很难看出，但我会尽力解释。在这里我将使用 n = len(lis) .

indices_dict当然在 O(n) 中运行时间因为只有 1 个 for 循环
disjoint_indices是最难看到的。它肯定在 O(len(d)) 中运行d 时外循环停止的时间为空，内部循环删除了 d 的一个元素每次迭代。现在，len(d) <= 2n自 d是从等价#'s 到索引的映射，最多有 2n lis 中的不同等价# .因此，函数运行在O(n)。 .
disjoint_sets由于 3 个组合的 for 循环，很难看到。但是，您会注意到最多 i可以跑遍所有n lis 中的索引和 x遍历 2 元组，因此总复杂度为 2n = O(n)

关于python - 使用 Python 联合查找实现，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/20154368/

python - 使用 Python 联合查找实现

在 `O(n)` 中运行的解决方案时间

工作原理:

时间复杂度

上一篇：python - 在 kivy 中的触摸事件上旋转对象

下一篇：python - 如何使用 GitPython checkout 标签

python - 使用 Python 联合查找实现

在 O(n) 中运行的解决方案时间

工作原理:

时间复杂度

上一篇：python - 在 kivy 中的触摸事件上旋转对象

下一篇：python - 如何使用 GitPython checkout 标签

在 `O(n)` 中运行的解决方案时间