python - 是否有一种快速算法可以将集合的所有分区生成为大小为 2 的子集(以及一个大小为 1 的子集)？

如标题中所述，我试图生成一组大小为 n 的所有分区，其中所有子集的大小均为 2，如果 n 不均匀，则存在新的单例集。我稍微修改了一些用于生成所有分区的 SO 代码来得到这个:

def partitionIntoPairs(collection):
    if len(collection) == 1:
        yield [ collection ]
        return

    first = collection[0]
    for smaller in partition2(collection[1:]):
        for n, subset in enumerate(smaller):
            if len(subset):
                yield smaller[:n] + [[ first ] + subset]  + smaller[n+1:]
        yield [ [ first ] ] + smaller

这可行，但遗憾的是速度太慢了。我的第二个想法是使用 itertools.combinations 生成某个集合的所有对，然后递归地为每个 det 调用该函数，而不删除给定的对，但我猜这甚至更慢。另外，实现不正确，它只返回一个可能的分区，我不确定如何让它返回所有分区:

from itertools import combinations

def partitionIntoPairs2(collection):
    if not collection:
        return []
    elif len(collection) == 1:
        return [(next(iter(collection)))]
    else: 
        pairs = set(combinations(collection, 2))
        for pair in pairs: 
            collection.remove(pair[0])
            collection.remove(pair[1])
            return partition3(collection) + [pair]

我偶然发现了一些用于具有给定数量的集合的分区的算法，以及生成所有可能的分区的算法的各种实现，但据我所知，这些都没有有效地解决我的问题。

因此，提出一个更具体的问题:如果第二种算法是可行的选择，那么正确的实现是什么？当然，有没有更快的方法来做到这一点？如果是，怎么办？

最佳答案

分区应被视为一组，仅顺序不同的两个分区应被视为同一分区。所以数字集只有3个分区(1,2,3,4)。

分区的数量应该是N!/(N/2)!/2^(N/2)。使用斯特林公式，大约为。 Sqrt(2)*(N/e)^(N/2) 其中 e=2.71828...而且非常巨大。

我利用了@VirtualScooter的代码并提供了分区的递归版本，它比他的itertools版本运行得更快(注意这不是苹果与苹果的比较，因为我的分区没有重复)。


import itertools
import timeit
t3 = (1, 2, 3)
t4 = (1, 2, 3, 4)
t6 = (1, 2, 3, 4, 5, 6)

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks or blocks.
        Code from Python itertools page
    """
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

def partitionIntoPairs(collection):
    perms = itertools.permutations(collection)
    for p in perms:
        group = list(grouper(p, 2))
        if group[-1][-1] is None:
            group[-1] = (group[-1][0],)
        yield group

def Partition(Indexable):
    if len(Indexable)<=2:
        yield [Indexable]
    elif len(Indexable)%2==1:
        for i,x in enumerate(Indexable):
            for s in Partition(Indexable[:i]+Indexable[i+1:]):
                yield [[x]]+s
    else:
        for i,x in enumerate(Indexable):
            if i==0:
                x0=x
            else:
                for s in Partition(Indexable[1:i]+Indexable[i+1:]):
                    yield [[x0,x]]+s
def comp_timeit(collection, repeats=1_000):
    s1 = f"l1 = list(Partition({collection}))"
    s2 = f"l1 = list(partitionIntoPairs({collection}))"
    t1 = timeit.timeit(s1, globals=globals(),number=repeats)
    t2 = timeit.timeit(s2, globals=globals(),number=repeats)
    print(f"partition, {repeats:_} runs: {t1:,.4f}")
    print(f"itertools, {repeats:_} runs: {t2:,.4f}")
for p in Partition(t4):
    print(p)
comp_timeit(t3)
comp_timeit(t4)
comp_timeit(t6)

关于python - 是否有一种快速算法可以将集合的所有分区生成为大小为 2 的子集(以及一个大小为 1 的子集)？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/66969865/

python - 是否有一种快速算法可以将集合的所有分区生成为大小为 2 的子集(以及一个大小为 1 的子集)？

上一篇：kotlin - 用于 build.gradle.kts 的 Jaxb。如何将其从groovy翻译成kotlin？

下一篇：Anaconda-JupyterLab-spaCy : "No module named spacy" in terminal