python - Creating a "slice notation" style list from a set of numbers in Python

Tags: python list set subset slice

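Suppose I have a set of around 100,000 distinct numbers. Some are consecutive and some are not.

To demonstrate the problem, a small subset of these numbers might be:

(a) {1,2,3,4,5,6,7,8,9,11,13,15,45,46,47,3467}

An efficient way of writing this subset is as follows:

(b) 1:9:1,11:15:2,45:47:1,3467

This is effectively an extended version of Python and MATLAB slice notation, where each entry reads start:stop:step and the stop value is included.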
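To make the notation concrete, here is a minimal sketch that expands a string in form (b) back into the set in form (a). The name `expand_notation` is hypothetical, and the sketch assumes every entry is either a bare integer or `start:stop:step` with an inclusive stop and a positive step.

```python
def expand_notation(text):
    """Expand a string such as '1:9:1,11:15:2,3467' into a set of ints.

    Assumes every entry is either a bare integer or 'start:stop:step'
    with an inclusive stop and a positive step.
    """
    numbers = set()
    for entry in text.split(","):
        fields = [int(field) for field in entry.split(":")]
        if len(fields) == 1:
            numbers.add(fields[0])
        else:
            start, stop, step = fields
            # range() excludes its end, so add 1 to keep the stop value.
            numbers.update(range(start, stop + 1, step))
    return numbers


subset_a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13, 15, 45, 46, 47, 3467}
assert expand_notation("1:9:1,11:15:2,45:47:1,3467") == subset_a
```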

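My question is: how can I efficiently obtain a list in the latter notation from a list of the former type, in Python?

That is, given (a), how can I efficiently obtain (b) in Python?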

Best answer

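I think I've got it, but the code below has not been thoroughly tested and may contain bugs.

Basically, get_partial_slices tries to build partial_slice objects from the sorted set. When the next number does not .fit() into the current slice, that slice is .end()ed and a new one is started.

If a slice contains only 1 item (or 2 items with step != 1), it is represented as separate numbers rather than as a slice (hence the need for yield from current.end(), since ending a slice may produce two numbers instead of one slice).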
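The rule for short runs can be sketched on its own as a small function. This is only an illustration of the rule described above, not the answer's code; `finish_run`, `start`, `step` and `count` are hypothetical names, and `count` is the number of items collected in the run.

```python
def finish_run(start, step, count):
    """Return what a finished run contributes: one slice or bare numbers."""
    if count == 1:
        # A single number never becomes a slice.
        return [start]
    if count > 2 or step == 1:
        # Exclusive stop, as with ordinary Python slice objects.
        return [slice(start, start + count * step, step)]
    # Two items with step != 1 are written as separate numbers.
    return [start, start + step]


assert finish_run(3467, None, 1) == [3467]
assert finish_run(45, 1, 3) == [slice(45, 48, 1)]
assert finish_run(6, 1, 2) == [slice(6, 8, 1)]
assert finish_run(1, 4, 2) == [1, 5]
```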

class partial_slice:
    """Helper that get_partial_slices relies on heavily.

    Builds a slice by repeatedly adding numbers. Once a number that
    doesn't fit the slice is found, use .end() to generate either the
    slice or the individual numbers.
    """
    def __init__(self, n):
        self.start = n
        self.stop = None
        self.step = None
    def fit(self, n):
        "Return True if n fits as the next element of the slice, or False if it does not."
        if self.step is None:
            return True #always take the second element into consideration
        elif self.stop == n:
            return True #n fits perfectly with current stop value
        else:
            return False

    def add(self, n):
        """Add a number to the end of the slice.

        Raises a ValueError if the number doesn't fit."""
        if not self.fit(n):
            raise ValueError("{} does not fit into the slice".format(n))
        if self.step is None:
            self.step = n - self.start
        self.stop = n+self.step

    def to_slice(self):
        "return slice(self.start, self.stop, self.step)"
        return slice(self.start, self.stop, self.step)
    def end(self):
        "Generate at most 2 items (one slice, or one or two numbers); may split up small slices."
        if self.step is None:
            yield self.start
            return
        length = (self.stop - self.start)//self.step
        if length>2:
            # always keep slices that contain more than 2 items
            yield self.to_slice()
            return
        elif self.step==1 and length==2:
            yield self.to_slice()
            return
        else:
            yield self.start
            yield self.stop - self.step


def get_partial_slices(set_):
    data = iter(sorted(set_))
    try:
        current = partial_slice(next(data))
    except StopIteration:
        return  # empty input: nothing to yield
    for n in data:
        if current.fit(n):
            current.add(n)
        else:
            yield from current.end()
            current = partial_slice(n)
    yield from current.end()


test_case = {1,2,3,4,5,6,7,8,9,11,13,15,45,46,47,3467}
result = tuple(get_partial_slices(test_case))

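# slice_set_creator is from my other answer and is not shown here, so the
# stand-in check below expands each slice with range() instead. It verifies
# that the result covers exactly the same numbers as the test case.
expanded = set()
for item in result:
    if isinstance(item, slice):
        expanded.update(range(item.start, item.stop, item.step))
    else:
        expanded.add(item)
assert test_case == expanded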

def slice_formatter(obj):
    if isinstance(obj,slice):
        # The actual slice objects, like all indexing in Python, don't include the stop value.
        # The stop is adjusted here when printing rather than when the slice is created,
        # because the slice objects can be used directly in code (like with slice_set_creator).
        inclusive_stop = obj.stop - obj.step
        return "{0.start}:{stop}:{0.step}".format(obj, stop=inclusive_stop)
    else:
        return repr(obj)

print(", ".join(map(slice_formatter,result)))
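For readers who want to try the idea without the class, the following is a compact, self-contained sketch of the same greedy grouping that builds the string in form (b) directly. It is a restatement under assumptions, not the answer's code: `to_slice_notation` is a hypothetical name, and the input is assumed to be a collection of distinct integers.

```python
def to_slice_notation(numbers):
    """Greedy sketch: format distinct ints as 'start:stop:step' runs.

    Stops are inclusive in the output string, as in notation (b).
    """
    data = sorted(numbers)
    n = len(data)
    parts = []
    i = 0
    while i < n:
        if i + 1 == n:
            # The last number is left on its own.
            parts.append(str(data[i]))
            break
        # The second element of a run always fixes the step.
        step = data[i + 1] - data[i]
        j = i + 1
        # Extend the run while the same step continues.
        while j + 1 < n and data[j + 1] - data[j] == step:
            j += 1
        count = j - i + 1
        if count > 2 or step == 1:
            parts.append("{}:{}:{}".format(data[i], data[j], step))
        else:
            # Two items with step != 1 are written as separate numbers.
            parts.append(str(data[i]))
            parts.append(str(data[j]))
        i = j + 1
    return ",".join(parts)


test_case = {1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13, 15, 45, 46, 47, 3467}
print(to_slice_notation(test_case))     # 1:9:1,11:15:2,45:47:1,3467
print(to_slice_notation({1, 5, 6, 7}))  # 1,5,6:7:1
```

The second call shows a limit of the greedy approach: because the second number always joins the current run, 5 is paired with 1 rather than with 6 and 7, so the result is valid but not the shortest possible.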

Regarding python - Creating a "slice notation" style list from a set of numbers in Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37924055/
