python - Avoiding padded None values when splitting a generator into batches

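I have thousands of IDs extracted from a CSV file (now available as an ID generator) that I need to iterate over and process.

To optimize the code, I group these IDs and process a whole batch at a time.

The following code partitions the generator into batches of size n.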

from itertools import zip_longest

def grouper(n, iterable):
    """Group items from an iterable into tuples of n items.

    :param n: number of values in a group
    :param iterable: iterable (or string) to split into groups
    :return: iterator of tuples; the last tuple is padded with None

    grouper(3, 'abcdefg') --> ('a','b','c'), ('d','e','f'), ('g', None, None)
    """
    # n references to one shared iterator, so each tuple takes n consecutive items
    return zip_longest(*[iter(iterable)]*n)

For example:

>>> acc_ids = ['ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47', 'ID54', 'ID58']
>>> # As an iterator
>>> id_generator = (i for i in acc_ids)
>>> batches = grouper(7, id_generator)
>>> batches
<itertools.zip_longest object at 0x7f3beb3313b8>
# This iterator is equivalent to the list below; note the padded `None` values at the end of the last batch:
# [('ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47'), ('ID54', 'ID58', None, None, None, None, None)]
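
The padding comes from how `zip_longest` works: `[iter(iterable)] * n` repeats one and the same iterator n times, so each output tuple pulls n consecutive items, and once that iterator runs dry the remaining slots are filled with `fillvalue`, which defaults to `None`. A minimal sketch of the mechanism (the names `it`, `slots`, and `groups` are only for illustration):

```python
from itertools import zip_longest

it = iter("abcdefg")
slots = [it] * 3  # three references to the SAME iterator object
assert slots[0] is slots[1] is slots[2]

# Each tuple takes one item per slot, i.e. three consecutive items.
groups = list(zip_longest(*slots))
print(groups)
# [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', None, None)]

# The filler is configurable through the fillvalue keyword.
dashed = list(zip_longest(*[iter("abcdefg")] * 3, fillvalue="-"))
print(dashed)
# [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', '-', '-')]
```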

The problem is that, to remove the padded None values from each batch, I am using filter:

for batch in batches:
    batch = list(filter(None, batch))

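This filter removes the None values from the list. But instead of adding an extra filter, I was wondering whether we can prevent the padded None values from being generated while splitting the generator in the first place...

Questions:

Is there any other way to split a large generator into batches without None/null values being appended to the end of the last batch?
Or
Can we change the grouper function above so that it does not generate the padded None values?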
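
One caveat with `filter(None, batch)`: it drops every falsy item, not only `None`, so an ID such as `0` or an empty string would vanish as well. If the data might contain such values, an explicit `is not None` test is safer. A small sketch with made-up sample data:

```python
# Hypothetical batch that contains falsy IDs next to the None padding.
batch = ("ID54", 0, "", None, None)

# filter(None, ...) keeps only truthy items, so 0 and "" are lost too.
truthy_only = list(filter(None, batch))
print(truthy_only)  # ['ID54']

# Comparing against None removes only the padding.
without_padding = [item for item in batch if item is not None]
print(without_padding)  # ['ID54', 0, '']
```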

Best answer

This might work for you:

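from itertools import islice

def grouper(n, iterable):
    # One shared iterator, so each islice call continues where the last one stopped.
    iter_ = iter(iterable)
    while True:
        # islice replaces next() inside a generator expression, which raises
        # RuntimeError on Python 3.7+ (PEP 479) once the iterator is exhausted.
        res = tuple(islice(iter_, n))
        if not res:
            return
        yield res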


acc_ids = ['ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47', 'ID54', 'ID58']
id_generator = iter(acc_ids)
batches = grouper(7, id_generator)
print(list(batches))

Output:

[('ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47'), ('ID54', 'ID58')]
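
As an alternative, Python 3.12 added `itertools.batched`, which yields tuples of up to n items and leaves the final batch short instead of padding it. The sketch below uses only the standard library; the fallback function for older interpreters is an illustrative stand-in, not the stdlib implementation. Note that the argument order is `(iterable, n)`, the reverse of `grouper`.

```python
from itertools import islice

try:
    from itertools import batched  # available in Python 3.12+
except ImportError:
    def batched(iterable, n):
        """Illustrative fallback: yield tuples of up to n items."""
        if n < 1:
            raise ValueError("n must be at least one")
        iterator = iter(iterable)
        while True:
            batch = tuple(islice(iterator, n))
            if not batch:
                return
            yield batch

acc_ids = ['ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47', 'ID54', 'ID58']
result = list(batched(iter(acc_ids), 7))
print(result)
# [('ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47'), ('ID54', 'ID58')]
```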

Regarding python - Avoiding padded None values when splitting a generator into batches, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41244043/
