python - Can collections.deque be extended to build a "file buffer"?

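I want to build a circular file buffer in Python that holds file names (strings). The buffer should have the following properties.

The size of the buffer is the sum of the sizes of the files whose names are stored in it. The buffer has a maximum allowed size.
When a new file is added, its name is stored if the buffer size is still below the maximum allowed size. Otherwise, the file with the oldest modification time is pushed out and the new file is added. If the new file is older than every file already in the buffer, nothing happens.

Is it possible to extend deque for this purpose?

Or should I write it from scratch? Are there any design ideas I could use for this?

Thanks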


Best answer

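OK, I believe Raymond Hettinger's interpretation of your question is correct, and your comments have made it clear that you don't care about the length of the queue but about the sum of all file sizes. That makes more sense, and I'm glad I finally understand what you mean. With that in mind, here is a simple heapq-based implementation that I believe meets all of your stated requirements. Use it by putting (timestamp, filesize, filename) tuples on the queue, and note that when you get an item from the queue, it will be the oldest file (i.e. the one with the smallest timestamp).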

import heapq

class FilenameQueue(object):
    """Min-heap of (time, size, name) tuples, capped by total file size."""
    def __init__(self, times_sizes_names, maxsize):
        self.maxsize = maxsize
        # Copy first so that a one-shot iterator is not exhausted by sum().
        self.files = list(times_sizes_names)
        self.size = sum(s for t, s, n in self.files)
        heapq.heapify(self.files)
        # Drop the oldest entries until the total size fits.
        while self.size > self.maxsize:
            self.get()
    def __len__(self):
        return len(self.files)
    def put(self, time_size_name):
        heapq.heappush(self.files, time_size_name)
        self.size += time_size_name[1]
        # Evict the oldest entries (possibly the new one) until the total
        # fits; a single eviction may not free enough space.
        while self.size > self.maxsize:
            self.get()
    def get(self):
        # Pop and return the entry with the smallest timestamp.
        time_size_name = heapq.heappop(self.files)
        self.size -= time_size_name[1]
        return time_size_name

I added a __len__ method so that you can test the queue before getting items from it. Here is a usage example:

>>> f = FilenameQueue(((22, 33, 'f1'), (44, 55, 'f2'), (33, 22, 'f3')), 150)
>>> while f:
...     f.get()
... 
(22, 33, 'f1')
(33, 22, 'f3')
(44, 55, 'f2')
>>> f = FilenameQueue(((22, 33, 'f1'), (44, 55, 'f2'), (33, 22, 'f3')), 150)
>>> f.put((55, 66, 'f4'))
>>> while f:
...     f.get()
... 
(33, 22, 'f3')
(44, 55, 'f2')
(55, 66, 'f4')
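
The examples above use made-up timestamps and sizes. For real files, the tuples could be built from os.stat. The following is only a sketch: the helper name file_entry is hypothetical and not part of the original answer, and it assumes the path exists and that the modification time is the timestamp you want to order by.

```python
import os

def file_entry(path):
    """Return a (mtime, size, path) tuple for an existing file.

    Hypothetical helper: the tuple layout matches the
    (timestamp, filesize, filename) order the queue expects.
    """
    st = os.stat(path)
    # st_mtime is the last-modification time in seconds since the epoch;
    # st_size is the file size in bytes.
    return (st.st_mtime, st.st_size, path)
```

With such a helper, adding a file would look like f.put(file_entry(path)). Note that two files with identical modification times fall back to comparing size and then name, which is harmless here because all three fields are comparable.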

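See my edit history for a completely different, suboptimal solution involving Queue.PriorityQueue. I had forgotten that maxsize enforces its limit by blocking rather than by discarding elements. That's not very useful!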
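
To illustrate why the PriorityQueue approach does not fit, here is a small sketch. It assumes Python 3, where the module is named queue rather than Queue. It shows that maxsize counts items (not bytes) and that a full queue refuses new items instead of evicting old ones; a plain put() would block, so the non-blocking put_nowait() is used to make the effect visible.

```python
import queue

# maxsize limits the number of items, not the sum of file sizes.
pq = queue.PriorityQueue(maxsize=1)
pq.put_nowait((22, 33, 'f1'))

try:
    # The queue is full: nothing is evicted, the new item is rejected.
    pq.put_nowait((44, 55, 'f2'))
    rejected = False
except queue.Full:
    rejected = True

print(rejected)     # True
print(pq.qsize())   # 1
```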

Regarding "python - Can collections.deque be extended to build a "file buffer"?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10167839/
