I've run into a problem handling these files because of their size; they are gradually getting larger and will keep growing in the future. Because of a limitation of the third-party application that I upload the compressed files to, deflate is the only compression option I can use.
The server the script runs on has limited memory, so the usual memory problems occur; that is why I am trying to read in chunks and write in chunks, the output being the required compressed file.
Up to now I have been using the snippet below to compress files to reduce their size, and it worked fine until a couple of large files had to be processed/compressed.
import zlib
with open(file_path_partial, 'rb') as file_upload, open(file_path, 'wb') as file_compressed:
    file_compressed.write(zlib.compress(file_upload.read()))
I have tried a few different options to work around this, but so far none of them has worked properly.
1)
import gzip
import shutil
with open(file_path_partial, 'rb') as file_upload:
    with open(file_path, 'wb') as file_compressed:
        with gzip.GzipFile(file_path_partial, 'wb', fileobj=file_compressed) as file_compressed:
            shutil.copyfileobj(file_upload, file_compressed)
2)
import zlib
BLOCK_SIZE = 64
compressor = zlib.compressobj(1)
filename = file_path_partial
with open(filename, 'rb') as input:
    with open(file_path, 'wb') as file_compressed:
        while True:
            block = input.read(BLOCK_SIZE)
            if not block:
                break
            file_compressed.write(compressor.compress(block))
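        # Note: compressor.flush() is never called after this loop, so data
        # still buffered inside the compressobj is never written out and the
        # resulting deflate stream is left incomplete.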
Best Answer
The example below reads 64k blocks, modifies each block, and writes it out to a gzip file.
Is this what you want?
import gzip
with open("test.txt", "rb") as fin, gzip.GzipFile("modified.txt.gz", "w") as fout:
    while True:
        block = fin.read(65536)  # read in 64k blocks
        if not block:
            break
        # comment next line to just write through
        block = block.replace(b"a", b"A")
        fout.write(block)
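Note that this writes a gzip container, while the question says only deflate can be used. If the third-party application expects the same zlib/deflate stream that zlib.compress produced in the original snippet, a minimal sketch along the lines of the second attempt (reusing file_path_partial and file_path from the question; the chunk size here is just a chosen constant) could look like this; the key detail is the final flush(), which writes out whatever is still buffered and terminates the stream:

import zlib

CHUNK_SIZE = 64 * 1024            # 64 KiB per read keeps memory usage flat
compressor = zlib.compressobj(1)  # same compression level as in the second attempt

with open(file_path_partial, 'rb') as file_upload, open(file_path, 'wb') as file_compressed:
    while True:
        block = file_upload.read(CHUNK_SIZE)
        if not block:
            break
        file_compressed.write(compressor.compress(block))
    # flush() emits the remaining buffered data and closes the stream;
    # without it the output file is truncated and cannot be decompressed
    file_compressed.write(compressor.flush())

Unlike compressing each block independently, the compressobj keeps its state across calls, so the output is one continuous zlib stream that decompresses back to the original file without the script ever holding the whole file in memory.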
Regarding "python - reading a large file in chunks, compressing and writing in chunks", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61880710/