python - What is the most efficient way to read a large binary file in Python

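I have a large (21 GB) file that I want to read into memory and then pass to a subroutine that processes the data transparently for me. I am using Python 2.6.6 on CentOS 6.5, and upgrading the operating system or Python is not an option. Currently I am using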

f = open(image_filename, "rb")
image_file_contents=f.read()
f.close()
transparent_subroutine ( image_file_contents )

This is slow (about 15 minutes). Before I start reading the file I know how big it is, because I call os.stat( image_filename ).st_size

so I could pre-allocate some memory if that makes sense.
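
The pre-allocation idea mentioned above could look roughly like the following sketch. It assumes a Python that provides memoryview (2.7 or 3.x) and has not been checked against 2.6.6; read_into_preallocated is a hypothetical helper name, not something from the question. Whether it is actually faster than a plain f.read() depends on the system; it mainly avoids growing or copying the buffer while reading.

```python
import os


def read_into_preallocated(path):
    """Read a whole file into a bytearray sized from os.stat() (hypothetical helper)."""
    size = os.stat(path).st_size
    buf = bytearray(size)        # single allocation of the final size
    view = memoryview(buf)       # allows filling slices of buf without copying
    filled = 0
    with open(path, "rb") as f:
        while filled < size:
            # readinto() writes directly into the buffer and returns the byte count
            n = f.readinto(view[filled:])
            if not n:            # EOF earlier than expected (file shrank)
                break
            filled += n
    # Trim only in the unusual case where fewer bytes were read than expected
    return buf if filled == size else buf[:filled]
```

The result is a bytearray, so this only helps if transparent_subroutine accepts one in place of a str/bytes object.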

Thanks

Best answer

Use a generator

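def generator(file_location):
    # Yield the file in 8 KiB chunks instead of loading it all at once.
    with open(file_location, 'rb') as entry:
        # iter(callable, sentinel) keeps calling read() until it returns b'' (EOF).
        for chunk in iter(lambda: entry.read(1024 * 8), b''):
            yield chunk


go_to_streaming = generator(file_location)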
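
A generator like this only helps if the consumer can work on one chunk at a time. As a minimal sketch of such a consumer, the example below hashes a file incrementally and counts its bytes. The names read_in_chunks and file_digest_and_size, the 1 MiB chunk size, and the choice of SHA-256 are assumptions made for illustration; none of them come from the answer.

```python
import hashlib


def read_in_chunks(path, chunk_size=1024 * 1024):
    """Yield successive chunks of the file (chunk size is an assumed default)."""
    with open(path, "rb") as f:
        # Two-argument iter(): call f.read(chunk_size) until it returns b"" at EOF
        for chunk in iter(lambda: f.read(chunk_size), b""):
            yield chunk


def file_digest_and_size(path):
    """Consume the chunks one at a time; memory use stays near one chunk."""
    digest = hashlib.sha256()
    total = 0
    for chunk in read_in_chunks(path):
        digest.update(chunk)     # incremental consumer: never needs the whole file
        total += len(chunk)
    return digest.hexdigest(), total
```

If transparent_subroutine really needs the entire contents as one object, streaming does not remove the need to hold 21 GB in memory; it only changes how the data arrives.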

This question about the most efficient way to read a large binary file in Python comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/25754837/
