python - tensorflow - Does TFRecordWriter take up too much memory when writing to a file?

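I am working with a large dataset: 306,400 images need to be processed.

What I want to do is very simple, though: resize each image and write it to a .TFRecords file.

However, I run into an out-of-memory error.

I can't run the script several times, because a .TFRecord file can't be appended to, so I have to write all the data in a single run.

I tried splitting the work across several for loops, because I assumed the memory used would be freed after each loop, but it seems I was wrong.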

Then I tried using iter() to get an iterator, since for a dict object, dict.iteritems() saves memory compared with dict.items().

But no magic happened.

So now I have no idea how to solve this.

def gen_records(record_name, img_path_file, label_map):
    writer = tf.python_io.TFRecordWriter(record_name)
    classes = []

    with open(label_map, 'r') as f:
        for l in f.readlines():
            classes.append(l.split(',')[0])

    with open(img_path_file, 'r') as f:
        lines = f.readlines()
        num_images = len(lines)
    print 'total number to be written' + str(num_images)
    print 'start writing...'

    patches = []
    with open(img_path_file, 'r') as f:
        for patch in f.readlines():
            patches.append(patch[:-1])

    cnt = 0
    for patch in patches:
        cnt += 1
        # print '[' + str(cnt) + ' / ' + str(num_images) + ']' + 'writing  ' + str()
        img = tf.image.resize_images(np.array(Image.open(patch)), (224, 224), method=tf.image.ResizeMethod.BILINEAR)
        img_raw = np.array(img).tostring()
        label = classes.index(patch.split('/')[1])
        example = tf.train.Example(features=tf.train.Features(feature={
            'label': _int64_feature(int(label)),
            'image': _bytes_feature(img_raw)
        }))

        writer.write(example.SerializeToString())

    writer.close()

How can I "free" the memory used after each iteration? Or how else can I reduce memory usage?

Best answer

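The first thing to try is to load each image path on demand. Remove the lines that read all the paths into the patches list (lines 15 to 18 of the function above) and define the following function outside gen_records: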

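def generate_patches():
    with open('testfile.txt', 'r') as f:
        # Iterate over the file object itself so lines are read one at a time;
        # f.readlines() would load the whole file into a list first.
        for patch in f:
            yield patch.rstrip('\n')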

Then replace the header of the for loop with:

for patch in generate_patches():
    ...
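
For reference, here is a minimal, self-contained sketch of the same on-demand idea that can be run without TensorFlow. It is only an illustration under some assumptions: the img_path_file parameter (instead of the hard-coded 'testfile.txt') and the demo() helper with its two made-up paths are hypothetical and not part of the original answer.

```python
import os
import tempfile


def generate_patches(img_path_file):
    """Yield image paths one at a time from a text file (one path per line).

    Iterating over the file object reads the file lazily, so the full list
    of paths is never held in memory at once.
    """
    with open(img_path_file, 'r') as f:
        for line in f:
            path = line.rstrip('\n')
            if path:  # skip blank lines
                yield path


def demo():
    """Write a small hypothetical path list to disk and stream it back."""
    with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
        tmp.write('data/cat/001.jpg\ndata/dog/002.jpg\n')
        tmp_name = tmp.name
    try:
        seen = []
        # enumerate() replaces the manual cnt counter from the question.
        for cnt, patch in enumerate(generate_patches(tmp_name), start=1):
            # A real loop body would open, resize and serialize the image here.
            seen.append((cnt, patch))
        return seen
    finally:
        os.remove(tmp_name)
```

Note that this only avoids keeping the list of paths in memory. If memory still grows, another likely candidate is the tf.image.resize_images call inside the loop: in TensorFlow 1.x graph mode, each call adds new ops to the default graph, so the graph gets larger with every image. Resizing with PIL (Image.resize) before serializing might be worth trying in that case.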

Regarding "python - tensorflow - Does TFRecordWriter take up too much memory when writing to a file?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44482663/
