python - Averaging the pages of a multi-page TIFF in Python

What is the fastest and most memory-efficient way to average many frames of a 16-bit TIFF image into a numpy array?

What I have come up with so far is the code below. To my surprise, method 2 is faster than method 1.

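However, when it comes to profiling, never assume; test it! So I would like to test more options. Is Wand worth a try? I did not include it here because it still would not import after I installed ImageMagick-6.8.9-Q16 and set the MAGICK_HOME environment variable. Are there any other Python libraries for multi-page TIFFs? GDAL is probably overkill for this.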

(Edit) I have now included libtiff. Method 2 is still the fastest, and it is quite memory efficient.
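To put rough numbers on the memory difference: loading the whole stack and then averaging has to hold every frame at once, while accumulating frame by frame holds one frame plus a float64 accumulator. The sketch below only does that arithmetic. The 512 x 512 frame size is a made-up example (the question does not state the image dimensions), `peak_bytes` is a hypothetical helper, and the figures ignore temporaries and file buffers, so they are best read as lower bounds.

```python
import numpy as np


def peak_bytes(nframes, height, width,
               frame_dtype=np.uint16, acc_dtype=np.float64):
    """Rough lower bounds on array memory for the two strategies.

    Returns (stacked, streaming) in bytes. Hypothetical helper for
    illustration; it ignores temporaries and file buffers.
    """
    frame = height * width * np.dtype(frame_dtype).itemsize
    acc = height * width * np.dtype(acc_dtype).itemsize
    stacked = nframes * frame + acc   # whole stack plus the float64 mean
    streaming = frame + acc           # one frame plus the accumulator
    return stacked, streaming


# Assumed example: 1000 frames of 512 x 512 16-bit pixels.
stacked, streaming = peak_bytes(1000, 512, 512)
print("stacked:   %.1f MiB" % (stacked / 2.0**20))
print("streaming: %.1f MiB" % (streaming / 2.0**20))
```

Under these assumptions the stacked approach needs on the order of 500 MiB, while the frame-by-frame approach needs a few MiB, regardless of how many frames the file contains.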

from time import time

#import cv2  ## no multi page tiff support
import numpy as np
from PIL import Image
#from scipy.misc import imread  ## no multi page tiff support
import tifffile # http://www.lfd.uci.edu/~gohlke/code/tifffile.py.html
from libtiff import TIFF # https://code.google.com/p/pylibtiff/

fp = r"path/2/1000frames-timelapse-image.tif"

def method1(fp):
    '''
    using tifffile.py by Christoph (Version: 2014.02.05)
    (http://www.lfd.uci.edu/~gohlke/code/tifffile.py.html)
    '''
    with tifffile.TIFFfile(fp) as imfile:
        return imfile.asarray().mean(axis=0)


def method2(fp):
    'primitive peak memory friendly way with tifffile.py'
    with tifffile.TIFFfile(fp) as imfile:

        nframe, h, w = imfile.series[0]['shape']
        temp = np.zeros( (h,w), dtype=np.float64 )

        for n in range(nframe):
            curframe = imfile.asarray(n)
            temp += curframe

        return (temp / nframe)


def method3(fp):
    ' like method2 but using pillow 2.3.0 '
    im = Image.open(fp)

    w, h = im.size
    temp = np.zeros( (h,w), dtype=np.float64 )

    n = 0
    while True:
        curframe = np.array(im.getdata()).reshape(h,w)
        temp += curframe
        n += 1
        try:
            im.seek(n)
        except EOFError:  # Pillow raises EOFError when seeking past the last frame
            break

    return (temp / n)


def method4(fp):
    '''
    https://code.google.com/p/pylibtiff/
    documentation seems outdated.
    '''

    tif = TIFF.open(fp)
    header = tif.info()

    meta = dict()  # extracting meta
    for l in header.splitlines():
        # split on the first separator only, so values may contain ':' or '='
        if l.find(':') > 0:
            key, value = l.split(':', 1)
        elif l.find('=') > 0:
            key, value = l.split('=', 1)
        else:
            continue  # skip blank lines and lines without a key
        meta[key] = value

    nframes = int(meta['frames'])
    h = int(meta['ImageLength'])
    w = int(meta['ImageWidth'])

    temp = np.zeros( (h,w), dtype=np.float64 )

    for frame in tif.iter_images():
        temp += frame

    return (temp / nframes)

t0 = time()
avgimg1 = method1(fp)
print(time() - t0)
# 1.17-1.33 s

t0 = time()
avgimg2 = method2(fp)
print(time() - t0)
# 0.90-1.53 s  usually faster than method1 by 20%

t0 = time()
avgimg3 = method3(fp)
print(time() - t0)
# 21 s

t0 = time()
avgimg4 = method4(fp)
print(time() - t0)
# 1.96 - 2.21 s  # may not be accurate. I got a warning for every frame with the tiff file I tested.

np.testing.assert_allclose(avgimg1, avgimg2)
np.testing.assert_allclose(avgimg1, avgimg3)
np.testing.assert_allclose(avgimg1, avgimg4)
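
The timings above come from single runs and their ranges overlap, so repeating each call and taking the fastest run may give a steadier comparison. This is a minimal sketch and not part of the original benchmark; `best_of` is a hypothetical helper name, and it is written for Python 3.

```python
from time import perf_counter


def best_of(func, *args, repeats=5):
    """Call func(*args) `repeats` times.

    Returns (fastest wall-clock time in seconds, result of the last call).
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    best = float("inf")
    result = None
    for _ in range(repeats):
        t0 = perf_counter()
        result = func(*args)
        best = min(best, perf_counter() - t0)
    return best, result


# Stand-in workload so the block runs on its own.
elapsed, total = best_of(sum, range(100000), repeats=3)
print("fastest run: %.6f s" % elapsed)
```

With the functions from the listing it could be called as `elapsed, avgimg2 = best_of(method2, fp)`. The first call may include disk reads that later calls get from the operating system cache, which is one reason to look at the minimum rather than a single run.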

Best answer

Simple logic would have me put my money on method 1 or 3, because methods 2 and 4 contain for loops. For loops tend to make your code run slower as the input grows.

I would definitely go with method 1: neat, clear and readable...

To be really sure, I would say just test them. If you do not want to test, I would go with method 1.
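
One way to test the loop concern without a TIFF file at hand is to compare a frame-by-frame average against `stack.mean(axis=0)` on synthetic data. In the sketch below the Python loop runs only once per frame and each iteration is a single vectorized NumPy add, so the loop overhead should be small relative to the per-frame work. `streaming_mean` and the random 16-bit stack are illustrative assumptions, not the asker's data, and the check only confirms that both approaches give the same result; it does not settle which is faster on a given file.

```python
import numpy as np


def streaming_mean(frames):
    """Average an iterable of equally shaped 2-D frames, one at a time."""
    acc = None
    count = 0
    for frame in frames:
        frame = np.asarray(frame)
        if acc is None:
            # float64 accumulator avoids overflow of the 16-bit pixel values
            acc = np.zeros(frame.shape, dtype=np.float64)
        acc += frame
        count += 1
    if count == 0:
        raise ValueError("no frames to average")
    return acc / count


# Synthetic stand-in for a multi-page 16-bit TIFF: 200 frames of 64 x 64.
rng = np.random.default_rng(0)
stack = rng.integers(0, 60000, size=(200, 64, 64), dtype=np.uint16)

avg_stream = streaming_mean(iter(stack))
avg_stack = stack.mean(axis=0)
np.testing.assert_allclose(avg_stream, avg_stack)
```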

For more on averaging the pages of a multi-page TIFF in Python, see the similar question on Stack Overflow: https://stackoverflow.com/questions/23619724/
