python - h5py flipping dimensions of images

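I've been building a machine learning algorithm to recognize images, starting with creating my own h5 database. I've been following this tutorial, which has been very useful, but I keep running into one major error: when using OpenCV in the image processing section of the code, the program can't save the processed image because it keeps flipping the height and width of my images. When I run the script, I get the following error: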

Traceback (most recent call last):
   File "array+and+label+data.py", line 79, in <module>
   hdf5_file["train_img"][i, ...] = img[None]
   File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
   File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
   File "/Users/USER/miniconda2/lib/python2.7/site-packages/h5py/_hl/dataset.py", line 631, in __setitem__
   for fspace in selection.broadcast(mshape):
   File "/Users/USER/miniconda2/lib/python2.7/site-packages/h5py/_hl/selections.py", line 299, in broadcast
   raise TypeError("Can't broadcast %s -> %s" % (target_shape, count))
   TypeError: Can't broadcast (1, 240, 320, 3) -> (1, 320, 240, 3)

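My images are all supposed to be resized to 320 by 240, but you can see that the dimensions are somehow getting flipped. Researching around has shown me that this is because OpenCV and NumPy use different conventions for height and width, but I'm not sure how to reconcile this without patching my OpenCV installation. Any ideas on how to fix this? I'm relatively new to Python and all its libraries (though I know Java well)!

Thanks in advance!

Edit: adding more code for context. It is very similar to what's in the tutorial under the "Load images and save them" code example.

My array shapes: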

train_shape = (len(train_addrs), 320, 240, 3)
val_shape = (len(val_addrs), 320, 240, 3)
test_shape = (len(test_addrs), 320, 240, 3)

The code that loops over the image addresses and resizes them:

# Loop over training image addresses
for i in range(len(train_addrs)):
    # print how many images are saved every 1000 images
    if i % 1000 == 0 and i > 1:
        print ('Train data: {}/{}'.format(i, len(train_addrs)))

    # read an image and resize to (320, 240)
    # cv2 load images as BGR, convert it to RGB
    addr = train_addrs[i]
    img = cv2.imread(addr)
    img = cv2.resize(img, (320, 240), interpolation=cv2.INTER_CUBIC)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    # save the image and calculate the mean so far
    hdf5_file["train_img"][i, ...] = img[None]
    mean += img / float(len(train_labels))

Best answer

Researching around has shown me that this is because OpenCV and NumPy use different conventions for height and width

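Not quite. The only tricky thing about images is that 2D arrays/matrices are indexed with (row, col), which is the opposite of the usual Cartesian coordinates (x, y) that we might use for images. Because of this, when you specify points in OpenCV functions, it sometimes wants them in (x, y) coordinates, and similarly it wants sizes given as (w, h) rather than (h, w) as an array's shape would be. This is the case inside OpenCV's resize() function. Your dataset is laid out as (h, w) = (320, 240), and you are passing that same tuple to resize(), which reads it as (w, h). From the docs for resize():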

dsize – output image size; if it equals zero, it is computed as:
dsize = Size(round(fx*src.cols), round(fy*src.rows))
Either dsize or both fx and fy must be non-zero.

So you can see here that the number of columns (the width) comes first and the number of rows (the height) comes second.
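To make the (row, col) versus (x, y) distinction concrete, here is a minimal sketch using only NumPy. The variable names and the pixel coordinates are illustrative and do not come from the question's code.

```python
import numpy as np

# An image that is 320 pixels wide and 240 pixels tall.
height, width = 240, 320
img = np.zeros((height, width, 3), dtype=np.uint8)

# The array shape lists rows (height) first, then columns (width).
print(img.shape)  # (240, 320, 3)

# A pixel at Cartesian position x=300, y=100 is addressed as [row, col] = [y, x].
x, y = 300, 100
img[y, x] = (255, 0, 0)
print(img[100, 300].tolist())  # [255, 0, 0]
```

Indexing with img[x, y] here would raise an IndexError, because 300 is out of range for the 240 rows.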

The simple fix is to swap the (h, w) for (w, h) in the resize() call:

img = cv2.resize(img, (240, 320), interpolation=cv2.INTER_CUBIC)
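If you would rather not hard-code the swapped tuple, one option is to derive it from the dataset shape. This is only a sketch: dsize_for_dataset and resize_for_dataset are hypothetical helper names, not part of OpenCV or h5py, and it assumes the dataset is laid out as (n_images, height, width, channels) as in the question.

```python
def dsize_for_dataset(dataset_shape):
    """Return the (width, height) tuple that cv2.resize expects, given a
    dataset shaped (n_images, height, width, channels)."""
    _, height, width, _ = dataset_shape
    return (width, height)


def resize_for_dataset(img, dataset_shape):
    """Resize img so its array shape matches one slot of the dataset."""
    import cv2  # imported here so dsize_for_dataset works without OpenCV

    resized = cv2.resize(img, dsize_for_dataset(dataset_shape),
                         interpolation=cv2.INTER_CUBIC)
    # Fail early with a clear message instead of a broadcast error from h5py.
    if resized.shape != tuple(dataset_shape[1:]):
        raise ValueError("got shape %s, expected %s"
                         % (resized.shape, tuple(dataset_shape[1:])))
    return resized


# 10 stands in for len(train_addrs); the rest matches train_shape in the question.
train_shape = (10, 320, 240, 3)
print(dsize_for_dataset(train_shape))  # (240, 320)
```

In the loop from the question, the resize line would then become img = resize_for_dataset(img, train_shape), so the array shape and the resize target cannot drift apart.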

This question about "python - h5py flipping dimensions of images" comes from Stack Overflow: https://stackoverflow.com/questions/47615309/
