python - How to detect an object in images?

I need a Python solution.

I have 40-60 images (a Happy Holiday set). I need to detect an object in all of these images.

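I don't know the object's size, shape, or position in the image, and I don't have any template of the object. I know only one thing: this object is present in almost all of the images. I call it the UFO.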

Example: [image]

As the example shows, everything changes from image to image except the UFO. After detection I need to get:

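X coordinate of the top-left corner
Y coordinate of the top-left corner
width of the blue object region (I marked the region with a red rectangle in the example)
height of the blue object region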

Best answer

Once you have the image data as an array, you can do this easily and quickly with built-in numpy functions:

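import numpy as np
import PIL.Image  # a bare "import PIL" does not reliably expose PIL.Image

image = PIL.Image.open("14767594_in.png")

image_data = np.asarray(image)
image_data_blue = image_data[:,:,2]  # blue channel

median_blue = np.median(image_data_blue)

# columns and rows that contain at least one pixel bluer than the median
non_empty_columns = np.where(image_data_blue.max(axis=0)>median_blue)[0]
non_empty_rows = np.where(image_data_blue.max(axis=1)>median_blue)[0]

# (top row, bottom row, left column, right column), as plain ints
boundingBox = tuple(int(v) for v in (
    min(non_empty_rows), max(non_empty_rows),
    min(non_empty_columns), max(non_empty_columns)))

print(boundingBox)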

For the first image, this gives you:

(78, 156, 27, 166)

So the data you want is:

Top-left corner (x, y): (27, 78)
Width: 166 - 27 = 139
Height: 156 - 78 = 78

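I chose the rule "every pixel whose blue value is greater than the median of all blue values" to decide what belongs to your object. I hope this works for you; if not, try another criterion or provide some examples where it doesn't work.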
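The rule above can be wrapped into a small helper that returns the four values the question asks for directly. This is only a minimal sketch: the function name `blue_bounding_box` and the synthetic test image are made up for illustration, and it keeps the same width and height convention as the numbers above (maximum index minus minimum index).

```python
import numpy as np

def blue_bounding_box(image_data):
    """Return (x, y, width, height) of the pixels whose blue value exceeds the median.

    image_data is assumed to be an array of shape (height, width, channels)
    with the blue channel at index 2 (RGB or RGBA).
    """
    blue = image_data[:, :, 2]
    mask = blue > np.median(blue)
    cols = np.where(mask.any(axis=0))[0]
    rows = np.where(mask.any(axis=1))[0]
    if cols.size == 0:
        return None  # no pixel is above the median, e.g. a uniform image
    x, y = int(cols[0]), int(rows[0])
    # same convention as above: width = max column - min column
    return x, y, int(cols[-1]) - x, int(rows[-1]) - y

# Synthetic 10x12 RGB image: black background, blue block at rows 3..6, columns 2..8.
demo = np.zeros((10, 12, 3), dtype=np.uint8)
demo[3:7, 2:9, 2] = 255
print(blue_bounding_box(demo))  # (2, 3, 6, 3)
```

`mask.any(axis=0)` selects the same columns as comparing the per-column maximum against the median, so the result matches the snippet above.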

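Edit: I reworked my code to make it more general. Since two images with the same shape color are not general enough (as your comment indicates), I created more samples synthetically.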

def create_sample_set(mask, N=36, shape_color=(0, 0, 1., 1.)):
    # N RGBA images, initialised to opaque white
    rv = np.ones((N, mask.shape[0], mask.shape[1], 4), dtype=float)
    mask = mask.astype(bool)
    for i in range(N):
        for j in range(3):
            current_color_layer = rv[i,:,:,j]
            # random background intensity for this channel
            current_color_layer[:,:] *= np.random.random()
            # paint the shape in the requested color
            current_color_layer[mask] = np.ones((mask.sum())) * shape_color[j]
    return rv

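Here, the color of the shape is adjustable. For each of the N=36 images, a random background color is chosen. Noise could also be added to the background; that would not change the result.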

Then I read your sample image, create a shape mask from it, and use that mask to create the sample images. I plot them on a grid.

import matplotlib.pyplot as plt

# create a set of sample images and plot them
image = PIL.Image.open("14767594_in.png")
image_data = np.asarray(image)
image_data_blue = image_data[:,:,2]
median_blue = np.median(image_data_blue)
sample_images = create_sample_set(image_data_blue>median_blue)
plt.figure(1)
for i in range(36):
    plt.subplot(6,6,i+1)
    plt.imshow(sample_images[i,...])
    plt.axis("off")
plt.subplots_adjust(0,0,1,1,0,0)

[image: Blue shapes]

For another value of shape_color (a parameter of create_sample_set(...)), this might look like:

[image: Green shapes]

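Next, I determine the variability of each pixel using the standard deviation. As you said, the object is at (almost) the same position in all images. So the variability at those pixels will be low, while for the other pixels it will be significantly higher.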

# determine per-pixel variability, std() over all images
variability = sample_images.std(axis=0).sum(axis=2)

# show image of these variabilities
plt.figure(2)
plt.imshow(variability, cmap=plt.cm.gray, interpolation="nearest", origin="lower")
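To see why the standard deviation separates the object from the background, here is a small self-contained sketch on a made-up stack of images. The array sizes, the random seed, and the position of the constant patch are assumptions chosen purely for illustration; they are not taken from the real images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stack of 20 RGB images (8x8 pixels, values in [0, 1]).
# Every image gets its own random uniform background color ...
stack = rng.random((20, 1, 1, 3)) * np.ones((20, 8, 8, 3))
# ... but the same blue 3x3 patch at the same position.
stack[:, 2:5, 3:6, :] = [0.0, 0.0, 1.0]

# Standard deviation across the image axis, summed over the color channels.
variability = stack.std(axis=0).sum(axis=2)

# Pixels that vary less than average are treated as part of the object.
static = variability < variability.mean()
print(static.astype(int))
```

The printed mask contains ones only inside the 3x3 patch: the patch never changes, so its standard deviation is zero, while every background pixel changes from image to image.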

Finally, as in my first code snippet, determine the bounding box. This time I also plot it.

# determine bounding box
mean_variability = variability.mean()
non_empty_columns = np.where(variability.min(axis=0)<mean_variability)[0]
non_empty_rows = np.where(variability.min(axis=1)<mean_variability)[0]
# (top row, bottom row, left column, right column), as plain ints
boundingBox = tuple(int(v) for v in (
    min(non_empty_rows), max(non_empty_rows),
    min(non_empty_columns), max(non_empty_columns)))

# plot and print boundingBox
bb = boundingBox
plt.plot([bb[2], bb[3], bb[3], bb[2], bb[2]],
         [bb[0], bb[0],bb[1], bb[1], bb[0]],
         "r-")
plt.xlim(0,variability.shape[1])
plt.ylim(variability.shape[0],0)

print(boundingBox)
plt.show()

[image: BoundingBox and extracted shape]
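For use without any plotting, the variability step and the bounding-box step can be combined into one function that returns x, y, width, and height. This is a rough sketch under assumptions: `static_region_box` is a hypothetical name, the input is assumed to be a single array of shape (N, height, width, channels), and the test data (per-image background color plus a little Gaussian noise) is synthetic.

```python
import numpy as np

def static_region_box(images):
    """Return (x, y, width, height) of the region that barely changes across images.

    images is assumed to have shape (N, height, width, channels).
    """
    variability = images.std(axis=0).sum(axis=2)
    static = variability < variability.mean()
    cols = np.where(static.any(axis=0))[0]
    rows = np.where(static.any(axis=1))[0]
    if cols.size == 0:
        return None  # every pixel varies equally, nothing stands out
    x, y = int(cols[0]), int(rows[0])
    # width and height as max index - min index, as in the first snippet
    return x, y, int(cols[-1]) - x, int(rows[-1]) - y

rng = np.random.default_rng(1)
# 30 synthetic images of 20x25 pixels: a random background color per image
# plus weak per-pixel noise (not clipped, only the spread matters here).
images = rng.random((30, 1, 1, 3)) + 0.02 * rng.standard_normal((30, 20, 25, 3))
# the same shape at rows 5..11, columns 4..14 in every image
images[:, 5:12, 4:15, :] = [0.0, 0.0, 1.0]
print(static_region_box(images))
```

Note that thresholding at the mean variability only works when the background really does change more than the object. With real photos, a different threshold may be needed.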

That's it. I hope it is general enough this time.

Complete script for copy and paste:

import numpy as np
import PIL.Image  # a bare "import PIL" does not reliably expose PIL.Image
import matplotlib.pyplot as plt


def create_sample_set(mask, N=36, shape_color=(0, 0, 1., 1.)):
    # N RGBA images, initialised to opaque white
    rv = np.ones((N, mask.shape[0], mask.shape[1], 4), dtype=float)
    mask = mask.astype(bool)
    for i in range(N):
        for j in range(3):
            current_color_layer = rv[i,:,:,j]
            # random background intensity for this channel
            current_color_layer[:,:] *= np.random.random()
            # paint the shape in the requested color
            current_color_layer[mask] = np.ones((mask.sum())) * shape_color[j]
    return rv

# create a set of sample images and plot them
image = PIL.Image.open("14767594_in.png")
image_data = np.asarray(image)
image_data_blue = image_data[:,:,2]
median_blue = np.median(image_data_blue)
sample_images = create_sample_set(image_data_blue>median_blue)
plt.figure(1)
for i in range(36):
    plt.subplot(6,6,i+1)
    plt.imshow(sample_images[i,...])
    plt.axis("off")
plt.subplots_adjust(0,0,1,1,0,0)

# determine per-pixel variability, std() over all images
variability = sample_images.std(axis=0).sum(axis=2)

# show image of these variabilities
plt.figure(2)
plt.imshow(variability, cmap=plt.cm.gray, interpolation="nearest", origin="lower")

# determine bounding box
mean_variability = variability.mean()
non_empty_columns = np.where(variability.min(axis=0)<mean_variability)[0]
non_empty_rows = np.where(variability.min(axis=1)<mean_variability)[0]
# (top row, bottom row, left column, right column), as plain ints
boundingBox = tuple(int(v) for v in (
    min(non_empty_rows), max(non_empty_rows),
    min(non_empty_columns), max(non_empty_columns)))

# plot and print boundingBox
bb = boundingBox
plt.plot([bb[2], bb[3], bb[3], bb[2], bb[2]],
         [bb[0], bb[0],bb[1], bb[1], bb[0]],
         "r-")
plt.xlim(0,variability.shape[1])
plt.ylim(variability.shape[0],0)

print(boundingBox)
plt.show()

For "python - How to detect an object in images?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/14767594/
