python - Flipping blocks of consecutive 1s up to a certain size in a binary numpy matrix

Tags: python numpy matrix binary-data

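I am working on an image analysis project. I have converted the picture of interest (an NxM numpy array) to binary format. The 1s in the matrix mark the regions of interest. Besides genuine regions of interest, there is also noise that cannot represent features of the image. For example, in a horizontal slice of the image, I am not interested in isolated 1s or in small groups of, say, 2 up to 5 consecutive 1s. I would like a fast way to flip these (i.e. set them to 0).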

My MWE for flipping isolated 1s:

import numpy as np
img = np.random.choice([0,1],size=(1000,1000), p=[1./2,1./2])

# Take the second difference of the matrix along the horizontal axis.
# Because the matrix is binary, an isolated 1, i.e. [...0,1,0...], shows up
# as a second-difference entry equal to -2:
# [...0,1,0...] -> dx -> [...1,-1...] -> dx -> [...-2...]

ddx_img = np.diff(np.diff(img,1),1)
to_flip = np.where(ddx_img==-2) # returns a tuple (row_indices, col_indices)

# The second difference is two columns narrower than img, so entry j refers to
# column j+1 of img; add 1 to the column indices.

to_flip = (to_flip[0], to_flip[1]+1) # tuples are immutable, so build a new one

# Now flip the selected entries by adding 1 and taking mod 2
img[to_flip] = np.mod(img[to_flip]+1, 2)

This takes about 9 ms on my machine. I can afford a routine that takes up to 1 second.

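I welcome any criticism of the code (I am not a good Python programmer), as well as any ideas on how to extend this procedure efficiently so that it removes isolated islands of consecutive 1s up to a generic size S.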

Thanks in advance.

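Edit: I realized that the mod is unnecessary. At the time I wrote this, I also wanted to flip islands of 0s that were too small. The =mod... can be replaced with =0.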

Best answer

The specific case in the question

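After the edit, it seems you could use some slicing and thereby avoid making intermediate copies, which gains some performance. Here are two lines of code that produce the desired output: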

# Calculate second derivative
ddx_img = np.diff(np.diff(img,1),1)

# Take a view of img without its first and last columns and set to zero
# the entries where the second difference equals -2
img[:,1:-1][ddx_img==-2] = 0
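
To make the index alignment concrete, here is a minimal sketch on a single hand-made row. The row values are an illustration only and are not taken from the question's data; `np.diff(img, n=2, axis=1)` is used as an equivalent of the nested `np.diff` calls.

```python
import numpy as np

# One small row: isolated 1s at columns 1 and 7, a pair of 1s at columns 4-5.
img = np.array([[0, 1, 0, 0, 1, 1, 0, 1, 0]])

# Second difference along the horizontal axis; it is two columns narrower than img.
ddx_img = np.diff(img, n=2, axis=1)
print(ddx_img)   # [[-2  1  1 -1 -1  2 -2]]

cleaned = img.copy()
# Entry j of ddx_img lines up with column j+1 of img, hence the [:, 1:-1] view.
cleaned[:, 1:-1][ddx_img == -2] = 0
print(cleaned)   # [[0 0 0 0 1 1 0 0 0]]
```

Only the two isolated 1s are cleared; the pair at columns 4-5 is left alone. Because the second difference covers interior columns only, an isolated 1 sitting in the first or last column would not be detected by this approach.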

Runtime test and verification of results:

In [42]: A = np.random.choice([0,1],size=(1000,1000), p=[1./2,1./2])

In [43]: def slicing_based(A):
    ...:    img = A.copy()
    ...:    ddx_img = np.diff(np.diff(img,1),1)
    ...:    img[:,1:-1][ddx_img==-2] = 0
    ...:    return img
    ...: 
    ...: 
    ...: def original_approach(A):
    ...: 
    ...:    img = A.copy()
    ...: 
    ...:    ddx_img = np.diff(np.diff(img,1),1)
    ...:    to_flip = np.where(ddx_img==-2)
    ...: 
    ...:    temp_copy = to_flip[1].copy()
    ...:    temp_copy+=1
    ...:    to_flip = (to_flip[0],temp_copy)
    ...: 
    ...:    img[to_flip] = 0
    ...: 
    ...:    return img
    ...: 

In [44]: %timeit slicing_based(A)
100 loops, best of 3: 15.3 ms per loop

In [45]: %timeit original_approach(A)
10 loops, best of 3: 20.1 ms per loop

In [46]: np.allclose(slicing_based(A),original_approach(A))
Out[46]: True
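
One caveat worth checking before reusing either version: the `== -2` test relies on a signed integer dtype, which `np.random.choice([0,1], ...)` happens to produce. Binary images are often stored as `uint8`, and in that case the differences wrap around instead of going negative. The sketch below assumes a hypothetical `uint8` input row and shows a cast to a signed type as one possible remedy.

```python
import numpy as np

# Hypothetical binary row stored as unsigned 8-bit integers.
img_u8 = np.array([[0, 1, 0, 0, 1, 1, 0]], dtype=np.uint8)

# With uint8 the differences wrap modulo 256, so -2 shows up as 254.
ddx_wrapped = np.diff(img_u8, n=2, axis=1)
print(ddx_wrapped)   # [[254   1   1 255 255]]

# Casting to a signed type first gives the values the -2 test expects.
ddx_signed = np.diff(img_u8.astype(np.int16), n=2, axis=1)
print(ddx_signed)    # [[-2  1  1 -1 -1]]
```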

Generic case

To make the solution generic, one can use some signal processing, specifically 2D convolution, as shown below:

from scipy.signal import convolve2d

# Define kernel
K1 = np.array([[0,1,1,0]]) # Edit this for different island lengths
K2 = 1-K1

# Build masks with the same shape as img: convolve img with K1 and the inverted
# image with K2, then compare each convolved sum against the corresponding
# kernel sum. Positions where both comparisons hold satisfy both the ONES
# and the ZEROS criteria.
mask1 = convolve2d(img, K1, boundary='fill',fillvalue=0, mode='same')==K1.sum()
mask2 = convolve2d(img==0, K2, boundary='fill',fillvalue=0, mode='same')==K2.sum()

# Spread the combined mask across the kernel length and use it to set
# those positions in img to zero
K3 = np.ones((1,K1.size))
mask3 = convolve2d(mask1 & mask2, K3, boundary='fill',fillvalue=0, mode='same')>0
img_out = img*(~mask3)
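
As a sketch of how the kernel could be parameterised by island length, the helper below builds `[0, 1, ..., 1, 0]` for a given `S` and uses `mode='valid'`, so that each match index maps to a known column range without relying on how `mode='same'` centres an even-length kernel. The function name `clear_islands_of_length` is hypothetical, and this is a variation on the answer's idea rather than the answer's own code. It clears runs of exactly `S` ones that have a 0 on both sides, and it assumes the image is at least `S + 2` columns wide.

```python
import numpy as np
from scipy.signal import convolve2d

def clear_islands_of_length(img, S):
    """Zero horizontal runs of exactly S ones that have a 0 on each side."""
    img = np.asarray(img).astype(int)
    K1 = np.array([[0] + [1] * S + [0]])   # where the 1s must be
    K2 = 1 - K1                            # where the 0s must be (both ends)
    # With mode='valid', entry j of each result covers columns j .. j+S+1 of img.
    ones_ok = convolve2d(img, K1, mode='valid') == S
    zeros_ok = convolve2d(1 - img, K2, mode='valid') == 2
    match = ones_ok & zeros_ok
    out = img.copy()
    width = match.shape[1]
    # A match at j means the 1s sit at columns j+1 .. j+S; clear each offset in turn.
    for k in range(1, S + 1):
        out[:, k:k + width][match] = 0
    return out

sample = np.array([[1, 1, 0, 1, 1, 0, 1, 1],
                   [1, 1, 0, 1, 1, 0, 1, 0]])
print(clear_islands_of_length(sample, 2))
# [[1 1 0 0 0 0 1 1]
#  [1 1 0 0 0 0 1 0]]
```

Runs that touch the left or right border are left untouched here, since they lack a bounding 0 inside the image.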

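Sample input and output (rows and columns counted from 0). Note that in rows 4 and 6 the 1 in column 6, immediately to the right of the matched 0,1,1,0 pattern, is cleared as well; this appears to come from the way mode='same' centres an even-length kernel, which leaves the spreading window one column off from the matching window.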

In [250]: img
Out[250]: 
array([[0, 1, 1, 1, 0, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 0, 1, 0, 1],
       [1, 1, 0, 1, 1, 0, 1, 1],
       [1, 0, 1, 1, 1, 1, 1, 1],
       [1, 1, 0, 1, 1, 0, 1, 0],
       [1, 1, 1, 0, 1, 1, 1, 1]])

In [251]: img_out
Out[251]: 
array([[0, 1, 1, 1, 0, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 0, 1, 0, 1],
       [1, 1, 0, 0, 0, 0, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 1],
       [1, 1, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 0, 1, 1, 1, 1]])
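
The question asks for islands of any length up to S, which with the kernel approach would mean one pass per length. A different technique, run-length detection with `np.diff`, can handle all lengths up to S at once. The sketch below is not part of the original answer; the name `remove_short_runs` is hypothetical, and it assumes the image border counts as background, so runs touching the left or right edge are also eligible for removal.

```python
import numpy as np

def remove_short_runs(img, S):
    """Zero every horizontal run of 1s whose length is at most S."""
    img = np.asarray(img).astype(int)
    n_rows, n_cols = img.shape
    # Pad each row with a 0 on both sides so every run has a start and an end.
    padded = np.pad(img, ((0, 0), (1, 1)), mode='constant', constant_values=0)
    d = np.diff(padded, axis=1)            # shape (n_rows, n_cols + 1)
    rows_s, starts = np.nonzero(d == 1)    # column where a run begins
    rows_e, ends = np.nonzero(d == -1)     # column just past where a run ends
    # np.nonzero scans row by row, so the k-th start and k-th end belong together.
    short = (ends - starts) <= S
    # Mark +1 at the start and -1 past the end; a cumulative sum fills the run.
    marks = np.zeros((n_rows, n_cols + 1), dtype=int)
    marks[rows_s[short], starts[short]] = 1
    marks[rows_e[short], ends[short]] = -1
    kill = np.cumsum(marks, axis=1)[:, :n_cols] > 0
    out = img.copy()
    out[kill] = 0
    return out

row = np.array([[0, 1, 0, 0, 1, 1, 0, 1, 1, 1]])
print(remove_short_runs(row, 2))   # [[0 0 0 0 0 0 0 1 1 1]]
```

Under the same border assumption, short islands of 0s could be filled by applying the function to the complement, for example `1 - remove_short_runs(1 - img, S)`.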

Regarding "python - Flipping blocks of consecutive 1s up to a certain size in a binary numpy matrix", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31805782/
