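python - How do I threshold based on the mean of a row?

I have a 2D array. I want to set every value in each row that is greater than that row's mean to 0. Some naive code: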

new_arr = arr.copy()
for i, row in enumerate(arr):
    avg = np.mean(row)
    for j, pixel in enumerate(row):
        if pixel > avg:
            new_arr[i,j] = 0
        else:
            new_arr[i,j] = 1

This is slow, and I'm wondering whether there is some way to do it with NumPy indexing. If it were the mean of the whole matrix, I could simply do:

mask = arr > np.mean(arr)
arr[mask] = 0
arr[np.logical_not(mask)] = 1

Is there some way to do this with the per-row means, using a 1D array of means or something similar?

Edit: The suggested solution:

avg = np.mean(arr, axis=0)
mask = arr > avg
new_arr = np.zeros(arr.shape)
new_arr[mask] = 1

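actually uses the column means, which may also be useful to some people. It compares each value against the same mean as the following loop (note that the loop assigns 0 above the mean and 1 otherwise):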

new_arr = arr.copy()
for i, row in enumerate(arr.T):
    avg = np.mean(row)
    for j, pixel in enumerate(row):
        if pixel > avg:
            new_arr[j,i] = 0
        else:
            new_arr[j,i] = 1
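
The reason `axis=0` ends up comparing against column means comes down to broadcasting shapes. The following is a minimal sketch on a small, deliberately non-square array; the array values and the names `col_avg`, `row_avg`, `col_mask` and `row_compare_failed` are made up for illustration and are not part of the original question.

```python
import numpy as np

arr = np.array([[1.0, 2.0, 9.0],
                [4.0, 5.0, 6.0]])  # shape (2, 3), deliberately non-square

col_avg = arr.mean(axis=0)  # shape (3,): one mean per column
row_avg = arr.mean(axis=1)  # shape (2,): one mean per row

# A (3,) array lines up with the last axis of a (2, 3) array,
# so this compares every element with its column mean.
col_mask = arr > col_avg

# A (2,) array cannot be broadcast against a (2, 3) array, so the
# row means cannot be used directly without reshaping them first.
try:
    arr > row_avg
    row_compare_failed = False
except ValueError:
    row_compare_failed = True
```

On a square array the second comparison would not raise. It would silently compare against the wrong means instead, which is easy to miss.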

Best answer

Setup

import numpy as np

a = np.arange(25).reshape((5,5))

You can use keepdims with mean:

a[a > a.mean(1, keepdims=True)] = 0

array([[ 0,  1,  2,  0,  0],
       [ 5,  6,  7,  0,  0],
       [10, 11, 12,  0,  0],
       [15, 16, 17,  0,  0],
       [20, 21, 22,  0,  0]])
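
The assignment above only zeroes the values above the row mean. To also set the remaining values to 1, as the loop in the question does, one option is `np.where`. This is a sketch rather than part of the original answer; the names `row_avg`, `new_arr` and `expected` are chosen here for illustration.

```python
import numpy as np

arr = np.arange(25).reshape((5, 5))

# Row means with shape (5, 1), so they broadcast across each row.
row_avg = arr.mean(axis=1, keepdims=True)

# 0 where the value is above its row mean, 1 otherwise.
new_arr = np.where(arr > row_avg, 0, 1)

# Loop version from the question, used here only as a cross-check.
expected = arr.copy()
for i, row in enumerate(arr):
    avg = np.mean(row)
    for j, pixel in enumerate(row):
        expected[i, j] = 0 if pixel > avg else 1

assert np.array_equal(new_arr, expected)
```

Unlike the in-place assignment, `np.where` returns a new array and leaves `arr` untouched.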

With keepdims=True, mean gives the following result:

array([[ 2.],
       [ 7.],
       [12.],
       [17.],
       [22.]])

The benefit of doing this is described in the docs:

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
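
To make the quoted behavior concrete, the sketch below compares `keepdims=True` with adding the size-one axis by hand. Both should give a `(5, 1)` array that broadcasts against the `(5, 5)` input; the names `kept`, `manual` and `same` are illustrative only.

```python
import numpy as np

a = np.arange(25).reshape((5, 5))

# The reduced axis is kept as a dimension of size one.
kept = a.mean(axis=1, keepdims=True)

# Same values, but the size-one axis is added back manually.
manual = a.mean(axis=1)[:, np.newaxis]

same = np.array_equal(kept, manual)
```

Either form works for the row-wise comparison; `keepdims=True` simply avoids the extra indexing step.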

Regarding "python - How do I threshold based on the mean of a row?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53106728/
