Suppose I have a 2D NumPy array:
import numpy as np
x = np.random.rand(100, 100000)
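I retrieve the indices that sort each column (that is, each column is sorted independently of the others, and the indices are returned):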
idx = np.argsort(x, axis=0)
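To make the meaning of those indices concrete, here is a minimal sketch on a much smaller array (the shape, the seed, and the `_small` names are assumptions for illustration only). Gathering values with the argsort indices along axis 0 should reproduce a plain column-wise sort:

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed, chosen only for repeatability
x_small = rng.random((6, 3))          # small stand-in for the (100, 100000) array

# idx_small[r, c] is the original row of the r-th smallest value in column c
idx_small = np.argsort(x_small, axis=0)

# Gather the values column by column using those row indices
sorted_cols = np.take_along_axis(x_small, idx_small, axis=0)

# Same result as sorting each column directly
print(np.array_equal(sorted_cols, np.sort(x_small, axis=0)))
```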
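Then, for each column, I need the values at indices = [10, 20, 30, 40, 50] to occupy the first 5 rows (of that column), followed by the rest of the sorted values (the values, not the indices!).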
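A tiny worked example for a single column may help pin down the target. The six values and the two pinned rows `[1, 3]` are made up for illustration and stand in for the real `[10, 20, 30, 40, 50]`:

```python
import numpy as np

col = np.array([0.9, 0.1, 0.5, 0.7, 0.3, 0.2])   # one hypothetical column
pinned = np.array([1, 3])                         # small stand-in for the pinned rows

head = col[pinned]                        # values at the pinned rows, in the given order
rest = np.sort(np.delete(col, pinned))    # every other value, ascending
expected = np.concatenate([head, rest])

print(expected)  # [0.1 0.7 0.2 0.3 0.5 0.9]
```

The first two entries are not sorted; they simply follow the order of `pinned`. Only the tail is sorted.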
A naive approach might be:
indices = np.array([10, 20, 30, 40, 50])
out = np.empty(x.shape, dtype=x.dtype)  # the output holds float values, so reuse x's dtype
for col in range(x.shape[1]):
    # For each column, fill the first few rows with the values at `indices`
    out[:indices.shape[0], col] = x[indices, col]  # Note that we want the values, not the indices
    # Then fill the rest of the rows in this column with the remaining sorted values, excluding `indices`
    n = indices.shape[0]
    for row in range(x.shape[0]):  # walk every sorted position in this column
        src = idx[row, col]  # original row of the row-th smallest value
        if src not in indices:
            out[n, col] = x[src, col]  # Again, note that we want the value, not the index
            n += 1
Best answer
Approach #1
Here's one, based on a previous post, that doesn't need idx:
xc = x.copy()
xc[indices] = (xc.min()-np.arange(len(indices),0,-1))[:,None]
out = np.take_along_axis(x,xc.argsort(0),axis=0)
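As I read it, the trick is to overwrite the pinned rows in a copy with sentinel values that are smaller than anything in the array and increase in the order of `indices`, so one argsort puts them first. Below is a small sketch of that idea with a self-check; the array size, seed, and three pinned rows are assumptions for illustration, and it assumes the entries of `indices` are unique:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((8, 4))              # small stand-in array
indices = np.array([1, 3, 5])       # small stand-in for [10, 20, 30, 40, 50]

xc = x.copy()
# Sentinels min-3, min-2, min-1: all below every real value, ascending in `indices` order
xc[indices] = (xc.min() - np.arange(len(indices), 0, -1))[:, None]

order = xc.argsort(axis=0)          # pinned rows come first in every column
out = np.take_along_axis(x, order, axis=0)   # gather from the untouched x

# Check: head is the pinned rows, tail is the remaining values sorted per column
n = len(indices)
print(np.array_equal(out[:n], x[indices]))
print(np.array_equal(out[n:], np.sort(np.delete(x, indices, axis=0), axis=0)))
```

Because the values are gathered from `x` rather than `xc`, the sentinels never appear in the result.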
Approach #2
Another, using np.isin masking on idx:
mask = np.isin(idx, indices)
p2 = np.take_along_axis(x,idx.T[~mask.T].reshape(x.shape[1],-1).T,axis=0)
out = np.vstack((x[indices],p2))
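The reshape works because every column of `idx` contains each pinned row exactly once, so the mask removes the same number of entries from every column. Transposing first makes the boolean selection run column by column. A small sketch under the same illustrative assumptions (small shape, fixed seed, three unique pinned rows):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random((8, 4))
indices = np.array([1, 3, 5])       # small stand-in for [10, 20, 30, 40, 50]
idx = np.argsort(x, axis=0)

mask = np.isin(idx, indices)        # True where a sorted position refers to a pinned row
per_column = mask.sum(axis=0)       # expected: len(indices) in every column

# Boolean indexing flattens in row-major order, so index the transposes
# to keep each column's surviving entries together, then restore the layout.
kept = idx.T[~mask.T].reshape(x.shape[1], -1).T   # shape (8 - 3, 4)

p2 = np.take_along_axis(x, kept, axis=0)          # remaining values, sorted per column
out = np.vstack((x[indices], p2))
print(per_column, out.shape)
```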
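Approach #2, alternative
If you are repeatedly writing into out and changing everything except the rows taken from those indices, array assignment might be the one for you:
n = len(indices)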
out[:n] = x[indices]
mask = np.isin(idx, indices)
lower = np.take_along_axis(x,idx.T[~mask.T].reshape(x.shape[1],-1).T,axis=0)
out[n:] = lower
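If the output buffer is reused across many inputs, the assignment form can be wrapped in a helper. The function name `fill_rearranged`, the small shape, and the two-iteration loop are hypothetical, just to sketch the reuse pattern. Note that the buffer has a float dtype so it can hold the values:

```python
import numpy as np

def fill_rearranged(x, indices, out):
    """Hypothetical helper: write the rearranged columns of x into out in place."""
    n = len(indices)
    idx = np.argsort(x, axis=0)
    mask = np.isin(idx, indices)
    out[:n] = x[indices]                            # pinned rows first
    out[n:] = np.take_along_axis(                   # then the remaining sorted values
        x, idx.T[~mask.T].reshape(x.shape[1], -1).T, axis=0
    )
    return out

rng = np.random.default_rng(2)
indices = np.array([1, 3, 5])                       # small stand-in, assumed unique
out = np.empty((8, 4), dtype=float)                 # allocated once, reused below

for _ in range(2):                                  # pretend new data arrives each pass
    x = rng.random((8, 4))
    fill_rearranged(x, indices, out)
```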
Regarding python - Efficiently rearrange 2D NumPy array, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61936423/