Python: sorting a matrix by one column

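I have an n x 2 matrix of integers. The first column is a series of values such as 0, 1, -1, 2, -2, but they appear in the order in which they were compiled from their constituent matrices. The second column is a list of indices into another list.

I would like to sort the matrix by the second column. This is equivalent to selecting two columns of data in Excel (with the data in columns A and B) and sorting by column B. Keep in mind that the value in the first column of each row must stay with its counterpart in the second column. I have looked at solutions that use the following: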

data[np.argsort(data[:, 0])]

but this does not seem to work. The matrix in question looks like this:

matrix([[1, 1],
        [1, 3],
        [1, 7],
        ..., 
        [2, 1021],
        [2, 1040],
        [2, 1052]])

Best answer

You can use np.lexsort:

numpy.lexsort(keys, axis=-1)

Perform an indirect sort using a sequence of keys.

Given multiple sorting keys, which can be interpreted as columns in a spreadsheet, lexsort returns an array of integer indices that describes the sort order by multiple columns.

In [13]: data = np.matrix(np.arange(10)[::-1].reshape(-1,2))

In [14]: data
Out[14]: 
matrix([[9, 8],
        [7, 6],
        [5, 4],
        [3, 2],
        [1, 0]])

In [15]: temp = data.view(np.ndarray)

In [16]: np.lexsort((temp[:, 1], ))
Out[16]: array([4, 3, 2, 1, 0])

In [17]: temp[np.lexsort((temp[:, 1], ))]
Out[17]: 
array([[1, 0],
       [3, 2],
       [5, 4],
       [7, 6],
       [9, 8]])
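For a single sort key, np.argsort on a plain ndarray gives the same kind of index array. The sketch below is not part of the original answer; the sample values are made up for illustration. It also hints at why the attempt in the question misbehaves on an np.matrix: a column sliced from a matrix stays 2-D, so argsort does not return a 1-D row order (and the attempt indexes column 0 rather than column 1).

```python
import numpy as np

# Hypothetical sample data in the same n x 2 layout as the question.
data = np.matrix([[1, 7], [-1, 3], [0, 1], [2, 5]])

arr = np.asarray(data)                        # plain ndarray over the same data
order = np.argsort(arr[:, 1], kind="stable")  # 1-D row order by column 1
sorted_arr = arr[order]                       # whole rows move together

print(sorted_arr)
```

Passing kind="stable" keeps rows with equal keys in their original relative order.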

Note that if you pass more than one key to np.lexsort, the last key is the primary key. The second-to-last key is the secondary key, and so on.
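As a small illustration of the key ordering, the sketch below sorts by column 1 first and uses column 0 only to break ties. The sample values are made up for the example.

```python
import numpy as np

# Hypothetical sample data: column 0 holds values, column 1 holds indices.
data = np.array([[ 2, 7],
                 [-1, 3],
                 [ 1, 7],
                 [ 0, 1],
                 [-2, 3]])

# Keys go from least to most significant: the last key (column 1) is the
# primary key, and column 0 only decides the order among equal column-1 values.
order = np.lexsort((data[:, 0], data[:, 1]))
sorted_data = data[order]

print(sorted_data)
```

Swapping the two keys in the tuple would make column 0 the primary key instead.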

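Using np.lexsort as shown above requires a temporary array, because np.lexsort does not work on numpy matrices. Since temp = data.view(np.ndarray) creates a view rather than a copy of data, it does not require much extra memory. However,

temp[np.lexsort((temp[:, 1], ))]

is a new array, which does require more memory.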
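One way to check the view-versus-copy behaviour described above is np.shares_memory. This is only a sketch that reuses the example data from the answer; the name result is introduced here for illustration.

```python
import numpy as np

data = np.matrix(np.arange(10)[::-1].reshape(-1, 2))

temp = data.view(np.ndarray)               # ndarray view over the same buffer
result = temp[np.lexsort((temp[:, 1],))]   # index-array selection makes a copy

print(np.shares_memory(data, temp))    # the view shares data's buffer
print(np.shares_memory(data, result))  # the sorted result does not
```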

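There is also a way to sort by column in place. The idea is to view the array as a structured array with two fields. On a structured array, the sort method lets you name the fields to use as keys through its order argument, which a plain ndarray without fields cannot do: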

In [65]: data.dtype
Out[65]: dtype('int32')

In [66]: temp2 = data.ravel().view('int32, int32')

In [67]: temp2.sort(order = ['f1', 'f0'])

Note that since temp2 is a view of data, no new memory is allocated and the array is not copied. Sorting temp2 also modifies data:

In [69]: data
Out[69]: 
matrix([[1, 0],
        [3, 2],
        [5, 4],
        [7, 6],
        [9, 8]])
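The 'int32, int32' view above matches the dtype shown in this session. On a platform where the array is int64, that string would not describe one row per record. A hedged variant, assuming a C-contiguous plain ndarray and using made-up sample values, builds the field dtype from the array itself:

```python
import numpy as np

# Hypothetical sample data; a plain, C-contiguous ndarray is assumed.
data = np.array([[9, 8], [1, 6], [5, 4], [3, 2], [7, 0]])

# Two fields with the array's own dtype, so each row maps onto one record
# whether the integers are 32-bit or 64-bit.
pair = np.dtype([("f0", data.dtype), ("f1", data.dtype)])

records = data.view(pair).reshape(-1)  # 1-D view of the rows, no copy
records.sort(order=["f1", "f0"])       # in-place sort, keyed on column 1 first

print(data)  # data itself is now sorted by its second column
```

The field names f0 and f1 are chosen here to mirror the default names used in the answer.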

Source: a similar question on Stack Overflow about sorting a Python matrix by one column: https://stackoverflow.com/questions/13338110/
