python - Sorting a pandas multi-index DataFrame

Here is my data:

In [14]: new_df
Out[14]: 
action_type                           1     2    3
user_id                                           
0000110e00f7c85f550b329dc3d76210   31.0   4.0  0.0
00004931fe12d6f678f67e375b3806e3    8.0   4.0  0.0
0000c2b8660766ed74bafd48599255f0    0.0   2.0  0.0
0000d8d4ea411b05e0392be855fe9756   19.0   0.0  3.0
ffff18540a9567b455bd5645873e56d5    1.0   0.0  0.0
ffff3c8cf716efa3ae6d3ecfedb2270b   58.0   2.0  0.0
ffffa5fe57d2ef322061513bf60362ff    0.0   2.0  0.0
ffffce218e2b4af7729a4737b8702950    1.0   0.0  0.0
ffffd17a96348904fe49216ba3c7006f    1.0   0.0  0.0

[9 rows x 3 columns]

In [15]: new_df.columns
Out[15]: Int64Index([1, 2, 3], dtype='int64', name=u'action_type')

In [16]: new_df.index
Out[16]: 
Index([u'0000110e00f7c85f550b329dc3d76210',
       u'00004931fe12d6f678f67e375b3806e3',
       ...
       u'ffffa5fe57d2ef322061513bf60362ff',
       u'ffffce218e2b4af7729a4737b8702950',
       u'ffffd17a96348904fe49216ba3c7006f'],
      dtype='object', name=u'user_id', length=9)

The output I want is:

# sort by the action_type value 1

action_type                           1     2    3
user_id
ffff3c8cf716efa3ae6d3ecfedb2270b   58.0   2.0  0.0
0000110e00f7c85f550b329dc3d76210   31.0   4.0  0.0
0000d8d4ea411b05e0392be855fe9756   19.0   0.0  3.0
00004931fe12d6f678f67e375b3806e3    8.0   4.0  0.0
ffff18540a9567b455bd5645873e56d5    1.0   0.0  0.0
ffffce218e2b4af7729a4737b8702950    1.0   0.0  0.0
ffffd17a96348904fe49216ba3c7006f    1.0   0.0  0.0
0000c2b8660766ed74bafd48599255f0    0.0   2.0  0.0
ffffa5fe57d2ef322061513bf60362ff    0.0   2.0  0.0

[9 rows x 3 columns]

# sort by the action_type value 2

action_type                           1     2    3
user_id
00004931fe12d6f678f67e375b3806e3    8.0   4.0  0.0
0000110e00f7c85f550b329dc3d76210   31.0   4.0  0.0
ffff3c8cf716efa3ae6d3ecfedb2270b   58.0   2.0  0.0
0000c2b8660766ed74bafd48599255f0    0.0   2.0  0.0
ffffa5fe57d2ef322061513bf60362ff    0.0   2.0  0.0
0000d8d4ea411b05e0392be855fe9756   19.0   0.0  3.0
ffff18540a9567b455bd5645873e56d5    1.0   0.0  0.0
ffffce218e2b4af7729a4737b8702950    1.0   0.0  0.0
ffffd17a96348904fe49216ba3c7006f    1.0   0.0  0.0

[9 rows x 3 columns]

So what I want is to sort the DataFrame by action_type: by any one of 1, 2, 3, or by the sum of any combination of them (action_type 1+2, 1+3, 2+3, or 1+2+3).

In other words, the rows should be ordered by each user's value for a single action_type (1, 2, or 3), or by the per-user sum over any combination of action_types: 1 and 2, 1 and 3, 2 and 3, or all three.

For example:

For user ID 0000110e00f7c85f550b329dc3d76210, action_type 1 is 31.0, action_type 2 is 4.0, and action_type 3 is 0.0. The sum of action_type 1 and action_type 2 for this user is 31.0 + 4.0 = 35.0.

I have tried new_df.sortlevel(), but it seems to sort only by user_id, not by action_type (1, 2, 3).

How can I do this? Thanks!

Best answer

Update:

If you want to sort by columns, just use sort_values:

df.sort_values(column_names)

Example:

In [173]: df
Out[173]:
   1  2  3
0  6  3  8
1  0  8  0
2  3  8  0
3  5  2  7
4  1  2  1

Sort by column 2 in descending order:

In [174]: df.sort_values(by=2, ascending=False)
Out[174]:
   1  2  3
1  0  8  0
2  3  8  0
0  6  3  8
3  5  2  7
4  1  2  1
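
As a rough sketch of how this carries over to a frame shaped like the one in the question: the short user ids below are made up for illustration, and only four of the question's rows are reproduced. The column labels are the integers 1, 2, 3, so `by` takes integers rather than strings.

```python
import pandas as pd

# Hypothetical miniature of the question's frame: made-up short user ids,
# integer column labels 1, 2, 3 under the name "action_type".
new_df = pd.DataFrame(
    [[31.0, 4.0, 0.0], [8.0, 4.0, 0.0], [0.0, 2.0, 0.0], [58.0, 2.0, 0.0]],
    index=pd.Index(["u1", "u2", "u3", "u4"], name="user_id"),
    columns=pd.Index([1, 2, 3], name="action_type"),
)

# Order users by action_type 1, largest value first.
by_type_1 = new_df.sort_values(by=1, ascending=False)

# Order by action_type 2 first and break ties with action_type 1.
by_type_2_then_1 = new_df.sort_values(by=[2, 1], ascending=False)

print(by_type_1)
print(by_type_2_then_1)
```

The `sortlevel()` call from the question orders rows by their index labels, which is why it sorted by user_id; `sort_values` is the call that orders rows by column values.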

Sort by the sum of columns 2 and 3 in descending order:

In [177]: df.assign(sum=df.loc[:,[2,3]].sum(axis=1)).sort_values('sum', ascending=False)
Out[177]:
   1  2  3  sum
0  6  3  8   11
3  5  2  7    9
1  0  8  0    8
2  3  8  0    8
4  1  2  1    3
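
If the extra `sum` column is not wanted in the result, one possible variation is to compute the row totals separately and reorder the rows by position. The helper name `sort_by_column_sum` is hypothetical and not part of pandas; this is only a sketch that assumes the selected columns are numeric.

```python
import pandas as pd


def sort_by_column_sum(frame, columns, ascending=False):
    """Hypothetical helper: order rows by the row-wise sum of `columns`."""
    totals = frame[list(columns)].sum(axis=1).to_numpy()
    # Negating the totals gives a descending order while keeping the sort
    # stable, so rows with equal totals stay in their original order.
    keys = totals if ascending else -totals
    order = keys.argsort(kind="stable")
    return frame.iloc[order]


# Same values as the example frame above.
df = pd.DataFrame({1: [6, 0, 3, 5, 1], 2: [3, 8, 8, 2, 2], 3: [8, 0, 0, 7, 1]})

result = sort_by_column_sum(df, [2, 3])
print(result)
```

The rows come out in the same order as with `assign(...).sort_values('sum', ...)` above, but the frame keeps only its original columns.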

Old answer:

If I understood you correctly, you can do it this way:

In [107]: df
Out[107]:
   a  b  c
0  9  1  4
1  0  5  7
2  5  9  8
3  3  9  7
4  1  2  5

In [108]: df.assign(sum=df.sum(axis=1)).sort_values('sum', ascending=True)
Out[108]:
   a  b  c  sum
4  1  2  5    8
1  0  5  7   12
0  9  1  4   14
3  3  9  7   19
2  5  9  8   22
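
Since the question asks about every combination of columns, a sketch along the following lines could build one ranking per combination. The dictionary name `rankings` is made up for illustration, and the frame reuses the values from the old answer above.

```python
from itertools import combinations

import pandas as pd

# Same values as the frame in the old answer.
df = pd.DataFrame({"a": [9, 0, 5, 3, 1], "b": [1, 5, 9, 9, 2], "c": [4, 7, 8, 7, 5]})

# Map each column combination, e.g. ("a", "b"), to the row totals for that
# combination sorted from largest to smallest.
rankings = {}
labels = df.columns.tolist()
for size in range(1, len(labels) + 1):
    for combo in combinations(labels, size):
        totals = df[list(combo)].sum(axis=1)
        rankings[combo] = totals.sort_values(ascending=False)

# Reorder the full frame by one of the rankings, here b + c.
by_b_plus_c = df.loc[rankings[("b", "c")].index]
print(by_b_plus_c)
```

Reordering with `df.loc[...]` relies on the index labels being unique, which holds for this frame and for the user_id index in the question.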

A similar question about sorting a pandas multi-index DataFrame can be found on Stack Overflow: https://stackoverflow.com/questions/37086228/
