python - How to select the top 3 rows of each group in pandas?

I have a pandas DataFrame like this:

    id   prob
0    1   0.5   
1    1   0.6
2    1   0.4
3    1   0.2
4    2   0.3
6    2   0.5
...

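I want to group it by "id", sort each group by "prob" in descending order, and get the top 3 probabilities per group. Note that some groups contain fewer than 3 rows. In the end I want a 2D array like: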

[[1, [0.6, 0.5, 0.4]], [2, [0.5, 0.3]]...]

How can I do this with pandas? Thanks!

Best answer

Use sort_values, groupby, and head:

df.sort_values(by=['id','prob'], ascending=[True,False]).groupby('id').head(3).values

Output:

array([[ 1. ,  0.6],
       [ 1. ,  0.5],
       [ 1. ,  0.4],
       [ 2. ,  0.5],
       [ 2. ,  0.3]])
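
For a version that can be run end to end, here is a minimal sketch. It rebuilds only the six rows shown in the question (the rows elided by "..." are not reconstructed, and a default 0..5 index is assumed instead of the index in the question), then applies the same sort_values / groupby / head chain. The name top3 is only for illustration.

```python
import pandas as pd

# Assumption: only the six rows visible in the question, with a default index.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "prob": [0.5, 0.6, 0.4, 0.2, 0.3, 0.5],
})

# Sort by id ascending and prob descending, then keep at most 3 rows per id.
# head(3) simply returns the whole group when it has fewer than 3 rows.
top3 = (
    df.sort_values(by=["id", "prob"], ascending=[True, False])
      .groupby("id")
      .head(3)
)

print(top3)
```

Because head works on row position within each group, the sort has to happen before the groupby; the original row labels are kept in the result.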

Following @COLDSPEED's lead:

df.sort_values(by=['id','prob'], ascending=[True,False])\
  .groupby('id').agg(lambda x: x.head(3).tolist())\
  .reset_index().values.tolist()

Output:

[[1, [0.6, 0.5, 0.4]], [2, [0.5, 0.3]]]
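
As an alternative that is not part of the original answer, the same nested list can probably be built without sorting the whole frame first, by iterating over the groups and calling Series.nlargest on each one. This is a sketch under the same assumption as before (only the six rows shown in the question); the name result is illustrative.

```python
import pandas as pd

# Assumption: only the six rows visible in the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "prob": [0.5, 0.6, 0.4, 0.2, 0.3, 0.5],
})

# Iterating a grouped column yields (group key, Series) pairs in key order.
# nlargest(3) returns up to three values, largest first.
result = [
    [int(key), probs.nlargest(3).tolist()]
    for key, probs in df.groupby("id")["prob"]
]

print(result)  # [[1, [0.6, 0.5, 0.4]], [2, [0.5, 0.3]]]
```

The explicit loop trades some speed on very large frames for readability; the sort_values / agg chain above stays within pandas operations.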

Regarding python - How to select the top 3 rows of each group in pandas?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45992871/
