python - Find rank 1 and rank 2 of grouped values in a DF

Given the following pandas DataFrame:

>>> import pandas
>>> df1 = pandas.DataFrame({"dish"     : ["fish", "chicken", "fish", "chicken", "chicken", "veg","veg"],
...                         "location" : ["central", "central", "north", "north", "south", "central", "north"],
...                         "sales" : [1,3,5,2,4,2,2]})
>>> # Per-dish sales total, aligned to df1's rows (select "sales" so the text column is not summed)
>>> total_sales = df1.groupby("dish")["sales"].transform("sum")
>>> df1["proportion_sales"] = df1["sales"] / total_sales
>>> df1
      dish location  sales  proportion_sales
0     fish  central      1          0.166667
1  chicken  central      3          0.333333
2     fish    north      5          0.833333
3  chicken    north      2          0.222222
4  chicken    south      4          0.444444
5      veg  central      2          0.500000
6      veg    north      2          0.500000

I want to find the rank 1 and rank 2 dish in each location, ranked by proportion_sales. For example, in central, veg is rank 1 and fish is rank 3.

How can I add a dish_rank_in_location column to the df like this? This is what I have so far:

      dish location  sales  proportion_sales  rank
0     fish  central      1          0.166667     1
1  chicken  central      3          0.333333     1
2     fish    north      5          0.833333     1
3  chicken    north      2          0.222222     1
4  chicken    south      4          0.444444     1
5      veg  central      2          0.500000     1
6      veg    north      2          0.500000     1

Expected output:

      dish location  sales  proportion_sales  dish_rank_in_location
0     fish  central      1          0.166667     3
1  chicken  central      3          0.333333     2
2     fish    north      5          0.833333     1
3  chicken    north      2          0.222222     3
4  chicken    south      4          0.444444     1
5      veg  central      2          0.500000     1
6      veg    north      2          0.500000     2

Best answer

Use groupby + rank with ascending=False, so that the largest proportion_sales within each location gets rank 1.

df1['dish_rank_in_location'] = df1.groupby('location')\
               .proportion_sales.rank(method='dense', ascending=False)

df1

      dish location  sales  proportion_sales  dish_rank_in_location
0     fish  central      1          0.166667                    3.0
1  chicken  central      3          0.333333                    2.0
2     fish    north      5          0.833333                    1.0
3  chicken    north      2          0.222222                    3.0
4  chicken    south      4          0.444444                    1.0
5      veg  central      2          0.500000                    1.0
6      veg    north      2          0.500000                    2.0
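
The answer uses method='dense', and the sample data has no ties inside any location, so the effect of that choice is not visible above. As a minimal sketch on made-up data (the `scores` frame and its `value` column are hypothetical and not part of the question), here is how three of the `method` options treat a tie within one group:

```python
import pandas as pd

# Hypothetical data: one group containing a tie (two rows at 0.5).
scores = pd.DataFrame({
    "location": ["north", "north", "north", "north"],
    "value": [0.5, 0.5, 0.2, 0.9],
})

grouped = scores.groupby("location")["value"]

# dense: tied rows share a rank and the next rank follows without a gap.
scores["dense"] = grouped.rank(method="dense", ascending=False)
# min: tied rows share the lowest rank of the tie and the next rank skips ahead.
scores["min"] = grouped.rank(method="min", ascending=False)
# first: every row gets a distinct rank, even within a tie.
scores["first"] = grouped.rank(method="first", ascending=False)

print(scores)
```

With 'dense', the row after the tie gets rank 3; with 'min', it gets rank 4. Which one is right depends on whether "rank 2" should mean the second-highest distinct value or the second row in line.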

If you need the ranks as integers, you can always cast them:

df1['dish_rank_in_location'].astype(int)

0    3
1    2
2    1
3    3
4    1
5    1
6    2
Name: dish_rank_in_location, dtype: int64

astype returns a new Series, so assign the result back to the column to keep the integer ranks.
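
Putting the steps together, the following is one possible end-to-end sketch, not necessarily how the original poster finished the task. It rebuilds the question's data, assigns the integer ranks back, and then keeps only the rank 1 and rank 2 dishes per location. The `top_two` name and the final filter are illustrative additions that do not appear in the answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken", "veg", "veg"],
    "location": ["central", "central", "north", "north", "south", "central", "north"],
    "sales": [1, 3, 5, 2, 4, 2, 2],
})

# Share of each dish's total sales that comes from each location.
df1["proportion_sales"] = df1["sales"] / df1.groupby("dish")["sales"].transform("sum")

# Rank within each location (largest proportion first), cast to int,
# and assign the result back to the column.
df1["dish_rank_in_location"] = (
    df1.groupby("location")["proportion_sales"]
       .rank(method="dense", ascending=False)
       .astype(int)
)

# Illustrative extra step: keep only the rank 1 and rank 2 rows per location.
top_two = (
    df1[df1["dish_rank_in_location"] <= 2]
    .sort_values(["location", "dish_rank_in_location"])
)

print(top_two[["location", "dish", "dish_rank_in_location"]])
```

The cast to int is safe here because no proportion_sales value is missing; if the ranked column could contain NaN, the ranks would be NaN as well and astype(int) would raise.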

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/47828067/
