Question
Given the following DataFrame df:
import random
import pandas as pd
random.seed(999)
sz = 50
qty = {'one': 1, 'two': 2, 'three': 3}
thing = (random.choice(['one', 'two', 'three']) for _ in range(sz))
order = (random.choice(['ascending', 'descending']) for _ in range(sz))
value = (random.randint(0, 100) for _ in range(sz))
df = pd.DataFrame({'thing': thing, 'order': order, 'value': value})
...I would like to:
- group by thing
- split by order
- sort the values of each thing according to its order
- pick the top qty rows for that thing
Expected result
thing order value
0 one ascending 17
1 one descending 1
2 two ascending 28
3 two ascending 30
4 two descending 13
5 two descending 38
6 three ascending 6
7 three ascending 27
8 three ascending 35
9 three descending 4
10 three descending 5
11 three descending 6
Hand-coded way to get the result:
one_a = df[(df.thing == 'one') & (df.order == 'ascending')].reset_index(drop=True).sort_values('value', ascending=True).head(qty['one'])
one_d = df[(df.thing == 'one') & (df.order == 'descending')].reset_index(drop=True).sort_values('value', ascending=False).head(qty['one'])
two_a = df[(df.thing == 'two') & (df.order == 'ascending')].reset_index(drop=True).sort_values('value', ascending=True).head(qty['two'])
two_d = df[(df.thing == 'two') & (df.order == 'descending')].reset_index(drop=True).sort_values('value', ascending=False).head(qty['two'])
three_a = df[(df.thing == 'three') & (df.order == 'ascending')].reset_index(drop=True).sort_values('value', ascending=True).head(qty['three'])
three_d = df[(df.thing == 'three') & (df.order == 'descending')].reset_index(drop=True).sort_values('value', ascending=False).head(qty['three'])
print(pd.concat([one_a, one_d, two_a, two_d, three_a, three_d], ignore_index=True))
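The six hand-written lines differ only in the filter values and the sort direction, so the same selection can be written as a loop. The following is a minimal sketch on a small made-up frame; the data and the helper name pick_top are illustrative assumptions, not part of the question.

```python
import pandas as pd

# Small made-up frame, for illustration only (not the question's random data).
df = pd.DataFrame({
    'thing': ['one', 'one', 'one', 'two', 'two', 'two', 'two', 'two', 'two'],
    'order': ['ascending', 'descending', 'ascending',
              'ascending', 'ascending', 'ascending',
              'descending', 'descending', 'descending'],
    'value': [5, 7, 3, 9, 2, 4, 1, 8, 6],
})
qty = {'one': 1, 'two': 2}


def pick_top(frame, qty):
    """Hypothetical helper: one filter/sort/head per (thing, order) pair."""
    parts = []
    for thing, n in qty.items():
        for order in ('ascending', 'descending'):
            mask = (frame['thing'] == thing) & (frame['order'] == order)
            # sort_values expects a real bool for `ascending`, not a string.
            part = (frame[mask]
                    .sort_values('value', ascending=(order == 'ascending'))
                    .head(n))
            parts.append(part)
    return pd.concat(parts, ignore_index=True)


result = pick_top(df, qty)
print(result)
```

This still scans the frame once per (thing, order) pair, so it mainly removes the repetition rather than the cost.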
Question
Is it possible to achieve this using groupby, sort_values, and set_index?
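Best answer
One difficulty is selecting the ascending and descending rows separately. We can work around this by negating the values of the descending rows: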
df.loc[df.order=='descending','value']*= -1
s=(df.sort_values('value').groupby(['thing','order'])
.cumcount()
.reindex(df.index)
)
out = df[s<df['thing'].map(qty)].sort_values(['thing','order'])
out.loc[out.order=='descending', 'value'] *= -1
Output:
thing order value
14 one ascending 17
27 one descending 1
13 three ascending 6
17 three ascending 35
38 three ascending 27
4 three descending 5
23 three descending 4
37 three descending 6
21 two ascending 28
42 two ascending 30
6 two descending 38
9 two descending 13
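The negation trick can also be applied without modifying df in place, by building a separate signed sort key. The sketch below is one possible variant, not the answerer's code; the small frame and the names key and rank are assumptions made for illustration.

```python
import pandas as pd

# Small made-up frame, for illustration only (not the question's random data).
df = pd.DataFrame({
    'thing': ['one', 'one', 'one', 'two', 'two', 'two', 'two', 'two', 'two'],
    'order': ['ascending', 'descending', 'ascending',
              'ascending', 'ascending', 'ascending',
              'descending', 'descending', 'descending'],
    'value': [5, 7, 3, 9, 2, 4, 1, 8, 6],
})
qty = {'one': 1, 'two': 2}

# Signed sort key: keep 'ascending' values, negate 'descending' ones,
# so a single ascending sort serves both directions.
key = df['value'].where(df['order'] == 'ascending', -df['value'])

# Position of each row within its (thing, order) group, in key order,
# realigned to the original row order.
rank = (df.assign(key=key)
          .sort_values('key')
          .groupby(['thing', 'order'])
          .cumcount()
          .reindex(df.index))

# Keep rows whose position is below the quota for their thing.
out = df[rank < df['thing'].map(qty)].sort_values(['thing', 'order'])
print(out)
```

Because the key lives in a temporary column, the value column of df keeps its original signs and nothing has to be multiplied back afterwards.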
For more on pandas - Group by, split and pick top rows in dataframe, see the similar question on Stack Overflow: https://stackoverflow.com/questions/64864630/