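For each value of one column ('date' in my example below), I want to rank the rows by the values in another column ('value' in my example).
My code works, but I would like to know whether this can be done without a Python for loop.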
import pandas as pd

data = {'code': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB', 'BBB', 'CCC', 'CCC', 'CCC'],
        'date': ['2001-01-01', '2001-01-02', '2001-01-03', '2001-01-01', '2001-01-02', '2001-01-03', '2001-01-01', '2001-01-02', '2001-01-03'],
        'value': [32, 23, 34, 23, 34, 12, 28, 39, 40]}
df = pd.DataFrame(data)
print(df)

parts = []
for date in df['date'].unique():
    # work on a copy so the assignment below does not trigger a chained-assignment warning
    sub = df[df['date'] == date].copy()
    # invert the ascending rank so the largest value on each date gets rank 1
    sub['rank'] = len(sub) - sub['value'].rank() + 1
    parts.append(sub[['code', 'date', 'rank']])
# DataFrame.append was removed in pandas 2.0, so the pieces are combined with pd.concat
result = pd.concat(parts)
df2 = pd.merge(df, result, on=['code', 'date'])
print(df2.sort_values(['date', 'code']))  # within each date, rows are ranked by value
  code        date  value  rank
0  AAA  2001-01-01     32   1.0
3  BBB  2001-01-01     23   3.0
6  CCC  2001-01-01     28   2.0
1  AAA  2001-01-02     23   3.0
4  BBB  2001-01-02     34   2.0
7  CCC  2001-01-02     39   1.0
2  AAA  2001-01-03     34   2.0
5  BBB  2001-01-03     12   3.0
8  CCC  2001-01-03     40   1.0
Can I get the same result without iterating in a Python for loop?
Best answer
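Use groupby on 'date' and then rank within each group. Passing ascending=False gives rank 1 to the largest value on each date, which matches the output in the question:
df['rank'] = df.groupby('date')['value'].rank(ascending=False)
df
Out[491]:
  code        date  value  rank
0  AAA  2001-01-01     32   1.0
1  AAA  2001-01-02     23   3.0
2  AAA  2001-01-03     34   2.0
3  BBB  2001-01-01     23   3.0
4  BBB  2001-01-02     34   2.0
5  BBB  2001-01-03     12   3.0
6  CCC  2001-01-01     28   2.0
7  CCC  2001-01-02     39   1.0
8  CCC  2001-01-03     40   1.0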
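
The grouping column and the sort direction decide which rows compete for rank 1, and tied values raise the further question of how ranks are shared. Below is a minimal sketch that wraps the groupby rank in a small function. The helper name `rank_within_group` and its defaults are assumptions made for illustration, not part of pandas. The `method` argument is passed straight to pandas' own `rank`, where 'average' is the default and 'min', 'max', 'first' and 'dense' are the alternatives.

```python
import pandas as pd


def rank_within_group(df, group_col="date", value_col="value", method="average"):
    """Return a copy of df with a 'rank' column (hypothetical helper).

    Rank 1 goes to the largest value inside each group of group_col.
    """
    out = df.copy()  # leave the caller's frame untouched
    out["rank"] = out.groupby(group_col)[value_col].rank(
        ascending=False, method=method
    )
    return out


data = {
    "code": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "CCC", "CCC", "CCC"],
    "date": ["2001-01-01", "2001-01-02", "2001-01-03"] * 3,
    "value": [32, 23, 34, 23, 34, 12, 28, 39, 40],
}
df = pd.DataFrame(data)

ranked = rank_within_group(df)
print(ranked.sort_values(["date", "code"]))
```

With the default 'average' method, tied values share the mean of the ranks they span, which is also what the loop in the question produces. Passing method="dense" or method="first" is worth considering if whole-number ranks are needed.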
Regarding "python - pandas dataframe, rank by values in another column without using a Python for loop", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/66809926/