python - How to rearrange the rows of a DataFrame based on the values of a specific column

Tags: python pandas indexing

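I am working with a DataFrame that has a column named season. Each season contains many matches, and the seasons need to be reordered. The current season order is 2017, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2018, 2019.

I want to move all rows for the 2017 season so that they come after the rows for the 2016 season.

The data looks like this (id has been renamed to match_id; only a few columns are shown here, there are 18 in total):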

    match_id    season  city        winner
0   1           2017    Hyderabad   Sunrisers Hyderabad
1   2           2017    Pune        Rising Pune Supergiant
2   3           2017    Rajkot      Kolkata Knight Riders   
3   4           2017    Indore      Kings XI Punjab
4   5           2017    Bangalore   Royal Challengers Bangalore 

I tried this:

df.set_index('season')

and then:

df.reindex([2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015,2016, 2017, 2018, 2019])

But the output is all NaN:

        match_id    season  city    winner
2008    NaN         NaN     NaN     NaN 
2009    NaN         NaN     NaN     NaN
2010    NaN         NaN     NaN     NaN
2011    NaN         NaN     NaN     NaN
2012    NaN         NaN     NaN     NaN
2013    NaN         NaN     NaN     NaN
2014    NaN         NaN     NaN     NaN
2015    NaN         NaN     NaN     NaN
2016    NaN         NaN     NaN     NaN
2017    NaN         NaN     NaN     NaN
2018    NaN         NaN     NaN     NaN
2019    NaN         NaN     NaN     NaN

Best answer

The first idea is to convert season to an ordered categorical whose category order is given by a list, and then sort:

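import pandas as pd

# Desired season order
L = [2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]
df['season'] = pd.Categorical(df['season'], ordered=True, categories=L)

# Sort by season (in the order of L), then by match_id within each season
df = df.sort_values(['season', 'match_id'], ignore_index=True)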
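To see the categorical approach end to end, here is a minimal sketch on a small made-up frame. Only the column names come from the question. The rows and the names `order` and `result` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "match_id": [1, 2, 3, 4, 5, 6],
    "season": [2017, 2017, 2008, 2008, 2016, 2018],
    "city": ["Hyderabad", "Pune", "Bangalore", "Chandigarh", "Delhi", "Mumbai"],
})

order = [2008, 2009, 2010, 2011, 2012, 2013,
         2014, 2015, 2016, 2017, 2018, 2019]

# Seasons become an ordered categorical; any value missing from `order`
# would turn into NaN, so the list should cover every season in the data.
df["season"] = pd.Categorical(df["season"], categories=order, ordered=True)

# Sort by category order first, then by match_id within each season.
result = df.sort_values(["season", "match_id"], ignore_index=True)
print(result)
```

Note that after this step the season column has a categorical dtype rather than an integer dtype, which may matter for later arithmetic or merges.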

Alternatively, you can build a dictionary with enumerate and use it for mapping in the key parameter:

L = [2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]

# Map each season to its position in L and sort by that position
d = {v: k for k, v in enumerate(L)}
df = df.sort_values('season', key=lambda x: x.map(d), ignore_index=True)
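The key-based variant can be sketched the same way on made-up data. The sample rows and the names `order`, `position`, and `result` are assumptions. The `kind="mergesort"` argument is an addition not present in the answer above: it requests a stable sort so that rows within a season keep their original relative order.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "match_id": [1, 2, 3, 4, 5, 6],
    "season": [2017, 2017, 2008, 2008, 2016, 2018],
})

order = [2008, 2009, 2010, 2011, 2012, 2013,
         2014, 2015, 2016, 2017, 2018, 2019]

# Map each season to its position in `order`: {2008: 0, 2009: 1, ...}.
position = {season: i for i, season in enumerate(order)}

# `key` receives the season column as a Series and returns the sort keys.
# kind="mergesort" (an assumption added here) keeps ties in input order.
result = df.sort_values(
    "season",
    key=lambda s: s.map(position),
    kind="mergesort",
    ignore_index=True,
)
print(result)
```

Unlike the categorical approach, this leaves the season column as plain integers, since the mapping is only used to compute the sort order.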

Regarding "python - How to rearrange the rows of a DataFrame based on the values of a specific column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69850924/
