I have a pandas DataFrame containing NFL quarterback data from the 2015-2016 through 2019-2020 seasons. The DataFrame looks like this:
Player         Season End Year  YPG    TD
Tom Brady      2019             322.6  25
Tom Brady      2018             308.1  26
Tom Brady      2017             295.7  24
Tom Brady      2016             308.7  28
Aaron Rodgers  2019             360.4  30
Aaron Rodgers  2018             358.8  33
Aaron Rodgers  2017             357.9  35
Aaron Rodgers  2016             355.2  32
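I would like to create new columns containing the data for a year I choose and for the three years before it. For example, if the selected year is 2019, the resulting DataFrame would be (SY stands for the selected year):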
Player         Season End Year  YPG SY  YPG SY-1  YPG SY-2  YPG SY-3  TD
Tom Brady      2019             322.6   308.1     295.7     308.7     25
Aaron Rodgers  2019             360.4   358.8     357.9     355.2     30
This is how I tried to do it:
NFL_Data.loc[NFL_Data['Season End Year'] == (NFL_Data['SY']), 'YPG SY'] = NFL_Data['YPG']
NFL_Data.loc[NFL_Data['Season End Year'] == (NFL_Data['SY']-1), 'YPG SY-1'] = NFL_Data['YPG']
NFL_Data.loc[NFL_Data['Season End Year'] == (NFL_Data['SY']-2), 'YPG SY-2'] = NFL_Data['YPG']
NFL_Data.loc[NFL_Data['Season End Year'] == (NFL_Data['SY']-3), 'YPG SY-3'] = NFL_Data['YPG']
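However, when I run the code above, the columns are not populated correctly; most rows are 0. Am I approaching the problem the right way, or is there a better way to solve it?
(Edited to include the TD column)
Best answer
The first step is to pivot the DataFrame: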
pivoted = df.pivot_table(index='Player', columns='Season End Year', values='YPG')
which yields
Season End Year   2016   2017   2018   2019
Player
Aaron Rodgers    355.2  357.9  358.8  360.4
Tom Brady        308.7  295.7  308.1  322.6
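To reproduce the pivot, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample rows in the question, and the name df is assumed to stand in for the asker's NFL_Data.

```python
import pandas as pd

# Sample rows copied from the question; `df` stands in for the asker's NFL_Data.
df = pd.DataFrame({
    "Player": ["Tom Brady"] * 4 + ["Aaron Rodgers"] * 4,
    "Season End Year": [2019, 2018, 2017, 2016] * 2,
    "YPG": [322.6, 308.1, 295.7, 308.7, 360.4, 358.8, 357.9, 355.2],
    "TD": [25, 26, 24, 28, 30, 33, 35, 32],
})

# One row per player, one column per season; each cell holds that season's YPG.
# pivot_table aggregates with the mean by default, which changes nothing here
# because every (Player, Season End Year) pair occurs exactly once.
pivoted = df.pivot_table(index="Player", columns="Season End Year", values="YPG")
print(pivoted)
```

If a player could appear more than once for the same season, the default mean would silently average those rows, so it may be worth checking for duplicates first.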
Then, with year set to the selected season (for example, year = 2019), you can select:
pivoted.loc[:, range(year, year-3, -1)]
Season End Year   2019   2018   2017
Player
Aaron Rodgers    360.4  358.8  357.9
Tom Brady        322.6  308.1  295.7
Or, as Quang suggested:
pivoted.loc[:, year:year-3:-1]
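One thing worth checking is that the two selections are not quite equivalent: range(year, year-3, -1) stops before year-3, while a label slice in .loc includes both endpoints, so the slice also returns the year-3 column. Since the question asks for SY through SY-3, four seasons are needed. The sketch below shows the difference and then builds the layout from the question. The helper name add_lookback_columns and its n_back parameter are hypothetical, and the sample data is rebuilt from the question.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Player": ["Tom Brady"] * 4 + ["Aaron Rodgers"] * 4,
    "Season End Year": [2019, 2018, 2017, 2016] * 2,
    "YPG": [322.6, 308.1, 295.7, 308.7, 360.4, 358.8, 357.9, 355.2],
    "TD": [25, 26, 24, 28, 30, 33, 35, 32],
})
pivoted = df.pivot_table(index="Player", columns="Season End Year", values="YPG")

year = 2019
# A range excludes its stop value: 2019, 2018, 2017.
by_range = pivoted.loc[:, list(range(year, year - 3, -1))]
# A label slice includes both endpoints: 2019, 2018, 2017, 2016.
by_slice = pivoted.loc[:, year:year - 3:-1]


def add_lookback_columns(df, year, n_back=3):
    """Hypothetical helper: YPG for `year` and the `n_back` seasons before it."""
    wide = df.pivot_table(index="Player", columns="Season End Year", values="YPG")
    years = list(range(year, year - n_back - 1, -1))
    # reindex (rather than .loc) leaves NaN for seasons absent from the data.
    wide = wide.reindex(columns=years)
    wide.columns = ["YPG SY" if y == year else f"YPG SY-{year - y}" for y in years]
    # The rows for the selected year supply Season End Year and TD.
    selected = df.loc[df["Season End Year"] == year, ["Player", "Season End Year", "TD"]]
    return selected.merge(wide.reset_index(), on="Player")


result = add_lookback_columns(df, 2019)
print(result)
```

Using reindex here is a design choice: a player with no data for one of the earlier seasons gets NaN in that column instead of raising a KeyError.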
This question, "python-3.x - Creating new DataFrame columns based on year", comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/59146734/