I have the following df:
Symbol Time Close Sessions DR ADR
0 AMD 2019-11-18 39.88 387 1.39 NaN
1 AMD 2019-11-19 41.29 388 2.10 NaN
2 AMD 2019-11-20 40.98 389 1.68 NaN
3 AMD 2019-11-21 39.52 390 2.07 NaN
4 AMD 2019-11-22 39.15 391 1.70 NaN
... ... ... ... ... ... ...
1600 UPST 2021-09-03 247.29 1597 14.13 NaN
1601 UPST 2021-09-07 262.70 1598 21.90 NaN
1602 UPST 2021-09-08 274.33 1599 15.64 NaN
1603 UPST 2021-09-09 289.60 1600 29.16 NaN
1604 UPST 2021-09-10 270.46 1605 25.98 NaN
I want the ADR column to hold the 20-day rolling mean of DR.
My code is:
df_day['ADR'] = df_day.groupby('Sessions')['DR'].rolling(20).mean().reset_index(0,drop=True)
It returns:
Symbol Time Close Sessions DR ADR
0 AMD 2019-11-18 39.88 NaN NaN NaN
1 AMD 2019-11-19 41.29 NaN NaN NaN
2 AMD 2019-11-20 40.98 NaN NaN NaN
3 AMD 2019-11-21 39.52 NaN NaN NaN
4 AMD 2019-11-22 39.15 NaN NaN NaN
... ... ... ... ... .. ...
1600 UPST 2021-09-03 247.29 NaN NaN NaN
1601 UPST 2021-09-07 262.70 NaN NaN NaN
1602 UPST 2021-09-08 274.33 NaN NaN NaN
1603 UPST 2021-09-09 289.60 NaN NaN NaN
1604 UPST 2021-09-10 270.46 NaN NaN NaN
Best answer
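The problem is that you are grouping by the Sessions column. Each group in Sessions contains fewer than 20 rows, so a 20-row window never fills and the output is always NaN.
I think you need to group by Symbol instead of Sessions. Each Symbol group still needs at least 20 values before the window produces a result: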
df_day['ADR'] = df_day.groupby('Symbol')['DR'].rolling(20).mean().reset_index(0,drop=True)
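To see why the choice of grouping column matters, here is a minimal sketch on made-up data (the symbols AAA and BBB, the values, and the window size of 3 are assumptions for illustration, not the asker's data). It compares group sizes under both groupings and shows that a count-based window returns NaN whenever a group has fewer rows than the window.

```python
import pandas as pd

# Hypothetical toy data: two symbols, one row per (symbol, session).
df = pd.DataFrame({
    "Symbol": ["AAA"] * 4 + ["BBB"] * 4,
    "Sessions": [1, 2, 3, 4, 1, 2, 3, 4],
    "DR": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
})

window = 3  # stands in for the 20-row window from the question

# Grouping by Sessions: each group holds only 2 rows (< window),
# so the window never fills and every result is NaN.
by_sessions = (
    df.groupby("Sessions")["DR"].rolling(window).mean().reset_index(0, drop=True)
)

# Grouping by Symbol: each group holds 4 rows,
# so the window fills from the third row of each symbol onward.
by_symbol = (
    df.groupby("Symbol")["DR"].rolling(window).mean().reset_index(0, drop=True)
)

print(df.groupby("Sessions").size().max())  # largest Sessions group size
print(by_sessions.isna().all())             # whether every value is NaN
print(by_symbol.sort_index().tolist())
```

Checking `df_day.groupby('Sessions').size()` on the real data is a quick way to confirm that no group reaches 20 rows.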
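For a window aligned to 20 calendar days (20D), first create a DatetimeIndex, then pass the offset string as the window:
df_day['Time'] = pd.to_datetime(df_day['Time'])
df_day = df_day.set_index('Time')
# rolling() no longer accepts freq=; a time-based window is written as an offset string such as '20D'
df_day['ADR'] = df_day.groupby('Symbol')['DR'].rolling('20D').mean().reset_index(0,drop=True)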
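The difference between a row-count window and a calendar-day window is easier to see side by side. The sketch below uses made-up data (symbols, dates, and values are assumptions) with a gap between 2021-09-03 and 2021-09-07. It also swaps in `groupby(...).transform(...)` instead of the `reset_index` approach from the answer, because the toy dates repeat across symbols and `transform` returns values in the original row order without relying on index alignment.

```python
import pandas as pd

# Hypothetical toy data: two symbols sharing the same dates, with a gap in between.
dates = pd.to_datetime(["2021-09-02", "2021-09-03", "2021-09-07", "2021-09-08"] * 2)
df = pd.DataFrame({
    "Symbol": ["AAA"] * 4 + ["BBB"] * 4,
    "Time": dates,
    "DR": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
}).set_index("Time")

# Count-based window: mean of the last 2 rows per symbol, regardless of dates.
df["ADR_rows"] = df.groupby("Symbol")["DR"].transform(
    lambda s: s.rolling(2).mean()
)

# Time-based window: mean of the rows within the last 2 calendar days per symbol.
# Offset windows default to min_periods=1, so there are no leading NaNs.
df["ADR_days"] = df.groupby("Symbol")["DR"].transform(
    lambda s: s.rolling("2D").mean()
)

print(df)
```

On 2021-09-07 the two columns disagree: the row-count window still averages in the 2021-09-03 value, while the 2D window sees only that day's row.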
Edit:
Test with a 3-row window:
df_day['ADR'] = df_day.groupby('Symbol')['DR'].rolling(3).mean().reset_index(0,drop=True)
print (df_day)
Symbol Time Close Sessions DR ADR
0 AMD 2019-11-18 39.88 387 1.39 NaN
1 AMD 2019-11-19 41.29 388 2.10 NaN
2 AMD 2019-11-20 40.98 389 1.68 1.723333
3 AMD 2019-11-21 39.52 390 2.07 1.950000
4 AMD 2019-11-22 39.15 391 1.70 1.816667
1600 UPST 2021-09-03 247.29 1597 14.13 NaN
1601 UPST 2021-09-07 262.70 1598 21.90 NaN
1602 UPST 2021-09-08 274.33 1599 15.64 17.223333
1603 UPST 2021-09-09 289.60 1600 29.16 22.233333
1604 UPST 2021-09-10 270.46 1605 25.98 23.593333
The same test with a time-based 3D window. Offset windows default to min_periods=1, so the first rows get partial means instead of NaN, and each window covers only the rows from the last 3 calendar days:
df_day['Time'] = pd.to_datetime(df_day['Time'])
df_day = df_day.set_index('Time')
df_day['ADR'] = df_day.groupby('Symbol')['DR'].rolling('3D').mean().reset_index(0,drop=True)
print (df_day)
Symbol Close Sessions DR ADR
Time
2019-11-18 AMD 39.88 387 1.39 1.390000
2019-11-19 AMD 41.29 388 2.10 1.745000
2019-11-20 AMD 40.98 389 1.68 1.723333
2019-11-21 AMD 39.52 390 2.07 1.950000
2019-11-22 AMD 39.15 391 1.70 1.816667
2021-09-03 UPST 247.29 1597 14.13 14.130000
2021-09-07 UPST 262.70 1598 21.90 21.900000
2021-09-08 UPST 274.33 1599 15.64 18.770000
2021-09-09 UPST 289.60 1600 29.16 22.233333
2021-09-10 UPST 270.46 1605 25.98 23.593333
For python - Pandas: groupby function + rolling mean + reset index returning NaN, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69172667/