python - Pandas MultiIndex Dataframe Groupby Rolling Mean

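I want to compute a rolling mean over groups defined by the second level of a DataFrame's MultiIndex (Key2 in the code example below).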

import pandas as pd
d = {'Key1':[1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6], 'Key2':[2,7,8,5,3,2,7,5,8,7,2,9,8,3,9,2,7,9],'Value':[1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3]}
df = pd.DataFrame(d)
df = df.set_index(['Key1', 'Key2'])
df['MA'] = (df.groupby('Key2')['Value']
                .rolling(window=3)
                .mean()
                .reset_index(level=0, drop=True))

print(df)

Expected output:

           Value        MA
Key1 Key2                 
1    2         1       NaN
     7         2       NaN
     8         3       NaN
2    5         1       NaN
     3         2       NaN
     2         3       NaN
3    7         1       NaN
     5         2       NaN
     8         3       NaN
4    7         1  1.333333
     2         2  2.000000
     9         3       NaN
5    8         1  2.333333
     3         2       NaN
     9         3       NaN
6    2         1  2.000000
     7         2  1.333333
     9         3  3.000000

But the actual output is NaN in every row. Something seems to go wrong in the assignment.
Actual output:

           Value        MA
Key1 Key2                 
1    2         1       NaN
     7         2       NaN
     8         3       NaN
2    5         1       NaN
     3         2       NaN
     2         3       NaN
3    7         1       NaN
     5         2       NaN
     8         3       NaN
4    7         1       NaN
     2         2       NaN
     9         3       NaN
5    8         1       NaN
     3         2       NaN
     9         3       NaN
6    2         1       NaN
     7         2       NaN
     9         3       NaN

Python 3.8 + Pandas 1.2.1. (Also tried on Python 3.7.9 + Pandas 1.1.5.)

Best answer

Use a lambda function inside apply so that the original MultiIndex is not lost; the result then aligns with df and the assignment works:

df['MA'] = df.groupby('Key2')['Value'].apply(lambda x: x.rolling(window=3).mean())
print(df)
           Value        MA
Key1 Key2                 
1    2         1       NaN
     7         2       NaN
     8         3       NaN
2    5         1       NaN
     3         2       NaN
     2         3       NaN
3    7         1       NaN
     5         2       NaN
     8         3       NaN
4    7         1  1.333333
     2         2  2.000000
     9         3       NaN
5    8         1  2.333333
     3         2       NaN
     9         3       NaN
6    2         1  2.000000
     7         2  1.333333
     9         3  3.000000
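
A variant that may be more robust across pandas versions is transform, which is documented to return a result indexed like the original object. This is only a sketch and not part of the original answer: it reuses the question's data, and the choice of groupby(level='Key2') instead of groupby('Key2') is my own. As far as I know, newer pandas releases (2.0 and later) prepend the group key to the result of apply by default, so the apply version above may need group_keys=False there; transform avoids that question.

```python
import pandas as pd

# Same data as in the question.
d = {'Key1': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6],
     'Key2': [2, 7, 8, 5, 3, 2, 7, 5, 8, 7, 2, 9, 8, 3, 9, 2, 7, 9],
     'Value': [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]}
df = pd.DataFrame(d).set_index(['Key1', 'Key2'])

# transform keeps the original (Key1, Key2) index, so the rolling mean
# computed inside each Key2 group aligns with df on assignment.
df['MA'] = (df.groupby(level='Key2')['Value']
              .transform(lambda s: s.rolling(window=3).mean()))

print(df)
```

Each Key2 group needs at least three rows before the window is full, so only the third and later occurrences of a Key2 value get a number; the rest stay NaN.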

A similar question about python - Pandas MultiIndex Dataframe Groupby Rolling Mean can be found on Stack Overflow: https://stackoverflow.com/questions/66202472/