python - Pandas - Multi-index mean

I'm fairly new to pandas and I'm struggling to get the right mean across a multi-index series. The data currently looks like this:

import pandas as pd

idx = pd.MultiIndex.from_tuples([('foo', 111), ('foo', 222),
                                 ('bar', 111), ('bar', 222), ('bar', 333),
                                 ('baz', 111),
                                 ('qux', 111), ('qux', 222)],
                                names=['ID', 'Account Number'])

# Every row gets the same scalar costs
df = pd.DataFrame(index=idx, data={'Service 1': 18, 'Service 2': 22, 'Total cost': 40})
# Nest the columns under a top-level 'Cost' key
df = pd.concat([df], keys=['Cost'], axis=1)

                        Cost                     
                   Service 1 Service 2 Total cost
ID  Account Number                               
foo 111                   18        22         40
    222                   18        22         40
bar 111                   18        22         40
    222                   18        22         40
    333                   18        22         40
baz 111                   18        22         40
qux 111                   18        22         40
    222                   18        22         40

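The table all of this data is pulled from applies the cost for Services 1 and 2 at the account-number level, but what it really needs to do is apply the cost at the ID level and split it across that ID's account numbers, so it should look like this: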

                        Cost                      
                   Service 1  Service 2 Total cost
ID  Account Number                                
foo 111                  9.0  11.000000  20.000000
    222                  9.0  11.000000  20.000000
bar 111                  6.0   7.333333  13.333333
    222                  6.0   7.333333  13.333333
    333                  6.0   7.333333  13.333333
baz 111                 18.0  22.000000  40.000000
qux 111                  9.0  11.000000  20.000000
    222                  9.0  11.000000  20.000000

I have tried df.groupby(['ID']).transform('mean'), but that obviously just gives me back the original data, and I don't know how to get to where I need to be.

I feel like I'm overthinking this, so any help would be greatly appreciated.

Best answer

Thanks to @ALollz for the edit. It's always helpful to have the full DataFrame constructor code in case there is a multi-index.

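You can group by the first index level, transform with count to get the number of accounts per ID on every row, and then divide the original frame by that: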

df.div(df.groupby(level=0).transform('count'))

                        Cost                      
                   Service 1  Service 2 Total cost
ID  Account Number                                
foo 111                  9.0  11.000000  20.000000
    222                  9.0  11.000000  20.000000
bar 111                  6.0   7.333333  13.333333
    222                  6.0   7.333333  13.333333
    333                  6.0   7.333333  13.333333
baz 111                 18.0  22.000000  40.000000
qux 111                  9.0  11.000000  20.000000
    222                  9.0  11.000000  20.000000
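For anyone who wants to check this end to end, here is a minimal self-contained sketch of the same approach, assuming the frame is built exactly as in the question. The names counts, result, and unchanged are only illustrative. It also shows why transform('mean') appeared to do nothing: every row within an ID already holds the same value, so the group mean equals the original.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("foo", 111), ("foo", 222),
     ("bar", 111), ("bar", 222), ("bar", 333),
     ("baz", 111),
     ("qux", 111), ("qux", 222)],
    names=["ID", "Account Number"],
)
df = pd.DataFrame(index=idx, data={"Service 1": 18, "Service 2": 22, "Total cost": 40})
df = pd.concat([df], keys=["Cost"], axis=1)

# The mean of identical values is the value itself, so this matches df
unchanged = df.groupby(level="ID").transform("mean")

# Number of accounts under each ID, repeated on every row of that ID
counts = df.groupby(level="ID").transform("count")

# Split each ID's cost evenly across its accounts
result = df.div(counts)
print(result)
```

Grouping by level="ID" is equivalent to level=0 here; using the name just makes the intent a little easier to read. Summing result back up per ID should recover the original 18, 22, and 40.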

Regarding python - Pandas - Multi-index mean, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59719938/
