Suppose I have a DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(0, 24).reshape((3, 8)))
df.columns = pd.MultiIndex.from_arrays([
    ['a1', 'a1', 'a2', 'a2', 'b1', 'b1', 'b2', 'b2'],
    ['4th', '5th', '4th', '5th', '4th', '5th', '4th', '5th'],
])
print(df)
Output:
   a1      a2      b1      b2
  4th 5th 4th 5th 4th 5th 4th 5th
0   0   1   2   3   4   5   6   7
1   8   9  10  11  12  13  14  15
2  16  17  18  19  20  21  22  23
I want to group the first column level using a dictionary:
label_dict = {'a1': 'A', 'a2': 'A', 'b1': 'B', 'b2': 'B'}
res = df.groupby(label_dict, axis=1, level=0).sum()
print(res)
Output:
    A   B
0   6  22
1  38  54
2  70  86
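But what I want is to keep the second level, so that only columns sharing the same second-level label are summed:
    A   A   B   B
  4th 5th 4th 5th
0   2   4  10  12
1  18  20  26  28
2  34  36  42  44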
Any ideas? Thanks!
Best answer
Use rename on the first level of the MultiIndex columns, then sum over both levels:
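label_dict = {'a1': 'A', 'a2': 'A', 'b1': 'B', 'b2': 'B'}
# Relabel level 0 via the dictionary, then sum columns sharing the same (level 0, level 1) pair.
# Note: sum(level=...) was deprecated in pandas 1.3 and removed in 2.0.
res = df.rename(columns=label_dict, level=0).sum(level=[0, 1], axis=1)
# Alternative with groupby (groupby(axis=1) is itself deprecated since pandas 2.1):
# res = df.rename(columns=label_dict, level=0).groupby(level=[0, 1], axis=1).sum()
print(res)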
    A       B
  4th 5th 4th 5th
0   2   4  10  12
1  18  20  26  28
2  34  36  42  44
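Both forms above depend on the pandas version, since sum(level=...) and groupby(axis=1) have been deprecated in newer releases. Below is a minimal sketch of a variant that avoids both by transposing, grouping on the row index, and transposing back. It is not part of the original answer, and it assumes the frame is small enough that two transposes are not a concern.

```python
import numpy as np
import pandas as pd

# Same frame as in the question
df = pd.DataFrame(np.arange(0, 24).reshape((3, 8)))
df.columns = pd.MultiIndex.from_arrays([
    ['a1', 'a1', 'a2', 'a2', 'b1', 'b1', 'b2', 'b2'],
    ['4th', '5th', '4th', '5th', '4th', '5th', '4th', '5th'],
])

label_dict = {'a1': 'A', 'a2': 'A', 'b1': 'B', 'b2': 'B'}

res = (
    df.rename(columns=label_dict, level=0)  # relabel level 0: a1/a2 -> A, b1/b2 -> B
      .T                                    # move the column MultiIndex onto the rows
      .groupby(level=[0, 1])                # group rows by (level 0, level 1)
      .sum()                                # add up rows that now share a label pair
      .T                                    # restore the original orientation
)
print(res)
```

The result has columns (A, 4th), (A, 5th), (B, 4th), (B, 5th) and the same values as the table above. Note that groupby sorts the group keys by default, so the column order may differ from the input order when the labels are not already sorted.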
This question, "python - Pandas: groupby one level of MultiIndex but remain other levels instead", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/50624223/