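I group a DataFrame by multiple columns and aggregate it to get several statistics. How can I get a completely flat structure in which every combination of group keys is enumerated as a row and every statistic appears as a column?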
import numpy as np
import pandas as pd

cities = ['Berlin', 'Oslo']
days = ['Monday', 'Friday']
data = pd.DataFrame({
    'city': np.random.choice(cities, 12),
    'day': np.random.choice(days, 12),
    'people': np.random.normal(loc=10, size=12),
    'cats': np.random.normal(loc=6, size=12)})
# String names give the same result and avoid the FutureWarning that
# recent pandas versions emit when np.mean / np.std are passed to agg.
grouped = data.groupby(['city', 'day']).agg(['mean', 'std'])
This gives me:
                   cats               people
                   mean       std       mean       std
city   day
Berlin Friday  6.146924  0.721263  10.445606  0.730992
       Monday  5.239267       NaN   9.022811       NaN
Oslo   Friday  6.322276  0.866899  11.579813  0.114341
       Monday  5.028919  0.815674  10.458439  1.182689
I need it flattened like this:
city   day     cats_mean  cats_std  people_mean  people_std
Berlin Friday   6.146924  0.721263    10.445606    0.730992
Berlin Monday   5.239267       NaN     9.022811         NaN
Oslo   Friday   6.322276  0.866899    11.579813    0.114341
Oslo   Monday   5.028919  0.815674    10.458439    1.182689
Best answer
In [36]: grouped.columns = grouped.columns.map('_'.join)
In [37]: grouped = grouped.reset_index()
In [38]: grouped
Out[38]:
     city     day  cats_mean  cats_std  people_mean  people_std
0  Berlin  Friday   5.852991  1.085163    11.078541    0.839688
1  Berlin  Monday   6.978343  0.630983     9.876106    1.846204
2    Oslo  Friday   6.096773  1.278176     9.710216    0.691672
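The output above has only three rows, presumably because that random sample happened to contain no Oslo/Monday observations: groupby only produces rows for key combinations that occur in the data. If a row is wanted for every possible combination, one option is to reindex against the full product of keys before resetting the index. The following is a minimal sketch of the same flattening approach with that extra step; the small fixed data set and the names `all_keys` and `flat` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Small fixed data set (made up for illustration) so the result is reproducible.
# Note that it contains no Oslo/Monday rows.
data = pd.DataFrame({
    'city':   ['Berlin', 'Berlin', 'Berlin', 'Oslo', 'Oslo'],
    'day':    ['Friday', 'Friday', 'Monday', 'Friday', 'Friday'],
    'people': [10.0, 12.0, 9.0, 11.0, 13.0],
    'cats':   [6.0, 8.0, 5.0, 7.0, 5.0],
})

# Select the value columns explicitly so the column order is cats, people.
grouped = data.groupby(['city', 'day'])[['cats', 'people']].agg(['mean', 'std'])

# Join the two column levels, e.g. ('cats', 'mean') -> 'cats_mean'.
grouped.columns = grouped.columns.map('_'.join)

# Optional step: add a row for every city/day combination, including ones
# that never occur in the data; their statistics are filled with NaN.
all_keys = pd.MultiIndex.from_product(
    [['Berlin', 'Oslo'], ['Friday', 'Monday']], names=['city', 'day'])
flat = grouped.reindex(all_keys).reset_index()

print(flat)
```

If only the observed combinations are needed, the `reindex` call can be dropped and `grouped.reset_index()` used directly, as in the answer above.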
Regarding "python - Flattening a hierarchical index from groupby and multiple aggregation in a pandas.DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43849255/