python - Pandas .pivot_table : How to name functions for aggregation

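I'm trying to pivot a pandas DataFrame using several aggregation functions, some of which are lambdas. Each output column must have a distinct name so that I can aggregate with more than one lambda function. I tried a few ideas I found online, but none of them worked. Here is a minimal example: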

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 2, 3], 'col2': [4, 4, 5, 6], 'col3': [7, 10, 8, 9]})

# Passing (name, function) tuples to aggfunc is what triggers the error below
pivoted_df = df.pivot_table(index=['col1', 'col2'], values='col3',
                            aggfunc=[('lam1', lambda x: np.percentile(x, 50)),
                                     ('lam2', lambda x: np.percentile(x, 75))]).reset_index()

The error is:

AttributeError: 'SeriesGroupBy' object has no attribute 'lam1'

I also tried using a dictionary, which led to an error as well. Can anyone help? Thanks!

Best answer

Name the functions explicitly:

def lam1(x):
    return np.percentile(x, 50)

def lam2(x):
    return np.percentile(x, 75)

pivoted_df = df.pivot_table(index = ['col1', 'col2'], values  = 'col3',
                            aggfunc=[lam1, lam2]).reset_index()

Your aggregated series will then be named appropriately:

print(pivoted_df)

   col1  col2  lam1  lam2
0     1     4   8.5  9.25
1     2     5   8.0  8.00
2     3     6   9.0  9.00
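If flat, explicitly named columns are the goal, one alternative is to skip pivot_table and use groupby with named aggregation instead. This is a different technique from the one in the answer, shown only as a minimal sketch; it assumes a pandas version that supports keyword-style named aggregation on a grouped Series (0.25 or later).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 2, 3],
                   'col2': [4, 4, 5, 6],
                   'col3': [7, 10, 8, 9]})

# Each keyword becomes an output column name; the value is the aggregation.
result = (
    df.groupby(['col1', 'col2'])['col3']
      .agg(lam1=lambda x: np.percentile(x, 50),
           lam2=lambda x: np.percentile(x, 75))
      .reset_index()
)

print(result)
```

Here the lambdas never need names of their own, because the keyword arguments supply the column labels.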

The docs for pd.pivot_table explain why:

aggfunc : function, list of functions, dict, default numpy.mean

If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves). If dict is passed, the key is column to aggregate and value is function or list of functions.
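Since the names are inferred from the function objects, a lambda can also be given a name after the fact. The sketch below assumes pandas reads each function's `__name__` attribute when building the top column level; `named` is a hypothetical helper introduced here for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 2, 3],
                   'col2': [4, 4, 5, 6],
                   'col3': [7, 10, 8, 9]})

def named(name, func):
    """Hypothetical helper: set func.__name__ to name and return func."""
    func.__name__ = name
    return func

aggs = [named('lam1', lambda x: np.percentile(x, 50)),
        named('lam2', lambda x: np.percentile(x, 75))]

pivoted = df.pivot_table(index=['col1', 'col2'], values='col3', aggfunc=aggs)

# The top level of the hierarchical columns carries the function names.
top_level = list(pivoted.columns.get_level_values(0))
print(top_level)
```

Without the renaming step, both lambdas would report the same default name, so the resulting columns could not be told apart.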

For python - Pandas .pivot_table : How to name functions for aggregation, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52870521/
