python - pandas: Add a total row to each subgroup when using groupby (especially for non-additive methods such as `nunique`)

Tags: python pandas dataframe pandas-groupby

It is easy to produce a multi-level groupby result like this:

                Max Speed
Animal Type
Falcon Captive      390.0
       Wild         350.0
Parrot Captive       30.0
       Wild          20.0

The code looks like `df.groupby(['animal', 'type'])['speed'].max()`.
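
For reference, a minimal sketch that reproduces a result of this shape. The raw rows are an assumption (the question does not show the underlying data); only the column names `animal`, `type` and `speed` come from the snippet above, so the index levels are lowercase here.

```python
import pandas as pd

# Hypothetical raw data, chosen so the per-group maxima match the table above.
df = pd.DataFrame({
    "animal": ["Falcon", "Falcon", "Falcon", "Parrot", "Parrot", "Parrot"],
    "type": ["Captive", "Wild", "Wild", "Captive", "Captive", "Wild"],
    "speed": [390.0, 350.0, 310.0, 30.0, 24.0, 20.0],
})

# One row per (animal, type) pair, indexed by a two-level MultiIndex.
max_speed = df.groupby(["animal", "type"])["speed"].max()
print(max_speed)
```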

But what if I want to add a total row to each subgroup, to produce something like this:

                Max Speed
Animal Type
Falcon Captive      390.0
       Wild         350.0
       overall      390.0
Parrot Captive       30.0
       Wild          20.0
       overall       30.0

How can I do that?

I want these subgroup-level rows because they let me select a category in the BI tool my colleagues use.

Update: the example above uses max(); I would also like to know how to do this with user_id.nunique().
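
The `nunique` case matters because it is not additive: an overall count cannot be derived from the per-type counts. A small sketch, using hypothetical data in which one user appears under both types of the same animal:

```python
import pandas as pd

# Hypothetical data: user 1 is recorded under both Falcon types.
df = pd.DataFrame({
    "animal": ["Falcon", "Falcon", "Falcon", "Parrot"],
    "type": ["Captive", "Wild", "Wild", "Captive"],
    "user_id": [1, 1, 2, 3],
})

per_type = df.groupby(["animal", "type"])["user_id"].nunique()
per_animal = df.groupby("animal")["user_id"].nunique()

# Summing the per-type counts counts user 1 twice for Falcon.
summed = per_type.groupby(level="animal").sum()
print(summed["Falcon"], per_animal["Falcon"])
```

With `max`, aggregating the already-aggregated rows gives the right overall value; with `nunique`, the overall row has to be computed from the raw rows.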

Right now I produce the result with two groupby calls and then concatenate them, like this:

df1 = df.groupby(['animal', 'type'])['speed'].max()
df2 = df.groupby(['animal'])['speed'].max()
# ... manually add an `overall` index level to df2
df_total = pd.concat([df1, df2]).sort_index()
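
One possible way to fill in the elided manual step, as a sketch only: the data is hypothetical, and `MultiIndex.from_arrays` is just one of several ways to attach the extra level.

```python
import pandas as pd

# Hypothetical raw data matching the tables in the question.
df = pd.DataFrame({
    "animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
    "type": ["Captive", "Wild", "Captive", "Wild"],
    "speed": [390.0, 350.0, 30.0, 20.0],
})

df1 = df.groupby(["animal", "type"])["speed"].max()
df2 = df.groupby("animal")["speed"].max()

# Give df2 a second index level whose value is always "overall".
df2.index = pd.MultiIndex.from_arrays(
    [df2.index, ["overall"] * len(df2)], names=["animal", "type"]
)

df_total = pd.concat([df1, df2]).sort_index()
print(df_total)
```

Note that `overall` lands after `Captive` and `Wild` only because lowercase letters sort after uppercase ones; a label such as `All` would sort first.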

But this feels too manual. Is there a better way?

Best answer

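You can do this with two concat calls, starting from your groupby result (in the code below, `df` is that MultiIndexed result, not the raw data).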

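# Per-animal maximum across all types (df is the groupby result)
g = df.groupby(level=0).max()

# Prepend a constant 'overall' level named 'Type', then move it to second position
m = pd.concat([g], keys=['overall'], names=['Type']).swaplevel(0, 1)

# Append the overall rows and sort so each one follows its animal's rows
pd.concat([df, m], axis=0).sort_index(level=0)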

                Max Speed
Animal Type
Falcon Captive      390.0
       Wild         350.0
       overall      390.0
Parrot Captive       30.0
       Wild          20.0
       overall       30.0
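
The answer derives the overall rows from the already-grouped result, which is fine for `max` but would not be correct for `nunique`. Below is a sketch of the same concat pattern applied to the raw data so that it also covers the non-additive case. The helper name `agg_with_overall`, its parameters, and the sample data are hypothetical, not part of pandas or the original answer.

```python
import pandas as pd

def agg_with_overall(df, outer, inner, value, func, label="overall"):
    """Aggregate `value` by (outer, inner) and add one `label` row per outer group.

    Hypothetical helper. The overall rows are computed from the raw rows,
    so non-additive aggregations such as "nunique" stay correct.
    """
    detail = df.groupby([outer, inner])[value].agg(func)
    overall = df.groupby(outer)[value].agg(func)
    # Same trick as in the answer: prepend a constant level, then swap it
    # into second position so the index reads (outer, inner).
    overall = pd.concat([overall], keys=[label], names=[inner]).swaplevel(0, 1)
    return pd.concat([detail, overall]).sort_index()

# Hypothetical data: user 1 appears under both Falcon types.
df = pd.DataFrame({
    "animal": ["Falcon", "Falcon", "Falcon", "Parrot", "Parrot"],
    "type": ["Captive", "Wild", "Wild", "Captive", "Wild"],
    "user_id": [1, 1, 2, 3, 3],
})

result = agg_with_overall(df, "animal", "type", "user_id", "nunique")
print(result)
```

Passing `"max"` instead of `"nunique"` (with a `speed` column) gives the same kind of table as the one in the answer.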

For python - pandas: Add a total row to each subgroup when using groupby (especially for non-additive methods such as `nunique`), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57147178/
