How do I group data into a merged DataFrame in Python?

Tags: python pandas dataframe aggregate

Here is my current code:

d = {}
for stage in ['doggo', 'floofer', 'puppo', 'pupper']:
    #d[stage] =df.groupby([stage]).agg({'retweet_count': 'sum'})
    d[stage] = df.groupby(stage)['retweet_count'].sum()
stage_retweets = pd.DataFrame.from_dict(d)

It produces this:

         doggo      floofer    puppo      pupper
None     1387471.0  1517639.0  1472697.0  1444766.0
doggo    159188.0   NaN        NaN        NaN
floofer  NaN        29020.0    NaN        NaN
puppo    NaN        NaN        73962.0    NaN
pupper   NaN        NaN        NaN        101893.0
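
The NaN cells come from index alignment: each grouped Series is indexed by its own labels ('None' plus one stage name), and pandas aligns them on the union of those labels when building the frame. A minimal sketch with made-up values (the names `a`, `b` and `aligned` are illustrative only, not from the code above):

```python
import pandas as pd

# Two Series that share the 'None' label but differ in their second label.
a = pd.Series([1.0, 2.0], index=['None', 'doggo'])
b = pd.Series([3.0, 4.0], index=['None', 'floofer'])

# Building a frame from them aligns on the union of index labels;
# any label missing from a given Series becomes NaN in that column.
aligned = pd.DataFrame.from_dict({'doggo': a, 'floofer': b})
print(aligned)
```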

What I actually want to produce is:

         doggo      floofer    puppo      pupper
None     1387471.0  1517639.0  1472697.0  1444766.0
stage    159188.0   29020.0    73962.0    101893.0

Does anyone know how to achieve this?

Best answer

import numpy as np
import pandas as pd

# 1 - Put your stages in a list variable
stages = ['doggo', 'floofer', 'puppo', 'pupper']

# Sum retweet_count for each value of every stage column
# (df is the DataFrame from the question)
d = {}
for stage in stages:
    d[stage] = df.groupby(stage)['retweet_count'].sum()
stage_retweets = pd.DataFrame.from_dict(d)
print(stage_retweets)

# 2 - Add a column that flags whether each index label is in the stages list
# Important: make sure the index has only one level, otherwise
# stage_retweets.index.isin(stages) won't work
stage_retweets['is_stage'] = np.where(stage_retweets.index.isin(stages), 'Stage', 'None')
print(stage_retweets)

# 3 - Group by this new column and sum (NaN values are skipped)
stage_retweets = stage_retweets.groupby('is_stage').sum().reset_index()
print(stage_retweets)
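
To try the three steps without the original dataset, here is a self-contained sketch. The toy `df` and its retweet counts are invented for illustration, and it assumes (as the question's output suggests) that each stage column holds either its own name or the string 'None'. Unlike the code above, it leaves `is_stage` as the index instead of calling `reset_index()`, which is closer to the layout the question asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: every stage column holds its own name or the string 'None'.
df = pd.DataFrame({
    'doggo':   ['doggo', 'None', 'None', 'None', 'None'],
    'floofer': ['None', 'floofer', 'None', 'None', 'None'],
    'puppo':   ['None', 'None', 'puppo', 'None', 'None'],
    'pupper':  ['None', 'None', 'None', 'pupper', 'None'],
    'retweet_count': [10, 20, 30, 40, 50],
})

stages = ['doggo', 'floofer', 'puppo', 'pupper']

# Step 1: one grouped Series per stage column, aligned on the union of their labels.
stage_retweets = pd.DataFrame(
    {s: df.groupby(s)['retweet_count'].sum() for s in stages}
)

# Step 2: flag the index labels that are stage names.
stage_retweets['is_stage'] = np.where(
    stage_retweets.index.isin(stages), 'Stage', 'None'
)

# Step 3: collapse the four stage rows into one; sum() skips NaN by default.
result = stage_retweets.groupby('is_stage').sum()
print(result)
```

Each stage row has a value in exactly one column, so summing the stage rows together just picks up that single non-NaN value per column.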

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/54727150/
