python - Summing two Pandas groupby objects

Tags: python python-3.x pandas indexing pandas-groupby

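I have two Pandas groupby objects and I want to sum their values. I can't figure out how to combine the two dataframes so that the CALL_BLOCKS column contains all ten call blocks for that DOW, with the values summed. I have tried a few approaches, such as resetting the index and merging the two dataframes, but I still can't get all ten call blocks in the CALL_BLOCKS column. I would appreciate your help. Thank you very much.

Edited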

df1 = {('1-100019B', 'a_8:00AM to 9:00AM'): 0.6493506493506493,
 ('1-100019B', 'b_9:00AM to 10:00AM'): 0.7272727272727273,
 ('1-100019B', 'c_10:00AM to 11:00AM'): 0.16883116883116883,
 ('1-100019B', 'd_11:00AM to 12:00PM'): 0.025974025974025976,
 ('1-100019B', 'e_12:00PM to 1:00PM'): 0.38961038961038963,
 ('1-100019B', 'f_1:00PM to 2:00PM'): 0.14285714285714285,
 ('1-100019B', 'g_2:00PM to 3:00PM'): 0.0,
 ('1-100019B', 'h_3:00PM to 4:00PM'): 0.12987012987012986,
 ('1-100019B', 'i_4:00PM to 5:00PM'): 0.0,
 ('1-100019B', 'j_After 5PM'): 0.0}

df2 = {('1-100019B', 0, 'a_8:00AM to 9:00AM'): 0.5,
 ('1-100019B', 0, 'b_9:00AM to 10:00AM'): 0.6666666666666666,
 ('1-100019B', 0, 'c_10:00AM to 11:00AM'): 0.25,
 ('1-100019B', 0, 'e_12:00PM to 1:00PM'): 0.3333333333333333,
 ('1-100019B', 0, 'f_1:00PM to 2:00PM'): 0.0,
 ('1-100019B', 0, 'h_3:00PM to 4:00PM'): 1.0}

Expected output:

df = 
CONTACT_ID  DOW  CALL_BLOCKS         
1-100019B   0    a_8:00AM to 9:00AM      1.149
                 b_9:00AM to 10:00AM     1.380
                 c_10:00AM to 11:00AM    0.410
                 d_11:00AM to 12:00PM    0.026
                 e_12:00PM to 1:00PM     0.710
                 f_1:00PM to 2:00PM      0.140
                 g_2:00PM to 3:00PM      0.000
                 h_3:00PM to 4:00PM      1.120
                 i_4:00PM to 5:00PM      0.000
                 j_After 5PM             0.000

Best answer

Using @jpp's setup:

df1.merge(df2.reset_index('DOW'), on=['CONTACTS_ID','CALL_BLOCKS'], how='outer')\
   .set_index('DOW', append=True).sum(1)

Output:

CONTACTS_ID  CALL_BLOCKS           DOW
1-100019B    a_8:00AM to 9:00AM    0.0    1.149351
             b_9:00AM to 10:00AM   0.0    1.393939
             c_10:00AM to 11:00AM  0.0    0.418831
             d_11:00AM to 12:00PM  NaN    0.025974
             e_12:00PM to 1:00PM   0.0    0.722944
             f_1:00PM to 2:00PM    0.0    0.142857
             g_2:00PM to 3:00PM    NaN    0.000000
             h_3:00PM to 4:00PM    0.0    1.129870
             i_4:00PM to 5:00PM    NaN    0.000000
             j_After 5PM           NaN    0.000000
dtype: float64
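
The setup from @jpp that the answer relies on is not reproduced on this page, so here is a minimal, self-contained sketch of one way to rebuild the two objects from the dictionaries in the question and sum them. It is an alternative to the merge above, not the answerer's own code: it drops the DOW level and uses `Series.add` with `fill_value=0`. The names `s1`, `s2`, `s2_flat` and `total` are made up for illustration, and the index level names are assumed from the output shown above.

```python
import pandas as pd

# Rebuild the first object from the question's dict; tuple keys become a MultiIndex.
s1 = pd.Series({
    ('1-100019B', 'a_8:00AM to 9:00AM'): 0.6493506493506493,
    ('1-100019B', 'b_9:00AM to 10:00AM'): 0.7272727272727273,
    ('1-100019B', 'c_10:00AM to 11:00AM'): 0.16883116883116883,
    ('1-100019B', 'd_11:00AM to 12:00PM'): 0.025974025974025976,
    ('1-100019B', 'e_12:00PM to 1:00PM'): 0.38961038961038963,
    ('1-100019B', 'f_1:00PM to 2:00PM'): 0.14285714285714285,
    ('1-100019B', 'g_2:00PM to 3:00PM'): 0.0,
    ('1-100019B', 'h_3:00PM to 4:00PM'): 0.12987012987012986,
    ('1-100019B', 'i_4:00PM to 5:00PM'): 0.0,
    ('1-100019B', 'j_After 5PM'): 0.0,
})
s1.index.names = ['CONTACTS_ID', 'CALL_BLOCKS']  # assumed level names

# Rebuild the second object; it has an extra DOW level and only six call blocks.
s2 = pd.Series({
    ('1-100019B', 0, 'a_8:00AM to 9:00AM'): 0.5,
    ('1-100019B', 0, 'b_9:00AM to 10:00AM'): 0.6666666666666666,
    ('1-100019B', 0, 'c_10:00AM to 11:00AM'): 0.25,
    ('1-100019B', 0, 'e_12:00PM to 1:00PM'): 0.3333333333333333,
    ('1-100019B', 0, 'f_1:00PM to 2:00PM'): 0.0,
    ('1-100019B', 0, 'h_3:00PM to 4:00PM'): 1.0,
})
s2.index.names = ['CONTACTS_ID', 'DOW', 'CALL_BLOCKS']  # assumed level names

# Drop DOW so both indexes share the same two levels.
# Assumption: s2 holds a single DOW value, as in the sample data.
s2_flat = s2.droplevel('DOW')

# Align on (CONTACTS_ID, CALL_BLOCKS); blocks missing from s2 count as 0.
total = s1.add(s2_flat, fill_value=0)
print(total)
```

Because the DOW level is dropped before the addition, this sketch only fits data where the second object covers one DOW value; with several DOW values the merge-based answer above, which keeps DOW in the index, is the safer starting point.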

Regarding python - Summing two Pandas groupby objects, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51366404/
