python - Adding an extra entry to a MultiIndex pandas DataFrame from another MultiIndex pandas DataFrame

I have a MultiIndex pandas DataFrame. I call groupby on it and then describe, like this:

    grouped= self.HK_data.groupby(level=[0,1])
    summary= grouped.describe()

which gives:

Antibody        Time                
Customer_Col1A2 0    count  3.000000
                     mean   0.757589
                     std    0.188750
                     min    0.639933
                     25%    0.648732
                     50%    0.657532
                     75%    0.816417
                     max    0.975302
                10   count  3.000000
                     mean   0.716279
                     std    0.061939
                     min    0.665601
                     25%    0.681757
                     50%    0.697913
                     75%    0.741618
                     max    0.785324
                     ...   .........

I calculated the SEM with:

    SEM=grouped.mean()/(numpy.sqrt(grouped.count()))

which gives:

Antibody                 Time          
Customer_Col1A2          0     0.437394
                         10    0.413544
                         120   0.553361
                         180   0.502792
                         20    0.512797
                         240   0.514609
                         30    0.505618
                         300   0.481021
                         45    0.534658
                         5     0.425800
                         60    0.430633
                         90    0.525115
                         ...  .........

How can I join these two frames so that SEM becomes one more entry in the summary statistics?

So something like:

Antibody        Time                
Customer_Col1A2 0    count  3.000000
                     mean   0.757589
                     std    0.188750
                     min    0.639933
                     25%    0.648732
                     50%    0.657532
                     75%    0.816417
                     max    0.975302
                     SEM    0.437394
                10   count  3.000000
                     mean   0.716279
                     std    0.061939
                     min    0.665601
                     25%    0.681757
                     50%    0.697913
                     75%    0.741618
                     max    0.785324
                     SEM    0.413544

I have tried pandas.concat, but it did not give me what I want.

Thanks!

Best answer

I think you first need to add a third level to the MultiIndex of SEM by assigning a new index built with MultiIndex.from_tuples, and then use concat followed by sort_index:

import numpy as np
import pandas as pd

# small sample frame indexed by Antibody and Time
HK_data = pd.DataFrame({'Antibody': ['Customer_Col1A2', 'Customer_Col1A2', 'Customer_Col1A2'],
                        'Time': [0, 10, 10],
                        'Col': [7, 8, 9]})
HK_data = HK_data.set_index(['Antibody', 'Time'])
print(HK_data)
                      Col
Antibody        Time     
Customer_Col1A2 0       7
                10      8
                10      9

grouped = HK_data.groupby(level=[0, 1])
summary = grouped.describe()
print(summary)
                                 Col
Antibody        Time                
Customer_Col1A2 0    count  1.000000
                     mean   7.000000
                     std         NaN
                     min    7.000000
                     25%    7.000000
                     50%    7.000000
                     75%    7.000000
                     max    7.000000
                10   count  2.000000
                     mean   8.500000
                     std    0.707107
                     min    8.000000
                     25%    8.250000
                     50%    8.500000
                     75%    8.750000
                     max    9.000000

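# SEM as defined in the question: group mean divided by sqrt(group size)
SEM = grouped.mean() / np.sqrt(grouped.count())
# append a third index level labelled 'SEM' so it lines up with the describe() output
new_index = list(zip(SEM.index.get_level_values('Antibody'),
                     SEM.index.get_level_values('Time'),
                     ['SEM'] * len(SEM.index)))
SEM.index = pd.MultiIndex.from_tuples(new_index, names=('Antibody', 'Time', None))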

print (SEM)
                               Col
Antibody        Time              
Customer_Col1A2 0    SEM  7.000000
                10   SEM  6.010408
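
A side note on the formula: the SEM above follows the question and divides the group mean by sqrt(n). The standard error of the mean is usually defined as the sample standard deviation divided by sqrt(n). If that is what you need, a minimal sketch (the sample values here are my own, chosen so that every group has at least two rows) could look like this:

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the question: two rows per (Antibody, Time) group
HK_data = pd.DataFrame({
    'Antibody': ['Customer_Col1A2'] * 4,
    'Time': [0, 0, 10, 10],
    'Col': [6, 8, 8, 9],
}).set_index(['Antibody', 'Time'])

grouped = HK_data.groupby(level=[0, 1])

# Conventional SEM: sample standard deviation (ddof=1) over sqrt(group size)
sem_manual = grouped.std() / np.sqrt(grouped.count())

# pandas also ships this as a groupby method
sem_builtin = grouped.sem()

print(sem_manual)
print(sem_builtin)
```

Both frames should hold the same values, so `grouped.sem()` can replace the hand-written expression. The index relabelling and concat steps stay the same either way.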

df = pd.concat([summary, SEM]).sort_index()
print (df)
                                 Col
Antibody        Time                
Customer_Col1A2 0    25%    7.000000
                     50%    7.000000
                     75%    7.000000
                     SEM    7.000000
                     count  1.000000
                     max    7.000000
                     mean   7.000000
                     min    7.000000
                     std         NaN
                10   25%    8.250000
                     50%    8.500000
                     75%    8.750000
                     SEM    6.010408
                     count  2.000000
                     max    9.000000
                     mean   8.500000
                     min    8.000000
                     std    0.707107
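
Two caveats about the result above. First, sort_index orders the statistics alphabetically, so SEM ends up between 75% and count instead of after max. Second, the long layout shown here is what older pandas returned from describe(); in newer releases (0.20 and later, as far as I know) a groupby describe() returns one row per group with the statistics as columns. Under that assumption, a sketch that adds SEM as an extra column and then stacks it back into the long layout, keeping the describe() order, might be:

```python
import pandas as pd

# Illustrative data, not from the question: two rows per group so no statistic is NaN
HK_data = pd.DataFrame({
    'Antibody': ['Customer_Col1A2'] * 4,
    'Time': [0, 0, 10, 10],
    'Col': [6, 8, 8, 9],
}).set_index(['Antibody', 'Time'])

grouped = HK_data.groupby(level=[0, 1])

# Wide summary: one row per (Antibody, Time), one column per statistic
summary = grouped['Col'].describe()

# Add SEM as one more column (GroupBy.sem() is std / sqrt(n), not the
# mean / sqrt(n) used in the question; swap in that expression if preferred)
summary['SEM'] = grouped['Col'].sem()

# Optional: move the statistics into a third index level, in column order
long_summary = summary.stack()
print(long_summary)
```

Because stack() walks the columns in order, SEM lands after max in every group, which matches the layout asked for in the question. Depending on the pandas version, stack() may drop NaN entries, so a group with a single row could lose its std and SEM rows.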

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/40706338/
