python - Get max and sum values for different subsets of a dataframe, and plot each subset

Tags: python pandas dataframe

I have a dataframe "events", as shown:

              DateTime        ModFlow(cfs)     ObsFlow(cfs)  ModVol(f3)   ObsVol(f3)
Event
Event 1     8/15/2016 15:35   11.85926          0           0.039530867   0
Event 1     8/15/2016 10:05   30.05923          0           0.100197433   0
Event 1     8/15/2016 10:00   31.10118          0           0.1036706     0
Event 1     8/15/2016 9:55    32.17444          0           0.107248133   0
Event 1     8/15/2016 4:10    0.6783166      0.5650155      0.002261055   0.001883385
Event 10    6/23/2016 4:35    0.5573569      0.4814242      0.001857856   0.001604747
Event 10    6/23/2016 4:40    0.5536903      0.3544892      0.001845634   0.001181631
Event 10    6/23/2016 4:45    0.5502114      0.368421       0.001834038   0.00122807
Event 10    6/23/2016 4:50    0.5698021      0.501548       0.00189934    0.001671827
Event 10    6/23/2016 4:55    0.7525368      0.879257       0.002508456   0.002930857
Event 11    6/10/2016 8:05    0.6593155      0.6145511      0.002197718   0.002048504
Event 11    6/10/2016 8:10    0.6621117      0.8405573      0.002207039   0.002801858
Event 11    6/10/2016 8:15    0.6578091      0.8173375      0.002192697   0.002724458
Event 11    6/10/2016 8:20    0.6581948      0.871517       0.002193983   0.002905057
Event 12    4/26/2016 22:00   2.307288       2.588235       0.00769096    0.00862745
Event 12    4/26/2016 22:05   2.366998       3.091331       0.007889993   0.010304437
Event 12    4/26/2016 22:10   2.494073       3.278638       0.008313577   0.010928793
Event 12    4/26/2016 22:15   2.746868       3.083591       0.009156227   0.010278637
Event 12    4/26/2016 22:20   3.146326       2.877709       0.010487753   0.009592363
Event 12    4/26/2016 22:30   4.090476       2.354489       0.01363492    0.007848297

Q1) How can I get the maximum of ModFlow(cfs) and ObsFlow(cfs) for each event, plus the sum of the ModVol(f3) and ObsVol(f3) columns for each event, and put the results into a new dataframe?

Desired output format:

              DateTime        Peak ModFlow(cfs)     Peak ObsFlow(cfs)  Total ModVol(f3)   Total ObsVol(f3)
Event
Event 1     8/15/2016 15:35           -                 -                -                  -
Event 2     8/15/2016 10:05           -                 -                -                  -
Event 3     8/15/2016 10:00           -                 -                -                  -
Event 4     8/15/2016 9:55            -                 -                -                  -
Event 5     8/15/2016 4:10            -                 -                -                  -       

Also, how can I plot the "events" dataframe so that I get a separate plot for each event?

Best answer

I think you need to aggregate with first, max and sum:

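# group by the index level holding the event label; the outer parentheses
# let the method chain continue across lines
df1 = (df.groupby(level=0)
         .agg({'DateTime':'first',
               'ModFlow(cfs)':'max',
               'ObsFlow(cfs)':'max',
               'ModVol(f3)':'sum',
               'ObsVol(f3)':'sum'}))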

# set the column order explicitly, so the positional rename below is safe
df1 = df1.reindex(columns=['DateTime','ModFlow(cfs)','ObsFlow(cfs)',
                           'ModVol(f3)','ObsVol(f3)'])

# rename the aggregated columns
df1.columns = ['DateTime','Peak ModFlow(cfs)','Peak ObsFlow(cfs)',
               'Total ModVol(f3)','Total ObsVol(f3)']
print(df1)
                 DateTime  Peak ModFlow(cfs)  Peak ObsFlow(cfs)  \
Event                                                             
Event 1   8/15/2016 15:35          32.174440           0.565016   
Event 10   6/23/2016 4:35           0.752537           0.879257   
Event 11   6/10/2016 8:05           0.662112           0.871517   
Event 12  4/26/2016 22:00           4.090476           3.278638   

          Total ModVol(f3)  Total ObsVol(f3)  
Event                                         
Event 1           0.352908          0.001883  
Event 10          0.009945          0.008617  
Event 11          0.008791          0.010480  
Event 12          0.057173          0.057580  
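
As a possible alternative, pandas 0.25 and later support named aggregation, which sets the output column names and their order in the same call, so no separate reindex or rename step is needed. The sketch below is not part of the original answer. It assumes the event label is the index (as in the question) and uses only four rows copied from the sample data. The name `summary` is made up for illustration.

```python
import pandas as pd

# Four rows copied from the question; the event label is the index.
df = pd.DataFrame(
    {
        "DateTime": ["8/15/2016 15:35", "8/15/2016 10:05",
                     "6/23/2016 4:35", "6/23/2016 4:40"],
        "ModFlow(cfs)": [11.85926, 30.05923, 0.5573569, 0.5536903],
        "ObsFlow(cfs)": [0.0, 0.0, 0.4814242, 0.3544892],
        "ModVol(f3)": [0.039530867, 0.100197433, 0.001857856, 0.001845634],
        "ObsVol(f3)": [0.0, 0.0, 0.001604747, 0.001181631],
    },
    index=pd.Index(["Event 1", "Event 1", "Event 10", "Event 10"], name="Event"),
)

# Named aggregation: output column -> (input column, aggregation).
# The output names contain spaces and parentheses, so they are passed
# through ** unpacking instead of as plain keyword arguments.
summary = df.groupby(level=0).agg(
    **{
        "DateTime": ("DateTime", "first"),
        "Peak ModFlow(cfs)": ("ModFlow(cfs)", "max"),
        "Peak ObsFlow(cfs)": ("ObsFlow(cfs)", "max"),
        "Total ModVol(f3)": ("ModVol(f3)", "sum"),
        "Total ObsVol(f3)": ("ObsVol(f3)", "sum"),
    }
)
print(summary)
```

Note that 'first' returns the DateTime of the first row in each group as stored, which is not necessarily the earliest timestamp of the event.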

Then, if needed, use DataFrame.plot.bar:

df1.plot.bar()

For the first DataFrame, to get one plot per event:

df.groupby(level=0).apply(lambda x: x.plot.bar()) 
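
If a time-series line plot per event is closer to what is wanted than bar charts, an explicit loop over the groups is one option. This is only a sketch under assumptions that go beyond the original answer: it parses DateTime with the month/day/year format seen in the question, uses matplotlib directly, and collects the figures in a dictionary named `figures`, which is a hypothetical name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open plot windows
import matplotlib.pyplot as plt
import pandas as pd

# Four rows copied from the question; the event label is the index.
df = pd.DataFrame(
    {
        "DateTime": ["8/15/2016 15:35", "8/15/2016 10:05",
                     "6/23/2016 4:35", "6/23/2016 4:40"],
        "ModFlow(cfs)": [11.85926, 30.05923, 0.5573569, 0.5536903],
        "ObsFlow(cfs)": [0.0, 0.0, 0.4814242, 0.3544892],
    },
    index=pd.Index(["Event 1", "Event 1", "Event 10", "Event 10"], name="Event"),
)

# Assumed format: month/day/year hour:minute, as in the sample data.
df["DateTime"] = pd.to_datetime(df["DateTime"], format="%m/%d/%Y %H:%M")

figures = {}
for event, group in df.groupby(level=0):
    fig, ax = plt.subplots()
    # Sort by time so the lines run left to right within each event.
    group.sort_values("DateTime").plot(
        x="DateTime", y=["ModFlow(cfs)", "ObsFlow(cfs)"], ax=ax, title=event
    )
    ax.set_ylabel("Flow (cfs)")
    figures[event] = fig
```

Each figure can then be saved separately, for example with `figures["Event 1"].savefig("event_1.png")`.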

Source: the original question on Stack Overflow, "Get max and sum values for different subsets of a dataframe, and plot each subset": https://stackoverflow.com/questions/42105830/
