python - Take the average of data within the same day in Pandas

I have a DataFrame df containing measurement dates and the measured values (duration, km):

df
Out[20]: 
                         Date  duration   km
0  2015-03-28 09:07:00.800001         0    0
1  2015-03-28 09:36:01.819998         1    2
2  2015-03-30 09:36:06.839997         1    3
3  2015-03-30 09:37:27.659997       nan    5
4  2015-04-22 09:51:40.440003         3    7
5  2015-04-23 10:15:25.080002         0  nan

How can I compute the average duration and km for each day? I would like to use groupby on the date and take the mean of the rows...

Best answer

I think you need resample:

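import pandas as pd

# resample needs a datetime-like column; convert Date if it is stored as strings
df['Date'] = pd.to_datetime(df['Date'])

# every column except Date holds measurements
cols = df.columns.difference(['Date'])

# use ONE of the conversions below, depending on the data

# 1) if possible, convert to float
df[cols] = df[cols].astype(float)

# 2) if astype fails because of non-numeric data, convert those values to NaN
# df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

# 3) if the columns have mixed dtypes, go through str first
# df[cols] = df[cols].astype(str).astype(float)
# or: df[cols] = df[cols].astype(str).apply(pd.to_numeric, errors='coerce')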

# average per calendar day, then drop the days that have no measurements at all
daily = df.resample('D', on='Date').mean().dropna(how='all')
print(daily)
            duration   km
Date                     
2015-03-28       0.5  1.0
2015-03-30       1.0  4.0
2015-04-22       3.0  7.0
2015-04-23       0.0  NaN
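
To try the resample approach end to end, here is a minimal sketch that rebuilds the sample frame from the question. It assumes the 'nan' entries were stored as strings (the question does not say how they were stored), and the names all_days and daily are only illustrative. It also shows why dropna(how='all') is there: resample creates one row for every calendar day between the first and last timestamp, including days without any measurement.

```python
import pandas as pd

# Sample data copied from the question. Storing the values as strings,
# including 'nan', is an assumption made for this illustration.
df = pd.DataFrame({
    "Date": [
        "2015-03-28 09:07:00.800001",
        "2015-03-28 09:36:01.819998",
        "2015-03-30 09:36:06.839997",
        "2015-03-30 09:37:27.659997",
        "2015-04-22 09:51:40.440003",
        "2015-04-23 10:15:25.080002",
    ],
    "duration": ["0", "1", "1", "nan", "3", "0"],
    "km": ["0", "2", "3", "5", "7", "nan"],
})

# resample(on=...) requires a datetime-like column
df["Date"] = pd.to_datetime(df["Date"])

# Coerce the measurement columns to float; unparseable values become NaN
cols = df.columns.difference(["Date"])
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")

# One row per calendar day in the covered range, empty days included
all_days = df.resample("D", on="Date").mean()

# Keep a day unless every column is NaN (that is, no measurements that day)
daily = all_days.dropna(how="all")

print(len(all_days), len(daily))
print(daily)
```

Note that mean() skips NaN by default, so a day with one missing value is averaged over the remaining values, and a column with only missing values on a given day stays NaN for that day.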

Or:

# same result with groupby: bin the DatetimeIndex by calendar day
daily = df.set_index('Date').groupby(pd.Grouper(freq='D')).mean().dropna(how='all')
print(daily)
            duration   km
Date                     
2015-03-28       0.5  1.0
2015-03-30       1.0  4.0
2015-04-22       3.0  7.0
2015-04-23       0.0  NaN
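
Since the question asks about groupby on the date, a third option may be worth a look. This is not from the answer above, just a sketch of a plain groupby on the timestamp truncated to midnight with dt.normalize(). The frame below is a simplified stand-in for the question's data (fractional seconds dropped), and by_grouper and by_day are hypothetical names.

```python
import pandas as pd

# Simplified stand-in for the frame in the question (numeric columns with NaN)
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-03-28 09:07:00", "2015-03-28 09:36:01",
        "2015-03-30 09:36:06", "2015-03-30 09:37:27",
        "2015-04-22 09:51:40", "2015-04-23 10:15:25",
    ]),
    "duration": [0, 1, 1, float("nan"), 3, 0],
    "km": [0, 2, 3, 5, 7, float("nan")],
})

# Grouper variant: bins every calendar day in the range, then drops empty days
by_grouper = (
    df.set_index("Date")
      .groupby(pd.Grouper(freq="D"))
      .mean()
      .dropna(how="all")
)

# Plain groupby on the date part: only days that occur in the data become
# groups, so there are no empty rows to drop afterwards
by_day = df.groupby(df["Date"].dt.normalize())[["duration", "km"]].mean()

print(by_day)
```

On this data both variants give the same days and the same averages. They can differ when a day has rows but every measurement on it is NaN: the groupby on the normalized date keeps that day as an all-NaN row, while dropna(how='all') removes it.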

This question, "python - Take the average of data within the same day in Pandas", has a similar question on Stack Overflow: https://stackoverflow.com/questions/45547228/
