python - Join Pandas dataframes on time intervals and compute the mean

I have a tricky problem:

There are two dataframes. The first, "TimeRanges", holds time ranges (a start time, an end time, and an ID), as follows:

ID  StartTime       EndTime
1   01.03.18 12:00  01.03.18 13:00 
2   01.03.18 13:00  01.03.18 13:15 
3   01.03.18 13:30  01.03.18 14:55

The second dataframe has a Time column, whose values increase at one-minute frequency, and a Value column, as follows:

Time            Value
01.03.18 12:00  5.00
01.03.18 12:01  20.00
01.03.18 12:02  5.00
01.03.18 13:10  30.00
01.03.18 14:20  45.00

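What I am trying to achieve, but don't know how to approach, is a new column AvgValue in the TimeRanges dataframe containing the mean() of the values whose Time falls within the interval between StartTime and EndTime, for example: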

ID  StartTime       EndTime         AvgValue
1   01.03.18 12:00  01.03.18 13:00     10
2   01.03.18 13:00  01.03.18 13:15     30
3   01.03.18 13:30  01.03.18 14:55     45

*The value 10 is there because the times 01.03.18 12:00, 01.03.18 12:01 and 01.03.18 12:02 fall within the interval from 01.03.18 12:00 to 01.03.18 13:00, so the mean is computed over just those values (5, 20 and 5).

What approach would you take to do this? A lambda function? Or something else?

Thanks

Best answer

I achieved this with resample, but it needed some fiddling, so it may not be the best solution. First, the index needs to be of type DatetimeIndex, TimedeltaIndex or PeriodIndex.

import pandas as pd

# set Time to be the index
df.set_index('Time', inplace=True)
# change the index type to datetime
df.index = pd.to_datetime(df.index)
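
One thing to watch when converting the index: strings like "01.03.18 12:00" are ambiguous, and pandas reads them month-first by default (dayfirst=False), i.e. as 3 January 2018. A minimal sketch of the same setup with an explicit format, assuming the data is actually day.month.year (that reading is my assumption, not something the question states):

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Time": ["01.03.18 12:00", "01.03.18 12:01", "01.03.18 12:02",
             "01.03.18 13:10", "01.03.18 14:20"],
    "Value": [5.0, 20.0, 5.0, 30.0, 45.0],
})

# Assumption: the strings are day.month.year, so spell the format out
# instead of relying on the month-first default.
df["Time"] = pd.to_datetime(df["Time"], format="%d.%m.%y %H:%M")

# Use Time as a sorted DatetimeIndex, as resample and label slicing expect.
df = df.set_index("Time").sort_index()
```

With this parsing the dates come out as 2018-03-01 rather than 2018-01-03; the averages are the same either way.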

Use resample. I used a 60-minute rule; you can look up the available rules in the resample documentation.

# '60min' is the same rule as '60T'; the 'T' alias is deprecated in recent pandas
new_df = df.resample('60min').mean().reset_index()

Now new_df holds the mean for every 60 minutes. We just need to do the following to get it into the format you want.

from datetime import timedelta    
new_df['EndTime'] = new_df['Time'] + timedelta(seconds=3600)

Finally, rename the columns:

new_df.rename(columns={'Time': 'StartTime', 'Value': 'AvgValue'}, inplace=True)

Output:

    StartTime             AvgValue    EndTime
0   2018-01-03 12:00:00   10.0        2018-01-03 13:00:00
1   2018-01-03 13:00:00   30.0        2018-01-03 14:00:00
2   2018-01-03 14:00:00   45.0        2018-01-03 15:00:00
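
Note that the hourly bins only happen to give the requested averages for this sample: the EndTime values 14:00 and 15:00 are not the 13:15 and 14:55 from TimeRanges. One way to bin by the ranges themselves is an IntervalIndex. The following is a rough sketch, not a tested solution for the real data; it assumes day.month.year strings, non-overlapping ranges, and half-open intervals [StartTime, EndTime), and the names ranges, values and range_pos are made up for the example:

```python
import pandas as pd

fmt = "%d.%m.%y %H:%M"  # assumption: day.month.year strings

ranges = pd.DataFrame({
    "ID": [1, 2, 3],
    "StartTime": pd.to_datetime(
        ["01.03.18 12:00", "01.03.18 13:00", "01.03.18 13:30"], format=fmt),
    "EndTime": pd.to_datetime(
        ["01.03.18 13:00", "01.03.18 13:15", "01.03.18 14:55"], format=fmt),
})
values = pd.DataFrame({
    "Time": pd.to_datetime(
        ["01.03.18 12:00", "01.03.18 12:01", "01.03.18 12:02",
         "01.03.18 13:10", "01.03.18 14:20"], format=fmt),
    "Value": [5.0, 20.0, 5.0, 30.0, 45.0],
})

# One half-open interval [StartTime, EndTime) per row of `ranges`.
# get_indexer below needs these intervals to be non-overlapping.
bins = pd.IntervalIndex.from_arrays(
    ranges["StartTime"], ranges["EndTime"], closed="left")

# Position of the range each reading falls in, or -1 if it is in none.
idx = bins.get_indexer(pd.DatetimeIndex(values["Time"]))
values["range_pos"] = idx

# Average the matched readings per range position.
avg = values[values["range_pos"] >= 0].groupby("range_pos")["Value"].mean()

# Ranges that contain no reading end up as NaN.
ranges["AvgValue"] = avg.reindex(range(len(ranges))).to_numpy()
```

With closed="left", a reading at exactly 13:00 belongs to range 2 only, so no reading is counted twice.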

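EDIT: This time, using the first dataframe (df1) for the time ranges, you can do the following (df must still be indexed by Time, as set up above):

# mean of Value over each StartTime:EndTime slice; both endpoints are included
df1['AvgValue'] = df1.apply(lambda x: df.loc[x['StartTime']:x['EndTime'], 'Value'].mean(), axis=1)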
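
For reference, here is a self-contained sketch of this row-wise approach on the sample data from the question. It assumes the strings are day.month.year and that the readings are indexed by a sorted DatetimeIndex; mean_in_range is just a helper name picked for the example:

```python
import pandas as pd

fmt = "%d.%m.%y %H:%M"  # assumption: day.month.year strings

df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "StartTime": pd.to_datetime(
        ["01.03.18 12:00", "01.03.18 13:00", "01.03.18 13:30"], format=fmt),
    "EndTime": pd.to_datetime(
        ["01.03.18 13:00", "01.03.18 13:15", "01.03.18 14:55"], format=fmt),
})
df = pd.DataFrame({
    "Time": pd.to_datetime(
        ["01.03.18 12:00", "01.03.18 12:01", "01.03.18 12:02",
         "01.03.18 13:10", "01.03.18 14:20"], format=fmt),
    "Value": [5.0, 20.0, 5.0, 30.0, 45.0],
}).set_index("Time").sort_index()


def mean_in_range(row):
    # Label slicing on a sorted DatetimeIndex includes both endpoints.
    return df.loc[row["StartTime"]:row["EndTime"], "Value"].mean()


# One slice per range; a range with no readings gives NaN.
df1["AvgValue"] = df1.apply(mean_in_range, axis=1)
print(df1)
```

Because both endpoints are included, a reading at exactly 13:00 would count towards range 1 and range 2. This also does one slice per row, which is fine for a handful of ranges but may be slow for many.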

This is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/50510407/
