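I have a df with two columns, timestamp and eventType (named createdTime and actionName in the sample below).
The timestamps are in chronological order, and eventType can be one of ['start', 'change', 'end', 'resolve']. ['start', 'change'] denotes the start of an event, and
['end', 'resolve'] denotes the end of an event.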
createdTime actionName
2020-03-16 18:28:14 start
2020-03-17 19:12:42 end
2020-03-18 19:56:10 change
2020-03-19 21:29:13 change
2020-03-20 21:42:06 end
2020-03-21 18:28:14 start
2020-03-21 19:12:42 resolve
2020-03-22 19:56:10 change
2020-03-22 21:29:13 change
2020-03-23 21:42:06 end
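For anyone who wants to reproduce this, the sample above could be loaded roughly as follows. This is only a sketch: the column names are taken from the table header, and the timestamps are kept as plain strings here on the assumption that they are parsed later.

```python
import pandas as pd

# Sample data copied from the table above; timestamps are left as strings
# and can be parsed later with pd.to_datetime.
df = pd.DataFrame({
    "createdTime": [
        "2020-03-16 18:28:14", "2020-03-17 19:12:42",
        "2020-03-18 19:56:10", "2020-03-19 21:29:13",
        "2020-03-20 21:42:06", "2020-03-21 18:28:14",
        "2020-03-21 19:12:42", "2020-03-22 19:56:10",
        "2020-03-22 21:29:13", "2020-03-23 21:42:06",
    ],
    "actionName": [
        "start", "end", "change", "change", "end",
        "start", "resolve", "change", "change", "end",
    ],
})
```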
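I want to compute the time difference between each start/change event and the next end/resolve event.
- An event can have several start/change states before it is ended/resolved, so the time of the first start/change state should be used as the event's start time.
- The output should be a list of time deltas, one for each event in the df.
Thanks in advance :)
Edit: the expected result should be a list containing the time spent on each event.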
event_times = ['24:44:28', '49:45:56', '0:44:28', '25:45:56']
Best answer
Better late than never?
import pandas as pd

# Assumes df has the default integer index (0..n-1) and the two columns shown above.
df['createdTime'] = pd.to_datetime(df.createdTime)
starts = ['start', 'change']
ends = ['end', 'resolve']
prev_status = 'end'  # behave as if the data begins right after a closed event
spans = []
for i in range(len(df)):
    curr_status = df.actionName[i]
    if curr_status in starts and prev_status in starts:
        # The event is already open: keep the first start/change time.
        pass
    elif curr_status in starts and prev_status in ends:
        # First start/change after an end/resolve opens a new event.
        start_time = df.createdTime[i]
    elif curr_status in ends and prev_status in starts:
        # An end/resolve closes the open event: record its duration.
        t = df.createdTime[i] - start_time
        hours = t.days * 24 + t.seconds // 3600
        minutes = t.seconds % 3600 // 60
        seconds = t.seconds % 60
        spans.append(f"{hours}:{minutes}:{seconds}")
    elif curr_status in ends and prev_status in ends:
        raise ValueError(f"Two ends in a row at index {i}.")
    else:
        raise ValueError(f"Unrecognized action type at index {i}.")
    prev_status = curr_status
print(spans)
This gives:
['24:44:28', '49:45:56', '0:44:28', '25:45:56']
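If the explicit loop is too slow on a large frame, the same rule (the first start/change of each run opens the event, the next end/resolve closes it) can also be expressed with vectorized pandas operations. The following is only a sketch and not part of the original answer: `event_spans` and `_format_span` are hypothetical helper names, the column names are assumed to be `createdTime` and `actionName` as in the sample, and the error checks are meant to mirror the loop's behavior without reporting the offending index.

```python
import pandas as pd

def _format_span(td):
    # Hypothetical helper: render a Timedelta as H:M:S without zero padding,
    # matching the string format produced by the loop above.
    total = int(td.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}:{minutes}:{seconds}"

def event_spans(df, starts=("start", "change"), ends=("end", "resolve")):
    # Hypothetical function; assumes columns 'createdTime' and 'actionName'.
    times = pd.to_datetime(df["createdTime"])
    is_start = df["actionName"].isin(starts)
    is_end = df["actionName"].isin(ends)

    if not (is_start | is_end).all():
        raise ValueError("Unrecognized action type.")
    # Like the loop, treat the row before the first one as an 'end',
    # so two consecutive ends (or a leading end) are rejected.
    if (is_end & is_end.shift(fill_value=True)).any():
        raise ValueError("Two ends in a row.")

    # A row opens an event if it is a start/change and the previous row is not.
    first_start = is_start & ~is_start.shift(fill_value=False)
    # Carry each opening timestamp forward to the rows that follow it.
    start_time = times.where(first_start).ffill()
    # Each end/resolve row closes the most recently opened event.
    deltas = (times - start_time)[is_end]
    return [_format_span(td) for td in deltas]
```

Calling `event_spans(df)` on the sample data from the question should return the same four strings as the loop. The forward fill is what implements the "first start/change wins" requirement, since later start/change rows in the same run never overwrite the carried timestamp.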
Regarding "Python: finding the time spent on each event in a DataFrame based on conditions", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61387313/