python - How do I calculate event durations with Pandas when the timestamps are given as strings?

Tags: python sql pandas datetime time-series

After finding a link about calculating time differences using Pandas, I am still struggling to apply that knowledge to my own data. My dataset looks like this:

In [10]: df 
Out[10]:
      id           time
   0  420 1/3/2018 8:32
   1  420 1/3/2018 8:36
   2  420 1/3/2018 8:42
   3  425 1/7/2018 12:35
   4  425 1/7/2018 14:29
   5  425 1/7/2018 16:15
   6  425 1/7/2018 16:36
   7  427 1/11/2018 20:50
   8  428 1/13/2018 16:35
   9  428 1/13/2018 17:36

I would like to run a groupby (or some other function) on id so that the output is:

In [11]: pd.groupby(df[id])
Out [11]:

      id   time (duration)
   0  420  0:10
   1  425  4:01
   2  427  0:00
   3  428  1:01

The dtypes of id and time are int64 and object, respectively. I am using python3 and pandas 0.20.

Edit: Coming from SQL, this seems functionally equivalent to:

select id, max(time) - min(time)
from df
group by id

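Edit 2: Thanks everyone for the quick replies. All of the solutions give me some version of the following error. I am not sure what is specific to my dataset that I am missing here: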

TypeError: unsupported operand type(s) for -: 'str' and 'str'
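
The message suggests that the subtraction is being attempted on raw strings, which fits the time column having dtype object. A minimal sketch of the cause and one possible fix follows; it assumes the timestamps are month/day/year with 24-hour times (the format string is an assumption, not something stated in the question) and uses two sample values from the data above:

```python
import pandas as pd

# Two sample timestamps from the question, still stored as plain strings.
times = pd.Series(["1/3/2018 8:32", "1/3/2018 8:42"])

error_message = None
try:
    # Subtracting one str from another is what raises the TypeError.
    times.iloc[1] - times.iloc[0]
except TypeError as exc:
    error_message = str(exc)

# Assumed format: month/day/year hour:minute (24-hour clock).
parsed = pd.to_datetime(times, format="%m/%d/%Y %H:%M")

# With a datetime dtype, subtraction yields a Timedelta.
elapsed = parsed.iloc[1] - parsed.iloc[0]
print(error_message)
print(elapsed)
```

Once the column is converted this way, groupby reductions such as max and min operate on real timestamps rather than on text.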

Best answer

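groupby with np.ptp (peak to peak, i.e. max minus min). The time column is stored as strings, so convert it to datetimes first; otherwise the subtraction fails with the TypeError shown in the question.

import numpy as np
import pandas as pd

df['time'] = pd.to_datetime(df['time'])  # strings -> datetime64
df.groupby('id').time.apply(np.ptp)      # max - min within each id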

id
420   00:10:00
425   04:01:00
427   00:00:00
428   01:01:00
Name: time, dtype: timedelta64[ns]
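
For anyone who wants to reproduce the result end to end, here is a self-contained sketch. It is not the answerer's exact code: it swaps np.ptp for an explicit max minus min (mirroring the SQL in the question), rebuilds the sample frame by hand, and assumes a month/day/year format string. The helper as_h_mm is a hypothetical name introduced only to render the H:MM layout the question asks for.

```python
import pandas as pd

# Sample data copied from the question; times are plain strings.
df = pd.DataFrame({
    "id": [420, 420, 420, 425, 425, 425, 425, 427, 428, 428],
    "time": [
        "1/3/2018 8:32", "1/3/2018 8:36", "1/3/2018 8:42",
        "1/7/2018 12:35", "1/7/2018 14:29", "1/7/2018 16:15",
        "1/7/2018 16:36", "1/11/2018 20:50",
        "1/13/2018 16:35", "1/13/2018 17:36",
    ],
})

# Assumed format: month/day/year hour:minute (24-hour clock).
df["time"] = pd.to_datetime(df["time"], format="%m/%d/%Y %H:%M")

# Equivalent of: select id, max(time) - min(time) from df group by id
bounds = df.groupby("id")["time"].agg(["min", "max"])
span = bounds["max"] - bounds["min"]


def as_h_mm(td):
    """Render a Timedelta as H:MM (hypothetical helper for display)."""
    total_minutes = int(td.total_seconds() // 60)
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


labels = span.map(as_h_mm)
print(labels)
```

Keeping span as a timedelta Series and formatting only for display means the durations can still be summed or compared afterwards.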

This question, python - How do I calculate event durations with Pandas when the timestamps are given as strings?, corresponds to a similar question on Stack Overflow: https://stackoverflow.com/questions/50845484/
