python - Calculating active days based on gap length using Pandas DataFrames

Tags: python date datetime pandas time-series

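I'm fairly new to pandas and am trying to work out the best way to calculate this, so any help is much appreciated. Essentially, I have a DataFrame that looks like this: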

id     activity_date
1      2015-01-01      
1      2015-01-02      
1      2015-01-03      
2      2015-01-02      
2      2015-01-05     
3      2015-01-10

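I want to answer the question "how many days was each account active?". I know I can get this with a simple count, but I want to apply the following restriction: "if there is a gap of n days between activity dates, only count the days before that gap".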

For example, with n = 5, the following data should give 4 active days, not 6:

id     activity_date
1      2015-01-01      
1      2015-01-02      
1      2015-01-04
1      2015-01-06
1      2015-01-14
1      2015-01-15

Best answer

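Now that it's clear what you want, this is much simpler. We test whether the difference between the current row and the previous row is greater than 5 days, which gives us a boolean Series. We use that to filter df, and then use the index value of the first match to perform the slice: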

In [57]:

inactive = df[df['activity_date'].diff() > pd.Timedelta(5, 'd')]
inactive
Out[57]:
   id activity_date
4   1    2015-01-14

In [18]:

inactive.index
Out[18]:
Int64Index([4], dtype='int64')
In [58]:

df.iloc[:inactive.index[0]]
Out[58]:
   id activity_date
0   1    2015-01-01
1   1    2015-01-02
2   1    2015-01-04
3   1    2015-01-06
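
The session above handles a single id and assumes at least one long gap exists (otherwise `inactive.index[0]` raises an `IndexError`). A minimal sketch of how the same idea might be extended to every id at once follows. The function name `active_days_before_gap` is hypothetical, and the sketch assumes that `activity_date` is already a datetime column, that each row represents one distinct active day, and that a gap counts only when it is strictly greater than n days, as in the comparison above.

```python
import pandas as pd


def active_days_before_gap(df, n=5):
    """Count, per id, the rows that come before the first gap longer than n days."""
    df = df.sort_values(["id", "activity_date"])
    # Difference to the previous row within the same id (NaT for each id's first row).
    gaps = df.groupby("id")["activity_date"].diff()
    # NaT compares as False, so the first row of each id is never flagged.
    too_long = (gaps > pd.Timedelta(days=n)).astype(int)
    # The running total per id stays at 0 until the first long gap is reached.
    gaps_seen = too_long.groupby(df["id"]).cumsum()
    return df[gaps_seen == 0].groupby("id").size()


# Sample data: id 1 is the second example from the question,
# ids 2 and 3 come from the first example.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 1, 2, 2, 3],
    "activity_date": pd.to_datetime([
        "2015-01-01", "2015-01-02", "2015-01-04", "2015-01-06",
        "2015-01-14", "2015-01-15", "2015-01-02", "2015-01-05",
        "2015-01-10",
    ]),
})

counts = active_days_before_gap(df, n=5)
print(counts)
```

With this data, id 1 should come out as 4 (the rows before the jump from 2015-01-06 to 2015-01-14), id 2 as 2, and id 3 as 1. Filtering on a running gap count avoids the positional slice, so it does not depend on the DataFrame having a default integer index.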

This answer comes from a similar question on Stack Overflow about python - calculating active days based on gap length using Pandas DataFrames: https://stackoverflow.com/questions/29854616/
