python - How to find the difference between one date and another (if the second is not exactly present in the DataFrame)

Tags: python python-2.7 pandas datetime dataframe

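I have been stuck on this problem for days and can't figure out how to solve it.
I have a DataFrame with dates in its index. I want to choose a window expressed as a number of days, say 5 days. I want to return a DataFrame whose index holds my original dates and whose first column holds, for each date in the index, the difference in days between that date and the date 5 days earlier, or the closest date in the dataset to it if that exact date is missing.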

Let's take an example.

[In] Mydates
[Out] 
2017-04-04   
2017-04-03    
2017-03-31    
2017-03-30   
2017-03-29   
2017-03-28   
2017-03-27   
2017-03-24  
2017-03-23     
2017-03-21   


I want to return:

func(window = 5)
    return MyNewdates
[out]         First column
2017-04-04   -5 [diff between 2017-04-04 and 5 days before or closest date in dataset from 5 days before (here 2017-03-30 ), so difference is 0 - 5 =] -5 

2017-04-03   -5  [diff between 2017-04-03 and 5 days before or closest date in dataset from 5 days before (here 2017-03-29), so difference is 0 - 5 =] -5 
2017-03-31    
2017-03-30   -6  [here, there is no 2017-03-25 (5 days before) so the closest date from my window is 2017-03-24 (6 days before), so the difference is 0 - 6 =] -6 

2017-03-29   -5  [diff between 2017-03-29 and 5 days before or closest date in dataset from 5 days before (here 2017-03-24), so difference is 0 - 5 =] -5 
2017-03-28   -5  [diff between 2017-03-28 and 5 days before or closest date in dataset from 5 days before (here 2017-03-23 ), so difference is 0 - 5 =] -5 
2017-03-27   -4  [diff between 2017-03-27 and 5 days before or closest date in dataset from 5 days before (here 2017-03-23 ), so difference is 0 - 4 =] -4
2017-03-24  NAN 
2017-03-23  NAN    
2017-03-21  NAN


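and so on...

To do this, I converted all the dates to day counts. Is there another way? I want it to return the time difference looking back into the past.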
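
For reference, the day-count idea mentioned above could be sketched in plain Python roughly as follows. This is only an illustration under assumptions, not the code actually used: `day_diffs` is a hypothetical helper, the dates are copied from the example, and the tie-break (prefer the later date when two dates are equally close to the target) is inferred from the 2017-03-27 row.

```python
from datetime import date

# Dates copied from the example above
dates = [
    date(2017, 4, 4), date(2017, 4, 3), date(2017, 3, 31), date(2017, 3, 30),
    date(2017, 3, 29), date(2017, 3, 28), date(2017, 3, 27), date(2017, 3, 24),
    date(2017, 3, 23), date(2017, 3, 21),
]

def day_diffs(dates, window):
    """Hypothetical helper: for each date, return minus the number of days back
    to the dataset date closest to (date - window), or None when that target
    falls before every date in the dataset."""
    ordinals = [d.toordinal() for d in dates]  # each date as an integer day count
    earliest = min(ordinals)
    result = {}
    for d, day in zip(dates, ordinals):
        target = day - window
        if target < earliest:
            result[d] = None
            continue
        # Closest day count to the target; ties go to the later date (assumption)
        closest = min(ordinals, key=lambda o: (abs(o - target), -o))
        result[d] = closest - day
    return result

diffs = day_diffs(dates, window=5)
for d in dates:
    print(d, diffs[d])
```

Working on integer day counts keeps the arithmetic simple, but the inner `min` scans every date, so this is quadratic in the number of dates.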

I hope everything is clear; let me know if you have any questions!

Thanks!!

Best answer

IIUC, the following should do what you need:

In [141]:
import io
import numpy as np
import pandas as pd
# read in data
t="""Dates
2017-04-04
2017-04-03
2017-03-31
2017-03-30
2017-03-29
2017-03-28
2017-03-27
2017-03-24
2017-03-23
2017-03-22
2017-03-21"""
df = pd.read_csv(io.StringIO(t), sep=r"\s+", parse_dates=[0], index_col=[0])
# define a window func
def func(x, window):
    # target date: `window` days before x
    prev = x - pd.DateOffset(days=window)
    if df.index.isin([prev]).any():
        # the target date itself is in the index
        return -window
    elif (prev < df.index).all():
        # the target date is earlier than every date in the index
        return np.nan
    else:
        # absolute distance of every index date from the target date
        diff = (df.index - prev).to_series().abs()
        # position of the index date closest to the target
        diff_idx = diff.to_numpy().argmin()
        return -(x - df.index[diff_idx]).days

df.index.to_series().apply(lambda x: func(x, 5))
Out[141]:

Dates
2017-04-04   -5.0
2017-04-03   -5.0
2017-03-31   -4.0
2017-03-30   -6.0
2017-03-29   -5.0
2017-03-28   -5.0
2017-03-27   -5.0
2017-03-24    NaN
2017-03-23    NaN
2017-03-22    NaN
2017-03-21    NaN
Name: Dates, dtype: float64
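
As a possible alternative to calling `func` once per row, the same lookup can be sketched in vectorized form with `Index.get_indexer(..., method="nearest")`. This is not the answer's method, only a sketch under assumptions: `window_diffs` is a hypothetical name, the index is assumed to hold unique dates, and it is sorted first because nearest-match lookups require a monotonic index.

```python
import numpy as np
import pandas as pd

# Same dates as in the answer's sample data, sorted ascending
dates = pd.to_datetime([
    "2017-04-04", "2017-04-03", "2017-03-31", "2017-03-30", "2017-03-29",
    "2017-03-28", "2017-03-27", "2017-03-24", "2017-03-23", "2017-03-22",
    "2017-03-21",
]).sort_values()

def window_diffs(index, window):
    """Hypothetical helper: for each date, minus the number of days back to the
    index date nearest to (date - window); NaN when the target predates the data."""
    targets = index - pd.Timedelta(days=window)
    # Position of the nearest index date for every target date
    pos = index.get_indexer(targets, method="nearest")
    closest = index[pos]
    days = np.asarray((closest - index).days, dtype=float)
    # Targets earlier than the first available date have no valid match
    values = np.where(targets < index.min(), np.nan, days)
    return pd.Series(values, index=index).sort_index(ascending=False)

result = window_diffs(dates, 5)
print(result)
```

On the sample data this should give the same values as the output above, without a Python-level loop over the rows.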

Regarding python - How to find the difference between one date and another (if the second is not exactly present in the DataFrame), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43582829/
