python - Check whether my time-series indexed data has missing values for weekdays

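I have time-series data from "5 Jan 2015" to "28 Dec 2018". I noticed that some weekday dates, along with their values, are missing. How can I check how many weekdays are missing from my time range, and which dates they are, so that I can extrapolate values for those dates?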

Example:

Date    Price    Volume
2018-12-28  172.0   800
2018-12-27  173.6   400
2018-12-26  170.4   500
2018-12-25  171.0   2200
2018-12-21  172.8   800

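According to the calendar, 21 Dec 2018 is a Friday. Excluding Saturday and Sunday, the dataset should therefore contain "24th Dec 2018" as the next entry, but it is missing. I need to identify such missing dates within the range.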

What I have tried so far: I tried using

pd.date_range('2015-01-05','2018-12-28',freq='W')

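to identify the weeks and count them, then manually work out the weekdays from that to determine the number of missing dates. But this does not serve the purpose, because I need to identify the missing dates within the range.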

Best answer

Suppose this is your complete dataset:

Date    Price    Volume
2018-12-28  172.0   800
2018-12-27  173.6   400
2018-12-26  170.4   500
2018-12-25  171.0   2200
2018-12-21  172.8   800

and the date range to check is:

import pandas as pd
dates = pd.date_range('2018-12-15', '2018-12-31')

First, make sure the "Date" column is actually of a datetime type:

df['Date'] = pd.to_datetime(df['Date'])

Then set the date as the index:

df = df.set_index('Date')

Then reindex, following unutbu's solution:

df = df.reindex(dates, fill_value=0.0)
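A caveat with `fill_value=0.0`: a day that genuinely traded zero volume would look the same as a day with no row at all. As a minimal sketch, assuming the same sample data, the filler can be left as NaN (the default for `reindex`) and tested with `isna()`. The names `filled` and `absent` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2018-12-28", "2018-12-27", "2018-12-26", "2018-12-25", "2018-12-21"],
    "Price": [172.0, 173.6, 170.4, 171.0, 172.8],
    "Volume": [800, 400, 500, 2200, 800],
})
df["Date"] = pd.to_datetime(df["Date"])

dates = pd.date_range("2018-12-15", "2018-12-31")

# Without fill_value, dates that had no row are filled with NaN
filled = df.set_index("Date").reindex(dates)

# Rows with no Price are the dates absent from the original data
absent = filled[filled["Price"].isna()]
```

The rest of the walkthrough works the same way if the final filter tests for NaN instead of `== 0.0`.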

Then reset the index to make it easier to work with:

df = df.reset_index()

It now looks like this:

        index  Price  Volume
0  2018-12-15    0.0     0.0
1  2018-12-16    0.0     0.0
2  2018-12-17    0.0     0.0
3  2018-12-18    0.0     0.0
4  2018-12-19    0.0     0.0
5  2018-12-20    0.0     0.0
6  2018-12-21  172.8   800.0
7  2018-12-22    0.0     0.0
8  2018-12-23    0.0     0.0
9  2018-12-24    0.0     0.0
10 2018-12-25  171.0  2200.0
11 2018-12-26  170.4   500.0
12 2018-12-27  173.6   400.0
13 2018-12-28  172.0   800.0
14 2018-12-29    0.0     0.0
15 2018-12-30    0.0     0.0
16 2018-12-31    0.0     0.0

Then compute the day of the week for each date:

df['weekday'] = df['index'].dt.dayofweek
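For reference, `dt.dayofweek` numbers the days from Monday = 0 to Sunday = 6, so 5 and 6 are Saturday and Sunday. A quick check on three dates from the sample range (the name `sample` is only for illustration):

```python
import pandas as pd

# 21 Dec 2018 is a Friday, 22 Dec a Saturday, 24 Dec a Monday
sample = pd.Series(pd.to_datetime(["2018-12-21", "2018-12-22", "2018-12-24"]))

print(sample.dt.dayofweek.tolist())  # [4, 5, 0]
```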

Finally, the weekdays missing from your time range are:

missing_weekdays = df[(~df['weekday'].isin([5,6])) & (df['Volume'] == 0.0)]

Result:

>>> missing_weekdays
        index  Price  Volume  weekday
2  2018-12-17    0.0     0.0        0
3  2018-12-18    0.0     0.0        1
4  2018-12-19    0.0     0.0        2
5  2018-12-20    0.0     0.0        3
9  2018-12-24    0.0     0.0        0
16 2018-12-31    0.0     0.0        0
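To get both the count and the dates directly, a shorter route may be to build the expected business days and subtract the dates that are present. This is a minimal sketch, assuming the sample data above; the names `present`, `expected` and `missing` are illustrative. Note that `pd.bdate_range` only skips weekends, so public holidays are still counted as expected days unless a holiday calendar is supplied.

```python
import pandas as pd

# Dates that actually appear in the data
present = pd.to_datetime(
    ["2018-12-28", "2018-12-27", "2018-12-26", "2018-12-25", "2018-12-21"]
)

# All Monday-Friday dates in the range
# (for the real data, use '2015-01-05' and '2018-12-28')
expected = pd.bdate_range("2018-12-15", "2018-12-31")

# Business days that have no row in the data
missing = expected.difference(present)

print(len(missing))                           # how many weekdays are missing
print(missing.strftime("%Y-%m-%d").tolist())  # which dates they are
```

On the sample this gives the same six dates as the result above. If estimates are needed for those dates, one option is to reindex the date-indexed frame on `expected` and then call `interpolate()`.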

Regarding "python - Check whether my time-series indexed data has missing values for weekdays", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60034743/
