Current df:
Date Power
2011-04-18 17:00:00 243.56
2011-04-18 17:00:01 245.83
2011-04-18 17:00:02 246.02
2011-04-18 17:00:03 245.72
2011-04-18 17:00:04 244.71
2011-04-18 17:00:05 245.93
2011-04-18 17:00:06 243.12
2011-04-18 17:00:07 244.72
2011-04-18 17:00:08 242.44
2011-04-18 17:00:09 246.42
2011-04-18 17:00:10 245.02
... ...
I have a df with dates and floats. The dates are the index and are unique. I want to create a new df based on the dates found in the following df.
date start date end
0 2011-04-18 17:00:01 2011-04-18 17:00:02
1 2011-04-18 17:00:05 2011-04-18 17:00:06
2 2011-04-18 17:00:08 2011-04-18 17:00:10
... ... ...
I would like to get:
Date Power
2011-04-18 17:00:01 245.83
2011-04-18 17:00:02 246.02
2011-04-18 17:00:05 245.93
2011-04-18 17:00:06 243.12
2011-04-18 17:00:08 242.44
2011-04-18 17:00:09 246.42
2011-04-18 17:00:10 245.02
... ...
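In other words, I want to filter the initial df and keep every row that falls between any start/end pair in the second df.
I thought of using pandas.DataFrame.between_time, but the problem is that it only works for one given start and end date. How can I do this for many different date ranges?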
Best answer
Use np.logical_or.reduce with a list comprehension (here Date is a regular column of df1):
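import numpy as np
# One boolean mask per (start, end) pair; between() includes both endpoints
L = [df1['Date'].between(s, e) for s, e in df2[['date start','date end']].to_numpy()]
# OR the masks together: a row is kept if it lies inside any interval
df = df1[np.logical_or.reduce(L)]
print (df)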
Date Power
1 2011-04-18 17:00:01 245.83
2 2011-04-18 17:00:02 246.02
5 2011-04-18 17:00:05 245.93
6 2011-04-18 17:00:06 243.12
8 2011-04-18 17:00:08 242.44
9 2011-04-18 17:00:09 246.42
10 2011-04-18 17:00:10 245.02
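For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. It rebuilds df1 and df2 from the sample rows shown in the question (only the visible rows, not the elided ones) and assumes Date is an ordinary column, as in the printed output. The names `masks`, `mask` and `result` are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; 'Date' is kept as an ordinary column.
start = pd.to_datetime("2011-04-18 17:00:00")
df1 = pd.DataFrame({
    "Date": start + pd.to_timedelta(np.arange(11), unit="s"),
    "Power": [243.56, 245.83, 246.02, 245.72, 244.71, 245.93,
              243.12, 244.72, 242.44, 246.42, 245.02],
})
df2 = pd.DataFrame({
    "date start": pd.to_datetime(["2011-04-18 17:00:01",
                                  "2011-04-18 17:00:05",
                                  "2011-04-18 17:00:08"]),
    "date end": pd.to_datetime(["2011-04-18 17:00:02",
                                "2011-04-18 17:00:06",
                                "2011-04-18 17:00:10"]),
})

# One boolean Series per interval; between() is inclusive on both ends.
masks = [df1["Date"].between(s, e)
         for s, e in zip(df2["date start"], df2["date end"])]

# Element-wise OR across all masks gives a single boolean array.
mask = np.logical_or.reduce(masks)
result = df1[mask]
print(result)
```

Because the masks are combined with OR, a row that falls inside several intervals is still returned only once.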
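If Date is a DatetimeIndex, you can slice each interval by label and concatenate the pieces:
import pandas as pd
# Label-based slicing on a DatetimeIndex includes both endpoints
L = [df1[s:e] for s, e in df2[['date start','date end']].to_numpy()]
df = pd.concat(L)
print (df)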
Power
Date
2011-04-18 17:00:01 245.83
2011-04-18 17:00:02 246.02
2011-04-18 17:00:05 245.93
2011-04-18 17:00:06 243.12
2011-04-18 17:00:08 242.44
2011-04-18 17:00:09 246.42
2011-04-18 17:00:10 245.02
Or build the boolean masks directly from the index:
L = [(df1.index >= s) & (df1.index <= e)
     for s, e in df2[['date start','date end']].to_numpy()]
df = df1[np.logical_or.reduce(L)]
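The two DatetimeIndex variants give the same rows for the sample data, but they can differ when intervals overlap. The sketch below illustrates this with a hypothetical pair of overlapping intervals (not from the question). It assumes a sorted DatetimeIndex and uses `.loc` for the label slices; `sliced` and `masked` are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, this time with 'Date' as a DatetimeIndex.
start = pd.to_datetime("2011-04-18 17:00:00")
df1 = pd.DataFrame(
    {"Power": [243.56, 245.83, 246.02, 245.72, 244.71, 245.93,
               243.12, 244.72, 242.44, 246.42, 245.02]},
    index=start + pd.to_timedelta(np.arange(11), unit="s"),
)
df1.index.name = "Date"

# Hypothetical overlapping intervals: 17:00:01-17:00:03 and 17:00:02-17:00:04.
df2 = pd.DataFrame({
    "date start": pd.to_datetime(["2011-04-18 17:00:01", "2011-04-18 17:00:02"]),
    "date end": pd.to_datetime(["2011-04-18 17:00:03", "2011-04-18 17:00:04"]),
})
pairs = list(zip(df2["date start"], df2["date end"]))

# Variant 1: slice per interval, then concatenate.
# Rows inside both intervals appear twice.
sliced = pd.concat([df1.loc[s:e] for s, e in pairs])

# Variant 2: OR the per-interval index masks.
# Each matching row appears exactly once, in original order.
masked = df1[np.logical_or.reduce(
    [(df1.index >= s) & (df1.index <= e) for s, e in pairs]
)]

print(len(sliced), len(masked))
```

If the intervals in df2 might overlap, the mask variant is probably the safer choice; otherwise duplicates from the concat variant would need to be dropped afterwards.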
Regarding python - Pandas select DataFrame rows between multiple datetimes, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69766227/