python - Slicing a pandas Series with a PeriodIndex

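I have several pandas Series whose PeriodIndex frequencies differ. I want to filter them by another PeriodIndex whose frequency is, in principle, unknown (in the example below it is specified directly as selectionA or selectionB, but in practice it is taken from another series).

I have found three approaches, each with its own drawbacks, as the example below shows. Is there a better way?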

import numpy as np
import pandas as pd

y = pd.Series(np.random.random(4),  index=pd.period_range('2018', '2021', freq='A'), name='speed')
q = pd.Series(np.random.random(16), index=pd.period_range('2018Q1', '2021Q4', freq='Q'), name='speed')
m = pd.Series(np.random.random(48), index=pd.period_range('2018-01', '2021-12', freq='M'), name='speed')

selectionA = pd.period_range('2018Q3', '2020Q2', freq='Q') #subset of y, q, and m
selectionB = pd.period_range('2014Q3', '2015Q2', freq='Q') #not subset of y, q, and m

#Comparing some options: 
#1: filter method
#2: slicing
#3: selection based on boolean comparison

#1: problem when frequencies unequal: always returns empty series
yA_1 = y.filter(selectionA, axis=0) #Fail: empty series
qA_1 = q.filter(selectionA, axis=0) 
mA_1 = m.filter(selectionA, axis=0) #Fail: empty series
yB_1 = y.filter(selectionB, axis=0) 
qB_1 = q.filter(selectionB, axis=0) 
mB_1 = m.filter(selectionB, axis=0)

#2: problem when frequencies unequal: wrong selection and error instead of empty result
yA_2 = y[selectionA[0]:selectionA[-1]]  
qA_2 = q[selectionA[0]:selectionA[-1]] 
mA_2 = m[selectionA[0]:selectionA[-1]] #Fail: selects 22 months instead of 24
yB_2 = y[selectionB[0]:selectionB[-1]] #Fail: error
qB_2 = q[selectionB[0]:selectionB[-1]] 
mB_2 = m[selectionB[0]:selectionB[-1]] #Fail: error

#3: works, but very verbose
yA_3 = y[(y.index >= selectionA[0].start_time) & (y.index <= selectionA[-1].end_time)]
qA_3 = q[(q.index >= selectionA[0].start_time) & (q.index <= selectionA[-1].end_time)]
mA_3 = m[(m.index >= selectionA[0].start_time) & (m.index <= selectionA[-1].end_time)]
yB_3 = y[(y.index >= selectionB[0].start_time) & (y.index <= selectionB[-1].end_time)]
qB_3 = q[(q.index >= selectionB[0].start_time) & (q.index <= selectionB[-1].end_time)]
mB_3 = m[(m.index >= selectionB[0].start_time) & (m.index <= selectionB[-1].end_time)]

Many thanks.

Best answer

I solved this by using start_time and end_time as the bounds of the slice:

yA_2fixed = y[selectionA[0].start_time: selectionA[-1].end_time]
qA_2fixed = q[selectionA[0].start_time: selectionA[-1].end_time] 
mA_2fixed = m[selectionA[0].start_time: selectionA[-1].end_time] #now has 24 rows
yB_2fixed = y[selectionB[0].start_time: selectionB[-1].end_time] #doesn't fail; returns empty series
qB_2fixed = q[selectionB[0].start_time: selectionB[-1].end_time] 
mB_2fixed = m[selectionB[0].start_time: selectionB[-1].end_time] #doesn't fail; returns empty series
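
For reference, a small sketch of what the two bounds used above evaluate to. The names lower and upper are only illustrative and do not appear in the original code.

```python
import pandas as pd

selectionA = pd.period_range('2018Q3', '2020Q2', freq='Q')

# start_time is the first instant of a period, end_time the last instant,
# so together they span the whole selection regardless of its frequency.
lower = selectionA[0].start_time   # Timestamp at the start of 2018Q3
upper = selectionA[-1].end_time    # Timestamp at the very end of 2020Q2

print(lower)
print(upper)
```

Because both bounds are plain Timestamp objects, they carry no frequency of their own, which is why the same slice expression can be applied to y, q, and m.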

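But if there is a more concise way to write this, I am all ears. In particular, I would like to know whether this filtering can be done in a way that is more "native" to PeriodIndex, i.e. without first converting to datetime instances via the start_time and end_time attributes.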
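
One possible sketch of a period-only variant, not taken from the original thread: convert the first and last period of the selection to the frequency of the series with Period.asfreq, then compare periods with periods. The helper name select_periods is hypothetical, the data uses np.arange instead of random numbers so the result is reproducible, and the annual alias 'Y' is an assumption standing in for the 'A' used in the question.

```python
import numpy as np
import pandas as pd

def select_periods(s, selection):
    """Keep the rows of `s` whose period lies between the first and last
    period of `selection`, after converting both bounds to s's frequency.
    (Hypothetical helper, shown only as an illustration.)"""
    freq = s.index.freq
    first = selection[0].asfreq(freq, how='start')
    last = selection[-1].asfreq(freq, how='end')
    # Period-to-period comparison at a common frequency; no Timestamps involved.
    return s[(s.index >= first) & (s.index <= last)]

y = pd.Series(np.arange(4.0), index=pd.period_range('2018', '2021', freq='Y'), name='speed')
q = pd.Series(np.arange(16.0), index=pd.period_range('2018Q1', '2021Q4', freq='Q'), name='speed')
m = pd.Series(np.arange(48.0), index=pd.period_range('2018-01', '2021-12', freq='M'), name='speed')

selectionA = pd.period_range('2018Q3', '2020Q2', freq='Q')  # overlaps y, q, and m
selectionB = pd.period_range('2014Q3', '2015Q2', freq='Q')  # outside y, q, and m

for s in (y, q, m):
    print(len(select_periods(s, selectionA)), len(select_periods(s, selectionB)))
```

Note that when the series is coarser than the selection (here the annual series y), this keeps every period that overlaps the selection, not only periods fully contained in it.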

This question, "python - Slicing a pandas Series with a PeriodIndex", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/57273790/
