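Sorry about the title, everyone, but that really is what I'm trying to do.
Here is a table to explain further. Thick lines mark years and thin lines mark weeks.
As for the expected output, its format doesn't really matter. All I need is that when I ask for a YEAR/WEEK pair,
I get the corresponding window of dates.
For example, if I run some_window_function(2022, 5)
I should get the result below (it corresponds to the RED WINDOW):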
           DATE
YEAR WEEK
2020 30    Friday, July 24, 2020
2022 5     Wednesday, February 2, 2022
     5     Thursday, February 3, 2022
     5     Friday, February 4, 2022
     7     Tuesday, February 15, 2022
Likewise, if I run some_window_function(2022, 7)
I should get the result below (it corresponds to the BLUE WINDOW):
           DATE
YEAR WEEK
2022 5     Friday, February 4, 2022
2022 7     Tuesday, February 15, 2022
     7     Wednesday, February 16, 2022
     7     Thursday, February 17, 2022
2023 44    Tuesday, October 31, 2023
The dataframe I'm using looks like this:
import pandas as pd

df = pd.DataFrame({
    'YEAR': [2020, 2020, 2020, 2020, 2020, 2020, 2020, 2022, 2022, 2022, 2022, 2022, 2022, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023],
    'WEEK': [29, 29, 29, 30, 30, 30, 30, 5, 5, 5, 7, 7, 7, 44, 44, 44, 44, 45, 45, 45, 46, 46, 46, 46],
    'DATE': ['Monday, July 13, 2020', 'Thursday, July 16, 2020', 'Friday, July 17, 2020', 'Monday, July 20, 2020', 'Tuesday, July 21, 2020', 'Thursday, July 23, 2020', 'Friday, July 24, 2020', 'Wednesday, February 2, 2022', 'Thursday, February 3, 2022', 'Friday, February 4, 2022', 'Tuesday, February 15, 2022', 'Wednesday, February 16, 2022', 'Thursday, February 17, 2022', 'Tuesday, October 31, 2023', 'Wednesday, November 02, 2023', 'Friday, November 03, 2023', 'Sunday, November 05, 2023', 'Monday, November 06, 2023', 'Tuesday, November 07, 2023', 'Wednesday, November 08, 2023', 'Monday, November 13, 2023', 'Tuesday, November 14, 2023', 'Wednesday, November 15, 2023', 'Thursday, November 16, 2023'],
})
I wrote the code below, but it returns a dataframe that looks much like my input:
def make_windows(group):
    if group.name == df.loc[df['YEAR'] == group.name, 'WEEK'].min():
        group.at[group.index[-1]+1, 'DATE'] = df.at[group.index[-1]+1, 'DATE']
        return group.ffill()
    elif group.name < df.loc[df['YEAR']== group.name, 'WEEK'].max():
        group.at[group.index[-1]+1, 'DATE'] = df.at[group.index[-1]+1, 'DATE']
        return group.iloc[1:].ffill()
    else:
        return group.iloc[1:].ffill()

results = df.groupby('YEAR').apply(make_windows)
Best answer
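It looks like you can build a simple boolean mask on the YEAR/WEEK pair and widen it by one row above and one row below (assuming the dates are already in sorted order):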
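df = df.sort_values(by=['YEAR', 'WEEK'])

def some_window_function(year, week):
    # True for every row of the requested YEAR/WEEK pair
    mask = df['YEAR'].eq(year) & df['WEEK'].eq(week)
    # widen by one row on each side; fill_value=False keeps the shifted masks boolean
    return df[mask | mask.shift(fill_value=False) | mask.shift(-1, fill_value=False)]

some_window_function(2022, 5)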
Output:
    YEAR  WEEK                         DATE
6   2020    30        Friday, July 24, 2020
7   2022     5  Wednesday, February 2, 2022
8   2022     5   Thursday, February 3, 2022
9   2022     5     Friday, February 4, 2022
10  2022     7   Tuesday, February 15, 2022
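The mask-and-shift idea relies on the rows being in date order. As a minimal sketch of one way to guarantee that, the variant below parses the DATE strings and sorts chronologically before applying the same mask. The function name window_for, the demo frame, and the choice to drop the weekday name before parsing are assumptions made for illustration, not part of the original answer; the sketch also assumes the index labels are unique.

```python
import pandas as pd

def window_for(frame: pd.DataFrame, year: int, week: int) -> pd.DataFrame:
    """Rows of (year, week) plus the nearest row on each side, in date order."""
    # Drop the weekday name ("Friday, ") so only "July 24, 2020" is parsed.
    parsed = pd.to_datetime(
        frame["DATE"].str.split(", ", n=1).str[1], format="%B %d, %Y"
    )
    # Reorder chronologically; a stable sort keeps equal dates in input order.
    ordered = frame.loc[parsed.sort_values(kind="stable").index]
    mask = ordered["YEAR"].eq(year) & ordered["WEEK"].eq(week)
    # Shift the mask one row each way; fill_value=False keeps the dtype boolean.
    before = mask.shift(-1, fill_value=False)
    after = mask.shift(1, fill_value=False)
    return ordered[mask | before | after]

# Hypothetical demo: a few rows from the question, deliberately out of order.
demo = pd.DataFrame({
    "YEAR": [2022, 2020, 2022, 2022, 2022],
    "WEEK": [7, 30, 5, 5, 5],
    "DATE": [
        "Tuesday, February 15, 2022",
        "Friday, July 24, 2020",
        "Wednesday, February 2, 2022",
        "Thursday, February 3, 2022",
        "Friday, February 4, 2022",
    ],
})
print(window_for(demo, 2022, 5))
```

Passing the frame as an argument, rather than reading a global df, keeps the function reusable. If the requested pair is the first or last group in the data, the window simply has no neighbour on that side.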
Source: the Stack Overflow question "python - How to make overlapping windows of weeks based on the nearest available dates on both sides?": https://stackoverflow.com/questions/77521686/