I have a dataset from a .csv file with the headers created_at, text, and label, as follows:
created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-01,OOO Wins the worldcup,Sport
2021-08-02,JJJ Wins the worldcup,Sport
2021-08-03,YYY Wins the worldcup,Sport
2021-08-04,KKK Wins the worldcup,Sport
2021-08-05,YYY Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
2021-08-08,SSS Wins the worldcup,Sport
2021-08-09,XYZ Wins the worldcup,Sport
2021-08-10,PQR Wins the worldcup,Sport
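How can I save these rows into week-based .csv files? For example, I want the first 7 days of the dataset above (2021-07-24 to 2021-07-30) saved to week1.csv, the next 7 days (2021-07-31 to 2021-08-06) saved to week2.csv, and so on.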
week1.csv
created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
Best answer
IIUC, you can compute the weekly period of each row and use groupby:
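import pandas as pd

# df is the DataFrame loaded from the .csv file (e.g. with pd.read_csv)
group = pd.to_datetime(df['created_at']).dt.to_period('W-FRI')
for i, (g, d) in enumerate(df.groupby(group), start=1):
    print(f'saving week {i}: {g}')
    d.to_csv(f'week{i}.csv')  # pass index=False to omit the row index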
NB. This uses weeks ending on Friday as the period.
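To make the Friday anchor concrete, here is a small sketch of how `to_period('W-FRI')` labels individual dates. The three dates are picked from the question's data for illustration; the first one, 2021-07-24, is a Saturday, so the week it falls in runs Saturday through Friday.

```python
import pandas as pd

# Three dates from the question: the first and last day of week 1,
# and the first day of week 2.
dates = pd.to_datetime(pd.Series(["2021-07-24", "2021-07-30", "2021-07-31"]))

# 'W-FRI' means weekly periods that end on Friday.
periods = dates.dt.to_period("W-FRI")

for day, period in zip(dates.dt.date, periods):
    print(day, "->", period)
```

The first two dates should map to the same period and the third to the next one, which is why grouping on these labels splits the rows at the Friday/Saturday boundary.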
To compute the anchor day programmatically from the first date, use:
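s = pd.to_datetime(df['created_at'])
# weekday of the day before the first date, e.g. 'FRI' when the data starts on a Saturday
dow = (s.iloc[0]-pd.Timedelta(days=1)).strftime("%a").upper()
group = s.dt.to_period(f'W-{dow}')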
Output:
saving week 1: 2021-07-24/2021-07-30
saving week 2: 2021-07-31/2021-08-06
saving week 3: 2021-08-07/2021-08-13
Files:
week1.csv
   created_at                         text  label
0  2021-07-24  Newzeland Wins the worldcup  Sport
1  2021-07-25        ABC Wins the worldcup  Sport
2  2021-07-26           Hello the worldcup  Sport
3  2021-07-27             Cricket worldcup  Sport
4  2021-07-28               Rugby worldcup  Sport
5  2021-07-29                     LLL Wins  Sport
6  2021-07-30        MMM Wins the worldcup  Sport
week2.csv
    created_at                   text  label
7   2021-07-31  RRR Wins the worldcup  Sport
8   2021-08-01  OOO Wins the worldcup  Sport
9   2021-08-02  JJJ Wins the worldcup  Sport
10  2021-08-03  YYY Wins the worldcup  Sport
11  2021-08-04  KKK Wins the worldcup  Sport
12  2021-08-05  YYY Wins the worldcup  Sport
13  2021-08-06  GGG Wins the worldcup  Sport
week3.csv
    created_at                   text  label
14  2021-08-07  FFF Wins the worldcup  Sport
15  2021-08-08  SSS Wins the worldcup  Sport
16  2021-08-09  XYZ Wins the worldcup  Sport
17  2021-08-10  PQR Wins the worldcup  Sport
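For completeness, the pieces above can be combined into one self-contained sketch. This is only an illustration under a few assumptions: the helper name `split_by_week`, the inlined `SAMPLE` data (a shortened copy of the question's rows), and the temporary output directory are all hypothetical, and the rows are assumed to be sorted so that the first row holds the earliest date. It also swaps `strftime("%a")` for `day_name()`, which returns English weekday names regardless of locale, and passes `index=False` so the files contain only the three original columns.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Shortened copy of the question's data, inlined so the sketch runs on its own.
SAMPLE = """created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
"""


def split_by_week(df, out_dir):
    """Write week1.csv, week2.csv, ... with weeks starting on the first row's weekday."""
    s = pd.to_datetime(df["created_at"])
    # Weekday of the day before the first date (e.g. 'FRI'); weeks end on that day.
    dow = (s.iloc[0] - pd.Timedelta(days=1)).day_name()[:3].upper()
    group = s.dt.to_period(f"W-{dow}")
    paths = []
    # Only weeks that contain rows produce a file, numbered consecutively.
    for i, (_, chunk) in enumerate(df.groupby(group), start=1):
        path = Path(out_dir) / f"week{i}.csv"
        chunk.to_csv(path, index=False)  # keep only created_at, text, label
        paths.append(path)
    return paths


df = pd.read_csv(io.StringIO(SAMPLE))
out_dir = tempfile.mkdtemp()  # hypothetical output location for the sketch
paths = split_by_week(df, out_dir)
for p in paths:
    print(p.name, len(pd.read_csv(p)), "rows")
```

Note that a week with no rows is skipped rather than written as an empty file, so file numbers count non-empty weeks, not calendar weeks.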
For more on python - how to save to .csv files by week, see the similar question on Stack Overflow: https://stackoverflow.com/questions/72085344/