python - How to save data to .csv files by week

Tags: python pandas dataframe algorithm

I have a dataset from a .csv file with the headers created_at, text, and label, as follows:

created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-01,OOO Wins the worldcup,Sport
2021-08-02,JJJ Wins the worldcup,Sport
2021-08-03,YYY Wins the worldcup,Sport
2021-08-04,KKK Wins the worldcup,Sport
2021-08-05,YYY Wins the worldcup,Sport
2021-08-06,GGG Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
2021-08-08,SSS Wins the worldcup,Sport
2021-08-09,XYZ Wins the worldcup,Sport
2021-08-10,PQR Wins the worldcup,Sport

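How can I save this data into week-based .csv files? For example, I want to save only the first 7 days of the dataset above (2021-07-24 to 2021-07-30) to week1.csv, the next week (2021-07-31 to 2021-08-05) to week2.csv, and so on.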

week1.csv

created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-25,ABC Wins the worldcup,Sport
2021-07-26,Hello the worldcup,Sport
2021-07-27,Cricket worldcup,Sport
2021-07-28,Rugby worldcup,Sport
2021-07-29,LLL Wins,Sport
2021-07-30,MMM Wins the worldcup,Sport

Best answer

IIUC, you can compute the week for each row and use groupby:

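import pandas as pd

df = pd.read_csv('data.csv')  # 'data.csv' is a placeholder name for the question's file
# Map each date to its weekly period; 'W-FRI' means weeks ending on Friday
group = pd.to_datetime(df['created_at']).dt.to_period('W-FRI')

# Write one file per weekly group, numbering the weeks from 1
for i, (g, d) in enumerate(df.groupby(group), start=1):
    print(f'saving week {i}: {g}')
    d.to_csv(f'week{i}.csv')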

Note: this uses weeks ending on Friday as the period, so each week runs from Saturday to Friday.
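To make the effect of the 'W-FRI' frequency concrete, here is a small sketch that maps three dates from the question to their weekly periods. The selection of dates is only for illustration; the first two should land in the same Saturday-to-Friday week and the third in the next one.

```python
import pandas as pd

# Three sample dates taken from the question's data
dates = pd.to_datetime(pd.Series(["2021-07-24", "2021-07-30", "2021-07-31"]))

# Each date is mapped to the Saturday-to-Friday week that contains it
periods = dates.dt.to_period("W-FRI")

for day, period in zip(dates, periods):
    print(day.date(), "->", period.start_time.date(), "to", period.end_time.date())
```

Because 2021-07-24 is a Saturday, a Friday-ending week starts exactly on the first date of the dataset, which is why this anchor fits the question.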

To derive the week-ending day programmatically from the first date in the data:

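s = pd.to_datetime(df['created_at'])
# The day before the first date is the day on which each week should end
dow = (s.iloc[0]-pd.Timedelta('1d')).strftime("%a")
# Build the weekly frequency from that day abbreviation, e.g. 'W-Fri'
group = s.dt.to_period(f'W-{dow}')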
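As an alternative to anchored weekly periods (this is a different technique from the one in the answer), the week number can be computed directly as the count of whole 7-day blocks since the earliest date. This avoids building a frequency string from `strftime("%a")`, whose result depends on the active locale. The sketch below uses a shortened, made-up `text` column and assumes the earliest date marks the start of week 1.

```python
import pandas as pd

df = pd.DataFrame({
    "created_at": ["2021-07-24", "2021-07-30", "2021-07-31", "2021-08-06", "2021-08-07"],
    "text": ["a", "b", "c", "d", "e"],  # placeholder values, not the question's text
})

s = pd.to_datetime(df["created_at"])

# Whole 7-day blocks elapsed since the earliest date, shifted to start at 1
week_no = (s - s.min()).dt.days // 7 + 1

for n, d in df.groupby(week_no):
    print(f"week {n}: {d['created_at'].iloc[0]} to {d['created_at'].iloc[-1]}")
```

One difference in behavior: here a week with no rows leaves a gap in the numbering, whereas `enumerate` over the groups numbers only the weeks that actually contain data.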

Output:

saving week 1: 2021-07-24/2021-07-30
saving week 2: 2021-07-31/2021-08-06
saving week 3: 2021-08-07/2021-08-13

Files:

week1.csv
   created_at                         text  label
0  2021-07-24  Newzeland Wins the worldcup  Sport
1  2021-07-25        ABC Wins the worldcup  Sport
2  2021-07-26           Hello the worldcup  Sport
3  2021-07-27             Cricket worldcup  Sport
4  2021-07-28               Rugby worldcup  Sport
5  2021-07-29                     LLL Wins  Sport
6  2021-07-30        MMM Wins the worldcup  Sport

week2.csv
    created_at                   text  label
7   2021-07-31  RRR Wins the worldcup  Sport
8   2021-08-01  OOO Wins the worldcup  Sport
9   2021-08-02  JJJ Wins the worldcup  Sport
10  2021-08-03  YYY Wins the worldcup  Sport
11  2021-08-04  KKK Wins the worldcup  Sport
12  2021-08-05  YYY Wins the worldcup  Sport
13  2021-08-06  GGG Wins the worldcup  Sport

week3.csv
    created_at                   text  label
14  2021-08-07  FFF Wins the worldcup  Sport
15  2021-08-08  SSS Wins the worldcup  Sport
16  2021-08-09  XYZ Wins the worldcup  Sport
17  2021-08-10  PQR Wins the worldcup  Sport
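The listings above keep the original row index (0, 1, 2, ...), and `to_csv` writes that index as an unnamed first column by default. To get files that look like the week1.csv requested in the question, a minimal sketch is to pass `index=False`. The data here is a shortened copy of the question's rows, and the temporary output directory is an assumption made only to keep the example self-contained.

```python
import io
import os
import tempfile

import pandas as pd

# A shortened copy of the question's data, inlined so the sketch runs on its own
csv_text = """created_at,text,label
2021-07-24,Newzeland Wins the worldcup,Sport
2021-07-30,MMM Wins the worldcup,Sport
2021-07-31,RRR Wins the worldcup,Sport
2021-08-07,FFF Wins the worldcup,Sport
"""
df = pd.read_csv(io.StringIO(csv_text))

# Same grouping as in the answer: weeks ending on Friday
group = pd.to_datetime(df["created_at"]).dt.to_period("W-FRI")

out_dir = tempfile.mkdtemp()  # hypothetical output location
written = []
for i, (_, d) in enumerate(df.groupby(group), start=1):
    path = os.path.join(out_dir, f"week{i}.csv")
    d.to_csv(path, index=False)  # index=False omits the row-index column
    written.append(path)

# Show the first file to confirm it contains only the three original columns
with open(written[0]) as fh:
    print(fh.read())
```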

This question, python - How to save data to .csv files by week, comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/72085344/
