我有一个包含两列的数据框; 销售
和日期
。
dataset.head(10)
Date Sales
0 2015-01-02 34988.0
1 2015-01-03 32809.0
2 2015-01-05 9802.0
3 2015-01-06 15124.0
4 2015-01-07 13553.0
5 2015-01-08 14574.0
6 2015-01-09 20836.0
7 2015-01-10 28825.0
8 2015-01-12 6938.0
9 2015-01-13 11790.0
我想将 Date
列从 yyyy-mm-dd
(例如 2015-06-01
)转换为 yyyy -ww
(例如 2015-23
),所以我运行以下代码:
dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%V')
然后我根据周数按我的销售额
进行分组,即
data = dataset.groupby(['Date'])["Sales"].sum().reset_index()
data.head(10)
Date Sales
0 2015-01 67797.0
1 2015-02 102714.0
2 2015-03 107011.0
3 2015-04 121480.0
4 2015-05 148098.0
5 2015-06 132152.0
6 2015-07 133914.0
7 2015-08 136160.0
8 2015-09 185471.0
9 2015-10 190793.0
现在我想根据 Date
列创建一个日期范围,因为我根据周预测销售额:
ds = data.Date.values
ds_pred = pd.date_range(start=ds.min(), periods=len(ds) + num_pred_weeks,
freq="W")
但是我收到以下错误:无法将字符串转换为时间戳
,我不太确定如何修复。因此,如果我使用 2015-01-01
作为日期导入的开始日期,我不会收到错误,这让我意识到我使用的函数是错误的。但是,我不确定如何?
我希望基本上有一个日期范围,从本周开始一直到 future 52 周。
最佳答案
我认为问题是想要创建最少的 dataset["Date"]
列,并用 YYYY-VV
格式的字符串填充。但要传递到 date_range
需要格式 YYYY-MM-DD
或日期时间对象。
我找到了this :
Several additional directives not required by the C89 standard are included for convenience. These parameters all correspond to ISO 8601 date values. These may not be available on all platforms when used with the strftime() method. The ISO 8601 year and ISO 8601 week directives are not interchangeable with the year and week number directives above. Calling strptime() with incomplete or ambiguous ISO 8601 directives will raise a ValueError.
%V ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4.
Pandas 0.24.2 错误,格式为 YYYY-VV
:
dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02']})
dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%V')
print (dataset)
Date
0 2015-23
1 2015-23
ds = pd.to_datetime(dataset['Date'], format='%Y-%V')
print (ds)
ValueError: 'V' is a bad directive in format '%Y-%V'
可能的解决方案是使用 %U
或 %W,检查 this :
%U Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0.
%W Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0.
dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02']})
dataset["Date"] = pd.to_datetime(dataset["Date"]).dt.strftime('%Y-%U')
print (dataset)
Date
0 2015-22
1 2015-22
ds = pd.to_datetime(dataset['Date'] + '-1', format='%Y-%U-%w')
print (ds)
0 2015-06-01
1 2015-06-01
Name: Date, dtype: datetime64[ns]
或者在日期时间中使用原始 DataFrame 中的数据:
dataset = pd.DataFrame({'Date':['2015-06-01','2015-06-02'],
'Sales':[10,20]})
dataset["Date"] = pd.to_datetime(dataset["Date"])
print (dataset)
Date Sales
0 2015-06-01 10
1 2015-06-02 20
data = dataset.groupby(dataset['Date'].dt.strftime('%Y-%V'))["Sales"].sum().reset_index()
print (data)
Date Sales
0 2015-23 30
num_pred_weeks = 5
ds = data.Date.values
ds_pred = pd.date_range(start=dataset["Date"].min(), periods=len(ds) + num_pred_weeks, freq="W")
print (ds_pred)
DatetimeIndex(['2015-06-07', '2015-06-14', '2015-06-21',
'2015-06-28',
'2015-07-05', '2015-07-12'],
dtype='datetime64[ns]', freq='W-SUN')
关于python - Pandas 日期范围对于 yyyy-w 返回 "could not convert string to Timestamp",我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/56326386/