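My Postgres database stores dates in this format: 2019-05-22 18:01:38.425533+00. I need to use this date in my regression model, so I tried converting it with df['created'] = pd.to_datetime(df.created). Did I choose the right format for my data? When I plot the data, the resulting image shows values between 0 and 200, which does not seem right.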
import pandas as pd
import matplotlib.pyplot as plt

# Load data
def load_event_data():
    df = pd.read_csv('event_data.csv')
    df['created'] = pd.to_datetime(df.created)
    return df

event_data = load_event_data()
print("The defined index is", event_data.index.name)

# Visualize data
plt.figure(figsize=(15, 6))
plt.plot(event_data.index, event_data.tickets_sold_sum)
plt.xlabel("Date")
plt.ylabel("Rentals")
Here is some sample data: https://docs.google.com/spreadsheets/d/1cJAcamytX4zmQBpbQZYIi-HK5T0JlAJ5Dx3b1D6adxQ/edit?usp=sharing
Best answer
Here is what I tried:
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.read_csv("d.csv")
>>> df
                          created  event_id  tickets_sold  tickets_sold_sum
0   2019-05-22 18:01:38.425533+00         1            90                90
1   2019-05-22 18:02:17.867726+00         1            40               130
2    2019-05-22 18:02:32.44182+00         1            13               143
3   2019-05-22 18:03:07.093599+00         1             0               143
4   2019-05-22 18:03:22.857492+00         1            10               153
5   2019-05-22 18:04:07.453356+00         1            14               167
6   2019-05-22 18:04:24.382271+00         1            14               181
7   2019-05-22 18:04:34.670751+00         1             7               188
8   2019-05-22 18:05:04.781586+00         1            10               198
9   2019-05-22 18:05:28.475102+00         1             2               200
10  2019-05-22 18:05:41.469483+00         1             0               200
11  2019-05-22 18:06:04.184309+00         1            19               219
12  2019-05-22 18:06:07.344332+00         1            18               237
13  2019-05-22 18:06:21.596053+00         1             9               246
14  2019-05-22 18:06:29.980078+00         1            20               266
15   2019-05-22 18:06:36.33118+00         1            11               277
16  2019-05-22 18:06:46.557717+00         1            15               292
17  2019-05-22 18:06:50.681479+00         1            10               302
18  2019-05-22 18:07:07.288164+00         1            17               319
19  2019-05-22 18:07:12.296925+00         1            11               330
20  2019-05-22 18:07:42.836565+00         1             5               335
21  2019-05-22 18:07:56.903366+00         1            17               352
22  2019-05-22 18:09:03.798696+00         1            13               365
23  2019-05-22 18:09:20.485152+00         1             9               374
24  2019-05-22 18:10:22.913068+00         1            14               388
25  2019-05-22 18:10:30.922313+00         1             9               397
26  2019-05-22 18:11:36.149465+00         1            12               409
27   2019-05-22 18:11:45.23962+00         1            13               422
28  2019-05-22 18:11:48.826544+00         1             4               426
>>> df.set_index("created",inplace=True)
>>> df
                               event_id  tickets_sold  tickets_sold_sum
created
2019-05-22 18:01:38.425533+00         1            90                90
2019-05-22 18:02:17.867726+00         1            40               130
2019-05-22 18:02:32.44182+00          1            13               143
2019-05-22 18:03:07.093599+00         1             0               143
2019-05-22 18:03:22.857492+00         1            10               153
2019-05-22 18:04:07.453356+00         1            14               167
2019-05-22 18:04:24.382271+00         1            14               181
2019-05-22 18:04:34.670751+00         1             7               188
2019-05-22 18:05:04.781586+00         1            10               198
2019-05-22 18:05:28.475102+00         1             2               200
2019-05-22 18:05:41.469483+00         1             0               200
2019-05-22 18:06:04.184309+00         1            19               219
2019-05-22 18:06:07.344332+00         1            18               237
2019-05-22 18:06:21.596053+00         1             9               246
2019-05-22 18:06:29.980078+00         1            20               266
2019-05-22 18:06:36.33118+00          1            11               277
2019-05-22 18:06:46.557717+00         1            15               292
2019-05-22 18:06:50.681479+00         1            10               302
2019-05-22 18:07:07.288164+00         1            17               319
2019-05-22 18:07:12.296925+00         1            11               330
2019-05-22 18:07:42.836565+00         1             5               335
2019-05-22 18:07:56.903366+00         1            17               352
2019-05-22 18:09:03.798696+00         1            13               365
2019-05-22 18:09:20.485152+00         1             9               374
2019-05-22 18:10:22.913068+00         1            14               388
2019-05-22 18:10:30.922313+00         1             9               397
2019-05-22 18:11:36.149465+00         1            12               409
2019-05-22 18:11:45.23962+00          1            13               422
2019-05-22 18:11:48.826544+00         1             4               426
>>> plt.figure(figsize=(15, 6))
<Figure size 1500x600 with 0 Axes>
>>> plt.plot(df.index[:10], df.tickets_sold_sum[:10])
[<matplotlib.lines.Line2D object at 0x0000022C7FBF5898>]
>>> plt.xlabel("Date")
Text(0.5,0,'Date')
>>> plt.ylabel("Rentals")
Text(0,0.5,'Rentals')
>>> plt.show()
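Note that the session above uses the raw strings from the CSV as the index. If a real time axis is wanted, one possible variation is to parse the column with pd.to_datetime before calling set_index. The sketch below is only an illustration under some assumptions: the rows are a few lines copied from the sample output and inlined through io.StringIO instead of reading d.csv, utc=True is my choice for handling the +00 offset, and the elapsed_seconds column is a hypothetical name for one way to turn the timestamps into a numeric regression feature.

```python
import io

import pandas as pd

# A few rows copied from the sample output above, inlined as CSV so the
# sketch runs without the original file.
SAMPLE_CSV = """created,event_id,tickets_sold,tickets_sold_sum
2019-05-22 18:01:38.425533+00,1,90,90
2019-05-22 18:02:17.867726+00,1,40,130
2019-05-22 18:02:32.44182+00,1,13,143
2019-05-22 18:03:07.093599+00,1,0,143
"""


def load_event_data(source):
    """Read the CSV, parse 'created' as UTC timestamps, and use it as the index."""
    df = pd.read_csv(source)
    # utc=True turns the "+00" offset into a timezone-aware UTC timestamp.
    df["created"] = pd.to_datetime(df["created"], utc=True)
    return df.set_index("created").sort_index()


event_data = load_event_data(io.StringIO(SAMPLE_CSV))
print("The defined index is", event_data.index.name)

# Hypothetical numeric feature for a regression model:
# seconds elapsed since the first recorded sale.
event_data["elapsed_seconds"] = (
    event_data.index - event_data.index[0]
).total_seconds()
print(event_data[["tickets_sold_sum", "elapsed_seconds"]])
```

With the index parsed this way, plt.plot(event_data.index, event_data.tickets_sold_sum) puts timestamps on the x axis instead of row positions or string labels.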
On the topic of python - converting python datetime.datetime, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56473581/