python - Convert Python datetime.datetime

Tags: python pandas machine-learning

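My Postgres database stores dates in this format: 2019-05-22 18:01:38.425533+00. I have to use that date in my regression model, so I tried converting it with df['created'] = pd.to_datetime(df.created). Did I choose the right format to handle my data? When I plot the data, the plot shows values between 0 and 200, which does not seem right.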

import pandas as pd
import matplotlib.pyplot as plt

# Load data
def load_event_data():
    df = pd.read_csv('event_data.csv')
    df['created'] = pd.to_datetime(df.created)
    return df

event_data = load_event_data()
print("The defined index is", event_data.index.name)

# Visualize data
plt.figure(figsize=(15, 6))
plt.plot(event_data.index, event_data.tickets_sold_sum)
plt.xlabel("Date")
plt.ylabel("Rentals")
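
As a quick check of the conversion used in load_event_data, the sketch below parses three timestamps copied from the sample rows and prints the resulting dtype. Passing utc=True is an assumption (the question calls pd.to_datetime without it), and elapsed_s is a hypothetical feature that illustrates one way the date might become a numeric regression input; it is not something the original post does.

```python
import pandas as pd

# Three timestamps copied from the sample data, in the Postgres text format.
raw = pd.Series([
    "2019-05-22 18:01:38.425533+00",
    "2019-05-22 18:02:17.867726+00",
    "2019-05-22 18:02:32.44182+00",
])

# Assumption: utc=True, so the "+00" offset ends up as a timezone-aware UTC column.
created = pd.to_datetime(raw, utc=True)
print(created.dtype)

# Hypothetical numeric feature: seconds elapsed since the first event.
elapsed_s = (created - created.iloc[0]).dt.total_seconds()
print(elapsed_s.tolist())
```

If the printed dtype is a datetime64 type rather than object, the strings were parsed as timestamps and not left as text.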

Here is some sample data: https://docs.google.com/spreadsheets/d/1cJAcamytX4zmQBpbQZYIi-HK5T0JlAJ5Dx3b1D6adxQ/edit?usp=sharing


Best answer

Here is what I tried:

>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.read_csv("d.csv")
>>> df
                          created  event_id  tickets_sold  tickets_sold_sum
0   2019-05-22 18:01:38.425533+00         1            90                90
1   2019-05-22 18:02:17.867726+00         1            40               130
2    2019-05-22 18:02:32.44182+00         1            13               143
3   2019-05-22 18:03:07.093599+00         1             0               143
4   2019-05-22 18:03:22.857492+00         1            10               153
5   2019-05-22 18:04:07.453356+00         1            14               167
6   2019-05-22 18:04:24.382271+00         1            14               181
7   2019-05-22 18:04:34.670751+00         1             7               188
8   2019-05-22 18:05:04.781586+00         1            10               198
9   2019-05-22 18:05:28.475102+00         1             2               200
10  2019-05-22 18:05:41.469483+00         1             0               200
11  2019-05-22 18:06:04.184309+00         1            19               219
12  2019-05-22 18:06:07.344332+00         1            18               237
13  2019-05-22 18:06:21.596053+00         1             9               246
14  2019-05-22 18:06:29.980078+00         1            20               266
15   2019-05-22 18:06:36.33118+00         1            11               277
16  2019-05-22 18:06:46.557717+00         1            15               292
17  2019-05-22 18:06:50.681479+00         1            10               302
18  2019-05-22 18:07:07.288164+00         1            17               319
19  2019-05-22 18:07:12.296925+00         1            11               330
20  2019-05-22 18:07:42.836565+00         1             5               335
21  2019-05-22 18:07:56.903366+00         1            17               352
22  2019-05-22 18:09:03.798696+00         1            13               365
23  2019-05-22 18:09:20.485152+00         1             9               374
24  2019-05-22 18:10:22.913068+00         1            14               388
25  2019-05-22 18:10:30.922313+00         1             9               397
26  2019-05-22 18:11:36.149465+00         1            12               409
27   2019-05-22 18:11:45.23962+00         1            13               422
28  2019-05-22 18:11:48.826544+00         1             4               426
>>> df.set_index("created",inplace=True)
>>> df
                               event_id  tickets_sold  tickets_sold_sum
created
2019-05-22 18:01:38.425533+00         1            90                90
2019-05-22 18:02:17.867726+00         1            40               130
2019-05-22 18:02:32.44182+00          1            13               143
2019-05-22 18:03:07.093599+00         1             0               143
2019-05-22 18:03:22.857492+00         1            10               153
2019-05-22 18:04:07.453356+00         1            14               167
2019-05-22 18:04:24.382271+00         1            14               181
2019-05-22 18:04:34.670751+00         1             7               188
2019-05-22 18:05:04.781586+00         1            10               198
2019-05-22 18:05:28.475102+00         1             2               200
2019-05-22 18:05:41.469483+00         1             0               200
2019-05-22 18:06:04.184309+00         1            19               219
2019-05-22 18:06:07.344332+00         1            18               237
2019-05-22 18:06:21.596053+00         1             9               246
2019-05-22 18:06:29.980078+00         1            20               266
2019-05-22 18:06:36.33118+00          1            11               277
2019-05-22 18:06:46.557717+00         1            15               292
2019-05-22 18:06:50.681479+00         1            10               302
2019-05-22 18:07:07.288164+00         1            17               319
2019-05-22 18:07:12.296925+00         1            11               330
2019-05-22 18:07:42.836565+00         1             5               335
2019-05-22 18:07:56.903366+00         1            17               352
2019-05-22 18:09:03.798696+00         1            13               365
2019-05-22 18:09:20.485152+00         1             9               374
2019-05-22 18:10:22.913068+00         1            14               388
2019-05-22 18:10:30.922313+00         1             9               397
2019-05-22 18:11:36.149465+00         1            12               409
2019-05-22 18:11:45.23962+00          1            13               422
2019-05-22 18:11:48.826544+00         1             4               426
>>> plt.figure(figsize=(15, 6))
<Figure size 1500x600 with 0 Axes>
>>> plt.plot(df.index[:10], df.tickets_sold_sum[:10])
[<matplotlib.lines.Line2D object at 0x0000022C7FBF5898>]
>>> plt.xlabel("Date")
Text(0.5,0,'Date')
>>> plt.ylabel("Rentals")
Text(0,0.5,'Rentals')
>>> plt.show()

I limited the plot to the first 10 rows because I could not display all of them clearly. This is the resulting picture:
[output image]
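
Note that in the session above, created becomes the index while it still holds plain strings, so matplotlib treats each value as a category label rather than a point in time. A possible variant, assuming the same column names, parses the column before setting it as the index. The inlined rows are the first five from the sample data; utc=True, the Agg backend, and the output file name tickets_sold_sum.png are assumptions made for this sketch.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# First five rows of the sample data, inlined so the sketch is self-contained.
csv_text = """created,event_id,tickets_sold,tickets_sold_sum
2019-05-22 18:01:38.425533+00,1,90,90
2019-05-22 18:02:17.867726+00,1,40,130
2019-05-22 18:02:32.44182+00,1,13,143
2019-05-22 18:03:07.093599+00,1,0,143
2019-05-22 18:03:22.857492+00,1,10,153
"""

df = pd.read_csv(io.StringIO(csv_text))

# Parse the strings into timezone-aware timestamps before using them as the index.
df["created"] = pd.to_datetime(df["created"], utc=True)
df = df.set_index("created")

# Plot against the DatetimeIndex so the x-axis is a real time axis.
fig, ax = plt.subplots(figsize=(15, 6))
ax.plot(df.index, df["tickets_sold_sum"])
ax.set_xlabel("Date")
ax.set_ylabel("Tickets sold (cumulative)")
fig.autofmt_xdate()  # rotate the tick labels so they do not overlap

# Hypothetical output file; in an interactive session plt.show() could be used instead.
fig.savefig("tickets_sold_sum.png")
```

With a datetime index the points are spaced by their actual time differences, which is not the case when the index holds strings or row numbers.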

Regarding python - Convert Python datetime.datetime, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56473581/
