python - Convert datetimes to a uniform 15-minute grid and extract Year, Month, Day, and Hour columns from the datetime

My data frame df looks like this:

Code DateTime             Reading      
801  2011-01-15 08:30:00  0.0
801  2011-01-15 07:45:00  0.5
801  2011-01-16 06:30:00  5.0
801  2011-02-05 05:30:00  0.0
801  2011-02-08 00:45:00  10.0

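The data continues like this for all of 2011, with no fixed interval between timestamps. I want to impose a fixed 15-minute interval and get continuous, uniform data from 2011-01-01 00:00:00 to 2011-12-31 23:45:00. The reading should be 0.0 for every newly added row, and the existing readings must be preserved.

In addition, I want to add four columns, "Year", "Month", "Day", and "Hour", which must be extracted from the "DateTime" column.

My output should look like this: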

Code DateTime             Year Month Day Hour Reading      
801  2011-01-01 00:00:00  2011   1    1   0     0.0
801  2011-01-01 00:15:00  2011   1    1   0     0.0
801  2011-01-01 00:30:00  2011   1    1   0     0.0
801  2011-01-01 00:45:00  2011   1    1   0     0.0
801  2011-01-01 01:00:00  2011   1    1   1     0.0
.
.
.
801  2011-12-31 23:45:00  2011   12   31  23    0.0

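Can someone guide me through this?

Best answer

You can use the dt accessor to get the year, month, day, and hour from a timestamp column. You can use date_range to build the date range, setting the frequency to 15min so that there is one row every 15 minutes. For your desired output, you can do the following.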

import pandas as pd

# df is the data frame from the question
df['DateTime'] = pd.to_datetime(df['DateTime'])
# Build a data frame holding the year, month, day and hour of each existing row
new = pd.DataFrame({"Year": df["DateTime"].dt.year, "Month": df["DateTime"].dt.month, "Day": df["DateTime"].dt.day, "Hour": df["DateTime"].dt.hour})
# Concatenate column-wise, then index by DateTime (the DateTime column is kept)
df = pd.concat((df, new), axis=1).set_index(df['DateTime'])

# Build the full 15-minute grid for 2011 with a default reading of 0.0
time_df = pd.DataFrame({"DateTime": pd.date_range(start='2011-01-01 00:00:00', end='2011-12-31 23:45:00', freq="15min"), "Code": 801, "Reading": 0.0})

# Build a data frame holding the year, month, day and hour of each grid row
k = pd.DataFrame({"Year": time_df["DateTime"].dt.year, "Month": time_df["DateTime"].dt.month, "Day": time_df["DateTime"].dt.day, "Hour": time_df["DateTime"].dt.hour})

# Concatenate column-wise, then index by DateTime
time_df = pd.concat((time_df, k), axis=1).set_index(time_df['DateTime'])

# Stack the existing rows on top of the grid rows (df first, so its rows take priority)
orginal_df = pd.concat((df, time_df), axis=0)

# Drop duplicate timestamps; keep='first' keeps the existing readings from df
orginal_df = orginal_df[~orginal_df.index.duplicated(keep='first')]

# Sort the data frame by time
orginal_df = orginal_df.sort_index()

# Replace the DateTime index with a plain 0..n-1 index
orginal_df = orginal_df.reset_index(drop=True)

Output

       Code            DateTime  Reading  Day  Hour  Month  Year
0       801 2011-01-01 00:00:00      0.0    1     0      1  2011
1       801 2011-01-01 00:15:00      0.0    1     0      1  2011
2       801 2011-01-01 00:30:00      0.0    1     0      1  2011
3       801 2011-01-01 00:45:00      0.0    1     0      1  2011
4       801 2011-01-01 01:00:00      0.0    1     1      1  2011
5       801 2011-01-01 01:15:00      0.0    1     1      1  2011
6       801 2011-01-01 01:30:00      0.0    1     1      1  2011
.
.
.
1375   801 2011-01-15 07:45:00      0.5   15     7      1  2011
.
.
1378   801 2011-01-15 08:30:00      0.0   15     8      1  2011
.
.
35039   801 2011-12-31 23:45:00      0.0   31    23     12  2011
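
The row labels in this output can be checked with a little arithmetic: 2011 has 365 days and each day has 96 quarter-hour slots, so the grid has 365 * 96 = 35040 rows and the last label is 35039. The following is a small sketch of that check, not part of the original answer; the variable names grid, n_rows, pos_0745 and pos_0830 are introduced here for illustration only.

```python
import pandas as pd

# Full 15-minute grid for 2011, the same range used in the answer.
grid = pd.date_range(start="2011-01-01 00:00:00",
                     end="2011-12-31 23:45:00",
                     freq="15min")

# 365 days * 96 quarter-hours per day.
n_rows = len(grid)

# Zero-based positions of two existing readings within the sorted grid.
pos_0745 = grid.get_loc(pd.Timestamp("2011-01-15 07:45:00"))
pos_0830 = grid.get_loc(pd.Timestamp("2011-01-15 08:30:00"))

print(n_rows, pos_0745, pos_0830)  # prints: 35040 1375 1378
```

These positions match rows 1375 and 1378 shown above, which suggests the existing readings were kept in place rather than overwritten by grid rows.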

If you want the columns in the order shown in the question, you can select them explicitly:

orginal_df[['Code','DateTime','Year','Month','Day','Hour','Reading']]

       Code            DateTime  Year  Month  Day  Hour  Reading
0       801 2011-01-01 00:00:00  2011      1    1     0      0.0
1       801 2011-01-01 00:15:00  2011      1    1     0      0.0
2       801 2011-01-01 00:30:00  2011      1    1     0      0.0
3       801 2011-01-01 00:45:00  2011      1    1     0      0.0
4       801 2011-01-01 01:00:00  2011      1    1     1      0.0
5       801 2011-01-01 01:15:00  2011      1    1     1      0.0
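
As a possible alternative to the concatenate-and-deduplicate steps, the same result can probably be reached by reindexing the readings onto the 15-minute grid. This is only a sketch and not the method from the answer above. It assumes a single Code value (801), no duplicate timestamps in the input, and that every existing timestamp already falls on a 15-minute boundary (readings off the grid would be dropped by reindex). The names grid, readings and result are introduced here for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Code": [801, 801, 801, 801, 801],
    "DateTime": ["2011-01-15 08:30:00", "2011-01-15 07:45:00",
                 "2011-01-16 06:30:00", "2011-02-05 05:30:00",
                 "2011-02-08 00:45:00"],
    "Reading": [0.0, 0.5, 5.0, 0.0, 10.0],
})
df["DateTime"] = pd.to_datetime(df["DateTime"])

# Full 15-minute grid for 2011.
grid = pd.date_range(start="2011-01-01 00:00:00",
                     end="2011-12-31 23:45:00",
                     freq="15min")

# Align the existing readings to the grid; slots without a reading get 0.0.
readings = df.set_index("DateTime")["Reading"].reindex(grid, fill_value=0.0)

# Turn the index back into a DateTime column.
result = readings.rename_axis("DateTime").reset_index()

# Assumption: one station code for the whole frame.
result.insert(0, "Code", 801)

# Extract the calendar parts with the dt accessor.
result["Year"] = result["DateTime"].dt.year
result["Month"] = result["DateTime"].dt.month
result["Day"] = result["DateTime"].dt.day
result["Hour"] = result["DateTime"].dt.hour

# Put the columns in the order requested in the question.
result = result[["Code", "DateTime", "Year", "Month", "Day", "Hour", "Reading"]]
print(result.head())
```

With several codes in the same frame, this sketch would need to be applied per code (for example inside a groupby), which is not shown here.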

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/45051310/