python - Extending a dataframe by adding a start and end date and filling it with timestamps and NaN

Tags: python pandas timestamp

I have the following data:

                        data
timestamp
2012-06-01 17:00:00     9
2012-06-01 20:00:00     8
2012-06-01 13:00:00     9
2012-06-01 10:00:00     9

I would like to sort it chronologically and add a start and end date at the top and bottom of the data, so that it looks like this:

                        data
timestamp
2012-06-01 00:00:00     NaN
2012-06-01 10:00:00     9
2012-06-01 13:00:00     9
2012-06-01 17:00:00     9
2012-06-01 20:00:00     8
2012-06-02 00:00:00     NaN

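Finally, I would like to extend the dataset so that it covers every hour from start to end in one-hour steps, filling the dataframe with the missing timestamps and using 'None'/'NaN' as their data. So far I have the following code: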

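import pandas as pd

# temperature, timestamp, ts1 and ts2 are defined earlier in my script.
df2 = pd.DataFrame({'data': temperature, 'timestamp': pd.DatetimeIndex(timestamp)}, dtype=float)
df2.set_index('timestamp', inplace=True)
df3 = pd.DataFrame({'timestamp': pd.Series([ts1, ts2]), 'data': [None, None]})
df3.set_index('timestamp', inplace=True)
print(df3)
# Note: DataFrame.append was removed in pandas 2.0; pd.concat([df3, df2]) is its replacement.
merged = df3.append(df2)
print(merged)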

The printed output is as follows:

df3:
                     data
timestamp                
2012-06-01 00:00:00     None
2012-06-02 00:00:00     None


merged:
                     data
timestamp                
2012-06-01 00:00:00     NaN
2012-06-02 00:00:00     NaN
2012-06-01 17:00:00     9
2012-06-01 20:00:00     8
2012-06-01 13:00:00     9
2012-06-01 10:00:00     9

I tried:

merged = merged.asfreq('H')

But this returned an unsatisfactory result:

                     data
2012-06-01 00:00:00   NaN
2012-06-01 01:00:00   NaN
2012-06-01 02:00:00   NaN
2012-06-01 03:00:00   NaN
2012-06-01 04:00:00   NaN
2012-06-01 05:00:00   NaN
2012-06-01 06:00:00   NaN
2012-06-01 07:00:00   NaN
2012-06-01 08:00:00   NaN
2012-06-01 09:00:00   NaN
2012-06-01 10:00:00     9

Where is the rest of the dataframe? Why does it only contain data up to the first valid value?

Any help is much appreciated. Thanks in advance.

Best answer

First create an empty dataframe with the timestamp index you want, then left-merge it with your original dataset:

# df is the original dataset, with the timestamps as its index.
df2 = pd.DataFrame(index=pd.date_range('2012-06-01', '2012-06-02', freq='H'))
df3 = pd.merge(df2, df, left_index=True, right_index=True, how='left')
df3
Out[103]: 
                               timestamp  value
2012-06-01 00:00:00                  NaN    NaN
2012-06-01 01:00:00                  NaN    NaN
2012-06-01 02:00:00                  NaN    NaN
2012-06-01 03:00:00                  NaN    NaN
2012-06-01 04:00:00                  NaN    NaN
2012-06-01 05:00:00                  NaN    NaN
2012-06-01 06:00:00                  NaN    NaN
2012-06-01 07:00:00                  NaN    NaN
2012-06-01 08:00:00                  NaN    NaN
2012-06-01 09:00:00                  NaN    NaN
2012-06-01 10:00:00  2012-06-01 10:00:00      9
2012-06-01 11:00:00                  NaN    NaN
2012-06-01 12:00:00                  NaN    NaN
2012-06-01 13:00:00  2012-06-01 13:00:00      9
2012-06-01 14:00:00                  NaN    NaN
2012-06-01 15:00:00                  NaN    NaN
2012-06-01 16:00:00                  NaN    NaN
2012-06-01 17:00:00  2012-06-01 17:00:00      9
2012-06-01 18:00:00                  NaN    NaN
2012-06-01 19:00:00                  NaN    NaN
2012-06-01 20:00:00  2012-06-01 20:00:00      8
2012-06-01 21:00:00                  NaN    NaN
2012-06-01 22:00:00                  NaN    NaN
2012-06-01 23:00:00                  NaN    NaN
2012-06-02 00:00:00                  NaN    NaN
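
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the question's four rows by hand (the literal values and the names `hourly`, `merged` and `reindexed` are my own choices for illustration), applies the left join described above, and also shows `reindex` as an alternative that is not part of the answer above. It uses the lowercase `'h'` alias because newer pandas versions deprecate `'H'`.

```python
import pandas as pd

# Sample data copied from the question, deliberately left unsorted.
df = pd.DataFrame(
    {"data": [9.0, 8.0, 9.0, 9.0]},
    index=pd.DatetimeIndex(
        ["2012-06-01 17:00:00", "2012-06-01 20:00:00",
         "2012-06-01 13:00:00", "2012-06-01 10:00:00"],
        name="timestamp",
    ),
)

# Full hourly range with both endpoints included (25 timestamps).
hourly = pd.date_range("2012-06-01", "2012-06-02", freq="h", name="timestamp")

# Approach from the answer: left-join the data onto an empty hourly frame.
merged = pd.DataFrame(index=hourly).join(df, how="left")

# Alternative: sort the index, then reindex onto the same range.
# Hours without a measurement become NaN.
reindexed = df.sort_index().reindex(hourly)

# Show only the hours that actually carry a value.
print(reindexed.dropna())
```

Both variants should give the same 25-row frame with NaN everywhere except the four measured hours. As for why `asfreq` stopped at 10:00 in the question: that looks consistent with the index being unsorted, since the last row of the appended frame is 2012-06-01 10:00:00. Calling `sort_index()` before `asfreq` would probably avoid it, but that is my inference and I have not checked it against the pandas version used in the question.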

A similar question about "python - Extending a dataframe by adding a start and end date and filling it with timestamps and NaN" can be found on Stack Overflow: https://stackoverflow.com/questions/30712831/
