python - Pandas: average tick data by hour and plot each week of history

Tags: python matplotlib pandas

I have been following the answer here:

Pandas: how to plot yearly data on top of each other

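It takes a time series and plots the last data point of each day on a new chart. Each line in the chart represents one week of data (for example, 5 data points per week):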

[Image: chart with one line per week]

I used the code below to do this:

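# Chart by last price: group by (ISO week, ISO weekday), keep the last tick of each day
daily = ts.groupby(lambda x: x.isocalendar()[1:]).agg(lambda s: s.iloc[-1])
daily.index = pd.MultiIndex.from_tuples(daily.index, names=['W', 'D'])
dofw = "Mon Tue Wed Thu Fri Sat Sun".split()
# One row per week, one column per weekday
grid = daily.unstack('D').rename(columns=lambda x: dofw[x-1])
# Plot the last five weeks, one line per week
grid[-5:].T.plot()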
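
For readers without the original data, the same week-by-weekday grid can be sketched on synthetic data. This is an alternative formulation, not the code from the linked answer: it assumes pandas 1.1 or later for `DatetimeIndex.isocalendar()`, uses `resample('D').last()` in place of the `groupby` lambda, and the series values are made up.

```python
import numpy as np
import pandas as pd

# Synthetic tick series (made-up values): one point every 6 hours over two
# full ISO weeks, Monday 2013-01-07 through Sunday 2013-01-20.
idx = pd.date_range("2013-01-07", "2013-01-20 18:00", freq="6h")
ts = pd.Series(np.arange(len(idx), dtype=float), index=idx)

# Last observation of each calendar day
last_per_day = ts.resample("D").last()

# Label each day with its ISO week and weekday name,
# then pivot to a week x weekday grid
iso = last_per_day.index.isocalendar()
dofw = "Mon Tue Wed Thu Fri Sat Sun".split()
grid = pd.DataFrame({
    "W": iso["week"].astype(int).to_numpy(),
    "D": [dofw[int(d) - 1] for d in iso["day"]],
    "value": last_per_day.to_numpy(),
}).pivot(index="W", columns="D", values="value")[dofw]

print(grid)
```

Calling `grid.T.plot()` on this grid would then draw one line per week, as in the snippet above.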

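What I would like to do, instead of aggregating by the last data point of each day, is aggregate by hour (that is, average the data within each hour) and plot the hourly data for each week. The chart would look similar to the one in the linked image, except that each line would have 24 data points per day instead of one.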
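
To make the goal concrete, here is a minimal sketch of the hourly averaging step alone. The series `ticks` and its values are invented for illustration; real tick data would be irregularly spaced, but `resample` treats it the same way.

```python
import numpy as np
import pandas as pd

# Made-up tick data: one tick every 15 minutes for two days
idx = pd.date_range("2013-01-07", periods=2 * 24 * 4, freq="15min")
ticks = pd.Series(np.arange(len(idx), dtype=float), index=idx)

# Average all ticks that fall inside each clock hour
hourly = ticks.resample("h").mean()

print(hourly.head())
```

Each day then contributes 24 values, which is the resolution wanted for the weekly lines.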

Is there a way to paste a Pandas DataFrame into this post? When I copy and paste, it comes out as a list.

Edit:

Final code, which accounts for the incomplete data of the most recent week when charting:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# First we read the DataFrame and resample it to get a mean on every hour
df = pd.read_csv(r"MYFILE.csv", header=None,
                 parse_dates=[0], index_col=0).resample('h').mean().dropna()
# Then we add a week field so we can filter it by the week
df['week'] = df.index.map(lambda x: x.isocalendar()[1])
# A set has no guaranteed order, so sort the week numbers before indexing
weeks = sorted(set(df['week']))
start_range = weeks[-3]
end_range = weeks[-1]
# Create week labels
weekdays = 'Mon Tue Wed Thu Fri Sat Sun'.split()

# Create the figure
fig, ax = plt.subplots()

# For every week we want to plot
for week in range(start_range,end_range+1):
    # Select out the week
    dfw = df[df['week'] == week].copy()
    # Here we align all the weeks to span over the same time period so they
    # can be shown on the graph one over the other, and not one next to
    # the other.
    dfw['timestamp'] = dfw.index.values - (week * np.timedelta64(1, 'W'))
    dfw = dfw.set_index(['timestamp'])
    # Then we plot our data
    ax.plot(dfw.index, dfw[1], label='week %s' % week)
    # Now to set the x labels. The most recent week may be incomplete, so
    # only the earlier (complete) weeks are resampled to a daily frequency
    # to produce the xtick values; the last week keeps the ticks already set.
    if week != end_range:
        resampled = dfw.resample('D').mean()
        ax.set_xticks(resampled.index.values)
        # But change the xtick labels to be the weekdays, one per tick.
        ax.set_xticklabels([weekdays[d.weekday()] for d in resampled.index])
# Plot the legend
plt.legend()
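
One caveat with selecting weeks by number alone: ISO week numbers restart each year, so "the last three weeks" can be picked incorrectly around New Year. A hedged sketch of one way around this, keying on (ISO year, ISO week) pairs, assuming pandas 1.1 or later for `DatetimeIndex.isocalendar()`; the date range is hypothetical.

```python
import pandas as pd

# Hypothetical hourly index spanning a year boundary
idx = pd.date_range("2012-12-17", "2013-01-13 23:00", freq="h")
iso = idx.isocalendar()  # DataFrame with 'year', 'week' and 'day' columns

# Unique (ISO year, ISO week) pairs in chronological order
pairs = sorted(set(zip(iso["year"].astype(int), iso["week"].astype(int))))
last_three = pairs[-3:]

print(last_three)
```

Sorting the bare week numbers for the same range would put weeks 51 and 52 after weeks 1 and 2, which is why the year is included in the key.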

Best answer

The solution is explained in the code comments.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# First we read the DataFrame and resample it to get a mean on every hour
df = pd.read_csv('trayport.csv', header=None,
                 parse_dates=[0], index_col=0).resample('h').mean().dropna()
# Then we add a week field so we can filter it by the week
df['week']= df.index.map(lambda x: x.isocalendar()[1])

# Create week labels
weekdays = 'Mon Tue Wed Thu Fri Sat Sun'.split()

# Create the figure
fig, ax = plt.subplots()

# For every week we want to plot
for week in range(1, 4):
    # Select out the week
    dfw = df[df['week'] == week].copy()
    # Here we align all the weeks to span over the same time period so they
    # can be shown on the graph one over the other, and not one next to
    # the other.
    dfw['timestamp'] = dfw.index.values - (week * np.timedelta64(1, 'W'))
    dfw = dfw.set_index(['timestamp'])
    # Then we plot our data
    ax.plot(dfw.index, dfw[1], label='week %s' % week)
    # Now to set the x labels. First we resample the timestamp to have
    # a date frequency, and set it to be the xtick values
    resampled = dfw.resample('D').mean()
    ax.set_xticks(resampled.index.values)
    # But change the xtick labels to be the weekdays, one label per tick.
    ax.set_xticklabels([weekdays[d.weekday()] for d in resampled.index])
# Plot the legend
plt.legend()
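
The alignment step is the least obvious part of the code above, so here is a small sketch of it in isolation. It uses `pd.Timedelta(weeks=1)` rather than `np.timedelta64(1, 'W')`, which should be equivalent for this purpose; the two timestamps are made up.

```python
import pandas as pd

# Two timestamps on the same weekday and hour, one ISO week apart
stamps = pd.DatetimeIndex(["2013-01-07 10:00", "2013-01-14 10:00"])
weeks = [ts.isocalendar()[1] for ts in stamps]  # ISO week number of each stamp

# Shift each timestamp back by its own week number, in whole weeks.
# Whole-week shifts preserve the weekday and the time of day.
aligned = [ts - week * pd.Timedelta(weeks=1) for ts, week in zip(stamps, weeks)]

print(weeks)
print(aligned)
```

Because both timestamps land on the same shifted instant, points from different weeks share x positions and the weekly lines overlay instead of sitting side by side.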

The result looks like this:

[Image: resulting chart, one line per week over a shared weekday axis]

Regarding "python - Pandas: average tick data by hour and plot each week of history", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/18250353/
