python - Plotting multiple values as a range - matplotlib

Tags: python pandas dataframe matplotlib plot

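I am trying to determine the most efficient way to produce a set of lines displayed as a range. I am hoping to produce something like the linked example image (not reproduced here).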

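I'll do my best to explain; apologies if I have missed any information. I envision the x-axis as a range of hourly timestamps (8 am, 9 am, 10 am, and so on). The total range would be between 8:00:00 and 27:00:00. The y-axis is the count of values occurring at any point in time. The range in the plot would represent the max, min, and average of the values occurring.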

An example df is listed below:

import pandas as pd
import matplotlib.pyplot as plt

d = ({
    'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],                 
    'Occurring1' : ['1','2','3','4','5','5','6','6','7'],           
    'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],                 
    'Occurring2' : ['1','2','2','3','4','5','5','6','7'], 
    'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],                 
    'Occurring3' : ['1','2','3','4','4','5','6','7','8'],                     
     })

df = pd.DataFrame(data = d)

So this df represents 3 different sets of data. The times, the values occurring, and even the number of entries can vary.

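Below is an initial example, although I'm not sure whether I need to rethink my approach. Would a rolling function work here, something that evaluates the max, min, and avg number of values occurring in the df for each hour (8:00:00-9:00:00)?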
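The hourly evaluation described above can be sketched with a pandas resample rather than a rolling window. This is only a minimal illustration under stated assumptions: it uses the first five `Time1`/`Occurring1` entries (which are unambiguous as 24-hour times), parses them with an explicit `%H:%M:%S` format so they land on the default date 1900-01-01, and the names `s` and `hourly` are hypothetical.

```python
import pandas as pd

# Assumption: first five Time1/Occurring1 entries from the example data,
# parsed as 24-hour times (the date defaults to 1900-01-01).
times = pd.to_datetime(
    ["8:00:00", "9:30:00", "9:40:00", "10:25:00", "12:30:00"],
    format="%H:%M:%S",
)
s = pd.Series([1, 2, 3, 4, 5], index=times)

# Bin the values into hourly buckets and summarise each bucket.
# Hours without any entry (11:00 here) come out as NaN.
hourly = s.resample("60min").agg(["min", "max", "mean"])
print(hourly)
```

Each row of `hourly` covers one hour starting at the index timestamp, so the 09:00 row summarises the entries at 9:30 and 9:40.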

Here is the full initial attempt:

import pandas as pd
import matplotlib.pyplot as plt

d = ({
    'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],                 
    'Occurring1' : ['1','2','3','4','5','5','6','6','7'],           
    'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],                 
    'Occurring2' : ['1','2','2','3','4','5','5','6','7'], 
    'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],                 
    'Occurring3' : ['1','2','3','4','4','5','6','7','8'],                     
     })

df = pd.DataFrame(data = d)

fig, ax = plt.subplots(figsize = (10,6))

ax.plot(df['Time1'], df['Occurring1'])
ax.plot(df['Time2'], df['Occurring2'])
ax.plot(df['Time3'], df['Occurring3'])

plt.show()

Best answer

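To get the desired result, you need to overcome a few difficulties. First, you need to create a regular time grid onto which you interpolate the y data (the occurrences). Then you can take the min, max, and mean of the interpolated data. The code below demonstrates how to do this: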

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import griddata

# Example data
d = ({
    'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],
    'Occurring1' : ['1','2','3','4','5','5','6','6','7'],
    'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],
    'Occurring2' : ['1','2','2','3','4','5','5','6','7'],
    'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],
    'Occurring3' : ['1','2','3','4','4','5','6','7','8'],
})

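# Create dataframe, explicitly convert dtypes
# (np.int was removed in NumPy 1.24, so the built-in int is used here;
# the times carry no date, so they are parsed onto the default 1900-01-01)
df = pd.DataFrame(data=d)
for i in range(1, 4):
    df["Time%i" % i] = pd.to_datetime(df["Time%i" % i], format="%H:%M:%S")
    df["Occurring%i" % i] = df["Occurring%i" % i].astype(int)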

# Create 1D vectors of time data
all_times = df[["Time1", "Time2", "Time3"]].values

# Representation of 1 minute in time
t_min = np.timedelta64(int(60*1e9), "ns")
# Create a regular time grid with 10 minute spacing
time_grid = np.arange(all_times.min(), all_times.max(), 10*t_min, dtype="datetime64")

# Storage buffer for interpolated occurring data
occurrences_grid = np.zeros((3, len(time_grid)))

# Loop over all occurrence data and interpolate to regular grid
for i in range(3):
    occurrences_grid[i] = griddata(
        points=df["Time%i" % (i+1)].values.astype("float"),
        values=df["Occurring%i" % (i+1)],
        xi=time_grid.astype("float"),
        method="linear"
    )

# Get min, max, and mean values of interpolated data
occ_min = np.min(occurrences_grid, axis=0)
occ_max = np.max(occurrences_grid, axis=0)
occ_mean = np.mean(occurrences_grid, axis=0)

# Plot interpolated data
plt.fill_between(time_grid, occ_min, occ_max, color="slategray")
plt.plot(time_grid, occ_mean, c="white")
plt.xticks(rotation=60)
plt.tight_layout()
plt.show()

Result (the x-axis labels are not formatted properly; the image is not reproduced here).
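The x-axis label formatting mentioned above could be tidied with `matplotlib.dates`. The following is a minimal sketch, not part of the original answer: the arrays are stand-in data with the same names as in the answer (`time_grid`, `occ_min`, `occ_max`, `occ_mean`), and the hourly tick spacing, the `%H:%M` label format, and the output file name are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumption): a 10-minute grid and a synthetic band around a mean
time_grid = np.arange(
    np.datetime64("1900-01-01T08:00"),
    np.datetime64("1900-01-01T17:00"),
    np.timedelta64(10, "m"),
)
occ_mean = np.linspace(1, 8, len(time_grid))
occ_min = occ_mean - 0.5
occ_max = occ_mean + 0.5

fig, ax = plt.subplots(figsize=(10, 6))
ax.fill_between(time_grid, occ_min, occ_max, color="slategray")
ax.plot(time_grid, occ_mean, c="white")

# One tick per hour, labelled as hours and minutes only
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
fig.autofmt_xdate(rotation=60)
fig.tight_layout()
fig.savefig("range_plot.png")  # hypothetical output file name
```

The locator controls where ticks are placed and the formatter controls how each tick is printed, so the two can be changed independently, for example `HourLocator(interval=2)` for sparser ticks.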

Regarding python - Plotting multiple values as a range - matplotlib, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54320438/
