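I'm trying to work out the most efficient way to produce a set of line plots displayed as a range. I'm hoping to produce something like the following (image not reproduced here).
I'll try to explain as best I can; apologies if I miss any information. I envisage the x-axis being a range of timestamps in hours (8am, 9am, 10am, etc.). The total range would run from 8:00:00 to 27:00:00. The y-axis would be a count of the values occurring at any point in time. The range in the plot would represent the max, min and average of the values occurring.
An example df is listed below: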
import pandas as pd
import matplotlib.pyplot as plt
d = ({
'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],
'Occurring1' : ['1','2','3','4','5','5','6','6','7'],
'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],
'Occurring2' : ['1','2','2','3','4','5','5','6','7'],
'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],
'Occurring3' : ['1','2','3','4','4','5','6','7','8'],
})
df = pd.DataFrame(data = d)
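So this df represents 3 different sets of data. The times, the values and even the number of entries can vary between sets.
An initial example is below, although I'm unsure whether I need to rethink my approach. Would a rolling calculation be useful here? Something that assesses the max, min and avg number of values occurring in the df for each hour (e.g. 8:00:00-9:00:00).
Here is the full initial attempt: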
import pandas as pd
import matplotlib.pyplot as plt
d = ({
'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],
'Occurring1' : ['1','2','3','4','5','5','6','6','7'],
'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],
'Occurring2' : ['1','2','2','3','4','5','5','6','7'],
'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],
'Occurring3' : ['1','2','3','4','4','5','6','7','8'],
})
df = pd.DataFrame(data = d)
fig, ax = plt.subplots(figsize = (10,6))
ax.plot(df['Time1'], df['Occurring1'])
ax.plot(df['Time2'], df['Occurring2'])
ax.plot(df['Time3'], df['Occurring3'])
plt.show()
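Best answer
To get the desired result, there are a couple of difficulties to work around. First, you need to create a regular time grid and interpolate the y data (the occurrences) onto it, because the three series are sampled at different times and cannot be compared point by point. Then you can take the min, max and mean of the interpolated data at each grid point. The code below demonstrates how to do this: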
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import griddata
# Example data
d = ({
'Time1' : ['8:00:00','9:30:00','9:40:00','10:25:00','12:30:00','1:31:00','1:35:00','2:45:00','4:50:00'],
'Occurring1' : ['1','2','3','4','5','5','6','6','7'],
'Time2' : ['8:10:00','9:34:00','9:48:00','10:40:00','1:30:00','2:31:00','3:35:00','3:45:00','4:55:00'],
'Occurring2' : ['1','2','2','3','4','5','5','6','7'],
'Time3' : ['9:00:00','9:34:00','9:58:00','10:45:00','10:50:00','12:31:00','1:35:00','2:15:00','3:55:00'],
'Occurring3' : ['1','2','3','4','4','5','6','7','8'],
})
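# Create dataframe, then convert dtypes explicitly.
# np.int was removed in NumPy 1.24 and recent pandas rejects a unit-less
# np.datetime64 in astype, so use pd.to_datetime and the builtin int instead.
# With a time-only format, the date part defaults to 1900-01-01.
df = pd.DataFrame(data=d)
for i in range(1, 4):
    df["Time%i" % i] = pd.to_datetime(df["Time%i" % i], format="%H:%M:%S")
    df["Occurring%i" % i] = df["Occurring%i" % i].astype(int)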
# Create 1D vectors of time data
all_times = df[["Time1", "Time2", "Time3"]].values
# Representation of 1 minute in time
t_min = np.timedelta64(int(60*1e9), "ns")
# Create a regular time grid with 10 minute spacing
time_grid = np.arange(all_times.min(), all_times.max(), 10*t_min, dtype="datetime64")
# Storage buffer for interpolated occurring data
occurrences_grid = np.zeros((3, len(time_grid)))
# Loop over all occurrence data and interpolate to regular grid
for i in range(3):
    # Points outside a series' own time span are filled with NaN by griddata
    occurrences_grid[i] = griddata(
        points=df["Time%i" % (i+1)].values.astype("float"),
        values=df["Occurring%i" % (i+1)],
        xi=time_grid.astype("float"),
        method="linear"
    )
# Get min, max, and mean values of interpolated data
occ_min = np.min(occurrences_grid, axis=0)
occ_max = np.max(occurrences_grid, axis=0)
occ_mean = np.mean(occurrences_grid, axis=0)
# Plot interpolated data
plt.fill_between(time_grid, occ_min, occ_max, color="slategray")
plt.plot(time_grid, occ_mean, c="white")
plt.xticks(rotation=60)
plt.tight_layout()
plt.show()
Result (image not reproduced here; the x-axis labels are not formatted properly).
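The x-label formatting noted above could probably be tidied up with `matplotlib.dates`. The following is only a minimal sketch, not part of the original answer: it uses made-up stand-in arrays for `time_grid`, `occ_min`, `occ_max` and `occ_mean` (with an arbitrary date) so that it runs on its own, and it assumes hourly ticks labelled as `HH:MM` are what is wanted.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in data: a 10-minute grid and a made-up min/mean/max band.
# In the answer's code these arrays come from the interpolation step instead.
time_grid = np.arange(
    np.datetime64("2019-01-23T08:00"),
    np.datetime64("2019-01-23T17:00"),
    np.timedelta64(10, "m"),
)
occ_mean = np.linspace(1, 7, len(time_grid))
occ_min = occ_mean - 0.5
occ_max = occ_mean + 0.5

fig, ax = plt.subplots(figsize=(10, 6))
ax.fill_between(time_grid, occ_min, occ_max, color="slategray")
ax.plot(time_grid, occ_mean, c="white")

# One major tick per hour, labelled as HH:MM rather than a full timestamp
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
fig.autofmt_xdate(rotation=60)
fig.tight_layout()

# Write to a file here; in an interactive session plt.show() could be used instead
fig.savefig("range_plot.png")
```

With the real data, the same two `ax.xaxis` calls would be applied to the axes that hold the `fill_between` band; the rest of the answer's code should not need to change.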
This Q&A (python - Plotting multiple values as a range - matplotlib) is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/54320438/