python - How to plot a linear trendline of datetime vs. value with matplotlib and pandas?

Tags: python pandas matplotlib

I want to generate a linear best-fit trendline for the average CPU usage per day.

My data looks like this:

host_df_means['cpu_usage_percent']
history_datetime
2020-03-03     9.727273
2020-03-04     9.800000
2020-03-05     9.727273
2020-03-06    10.818182
2020-03-07     9.500000
2020-03-08    10.909091
2020-03-09    15.000000
2020-03-10    14.333333
2020-03-11    15.333333
2020-03-12    16.000000
2020-03-13    21.000000
2020-03-14    28.833333
Name: cpu_usage_percent, dtype: float64

I then plot it with:
import datetime

plot = host_df_means['cpu_usage_percent'].plot()
plot.set_xlim([datetime.date(2020, 3, 3), datetime.date(2020, 3, 31)])
plot;

This produces the following plot:

line chart of cpu usage

So now I would like to add a trendline that extends into the future, like this:

line chart of cpu usage with trendline

Best answer

Keep your data as a pd.DataFrame. The trick is to convert the dates to a numeric type that linear regression can work with.

import datetime
import matplotlib.pyplot as plt
import pandas as pd
import scipy.stats as stats
from io import StringIO

# Set up data as in question

host_df_means = pd.read_csv(StringIO("""
        2020-03-03     9.727273
        2020-03-04     9.800000
        2020-03-05     9.727273
        2020-03-06    10.818182
        2020-03-07     9.500000
        2020-03-08    10.909091
        2020-03-09    15.000000
        2020-03-10    14.333333
        2020-03-11    15.333333
        2020-03-12    16.000000
        2020-03-13    21.000000
        """),
        sep=r'\s+', header=None, parse_dates=[0], index_col=0)
host_df_means.columns = ['cpu_usage_percent']
host_df_means.index.name = 'history_datetime'

fig, ax = plt.subplots(1, 1)
ax.plot(host_df_means.index, host_df_means)
ax.set_xlim([datetime.date(2020, 3, 3), datetime.date(2020, 3, 31)])

# To perform the linear regression we need the dates to be numeric.
# Keep the ordinals in a separate variable so the datetime index is preserved.
x_ordinal = host_df_means.index.map(datetime.date.toordinal)
# Perform linear regression
slope, y0, r, p, stderr = stats.linregress(x_ordinal,
                                           host_df_means['cpu_usage_percent'])

# x co-ordinates (as ordinals) for the start and end of the line
x_endpoints = [int(x_ordinal[0]), int(x_ordinal[-1])]

# Compute predicted values from linear regression
y_endpoints = [y0 + slope * x for x in x_endpoints]
# Overlay the line. Convert the ordinals back to dates first: since
# matplotlib 3.3 the default date epoch is 1970-01-01, so raw ordinals
# no longer line up with a date axis.
ax.plot([datetime.date.fromordinal(x) for x in x_endpoints], y_endpoints, c='r')
ax.set_xlabel('history_datetime')
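
The code above draws the fitted line only between the first and last observations, whereas the question asks for a trendline that continues into the future. The following is a minimal sketch of one way to do that, not the answer author's method: it fits the line on "days since the first observation" with numpy.polyfit (swapped in here for scipy.stats.linregress) and evaluates it on every date up to 2020-03-31, the x-limit used in the question. The helper name extend_trend is hypothetical, and the data is typed in from the question, including the 2020-03-14 row.

```python
import datetime

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Daily means copied from the question
host_df_means = pd.DataFrame(
    {"cpu_usage_percent": [9.727273, 9.800000, 9.727273, 10.818182,
                           9.500000, 10.909091, 15.000000, 14.333333,
                           15.333333, 16.000000, 21.000000, 28.833333]},
    index=pd.date_range("2020-03-03", periods=12, freq="D",
                        name="history_datetime"),
)


def extend_trend(series, end_date):
    """Fit a straight line to a daily series and evaluate it up to end_date.

    Hypothetical helper for illustration; not part of pandas or matplotlib.
    """
    start = series.index[0]
    # Numeric x values: whole days elapsed since the first observation
    days = (series.index - start).days.to_numpy()
    # For deg=1, polyfit returns the coefficients as [slope, intercept]
    slope, intercept = np.polyfit(days, series.to_numpy(), deg=1)
    # Every date from the first observation through end_date, inclusive
    future_index = pd.date_range(start, end_date, freq="D")
    future_days = (future_index - start).days.to_numpy()
    return pd.Series(intercept + slope * future_days, index=future_index)


trend = extend_trend(host_df_means["cpu_usage_percent"], "2020-03-31")

fig, ax = plt.subplots(1, 1)
ax.plot(host_df_means.index, host_df_means["cpu_usage_percent"],
        label="daily mean")
# The trend is indexed by real dates, so it shares the date axis directly
ax.plot(trend.index, trend, c="r", linestyle="--", label="linear trend")
ax.set_xlim([datetime.date(2020, 3, 3), datetime.date(2020, 3, 31)])
ax.set_xlabel("history_datetime")
ax.legend()
```

Because the predicted values are stored in a Series with a DatetimeIndex, no conversion back from ordinals is needed before plotting. Note that a straight line extrapolated this far is only as trustworthy as the assumption that the growth stays linear.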


See the corresponding question on Stack Overflow: https://stackoverflow.com/questions/60704798/
