python - How to plot by category over time

Tags: python pandas matplotlib

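I have two columns, a categorical column and a year column, that I am trying to plot. I want the total for each category per year so that I can create a multi-class time series plot.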

ax = data[data.categorical=="cat1"]["categorical"].plot(label='cat1')
data[data.categorical=="cat2"]["categorical"].plot(ax=ax, label='cat2')
data[data.categorical=="cat3"]["categorical"].plot(ax=ax, label='cat3')
plt.xlabel("Year")
plt.ylabel("Number per category")
sns.despine()

However, I get an error saying there is no numeric data to plot. I am looking for something similar to the above, perhaps something like data[data.categorical=="cat3"]["categorical"].lambda x : (1 for x in data.categorical)

I will use the following lists as an example.

categorical = ["cat1","cat1","cat2","cat3","cat2","cat1","cat3","cat2","cat1","cat3","cat3","cat3","cat2","cat1","cat2","cat3","cat2","cat2","cat3","cat1","cat1","cat1","cat3"]

year = [2013,2014,2013,2015,2014,2014,2013,2014,2014,2015,2015,2013,2014,2014,2013,2014,2015,2015,2015,2013,2014,2015,2013]

My goal is to get something like the plot below.

[Image: example of the desired plot]

Best answer

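I hesitate to call this a "solution", since it is basically a summary of basic Pandas functionality, which is explained in the same documentation where you found the time series plot in your post. But given that there is some confusion around groupby and plotting, a demo may help clear things up.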

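We can make two calls to groupby().
The first groupby() uses a count aggregation to get the number of occurrences of each category per year.
The second groupby() is used to plot a time series for each category.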

First, generate a sample dataframe:

import pandas as pd
categorical = ["cat1","cat1","cat2","cat3","cat2","cat1","cat3","cat2",
               "cat1","cat3","cat3","cat3","cat2","cat1","cat2","cat3",
               "cat2","cat2","cat3","cat1","cat1","cat1","cat3"]
year = [2013,2014,2013,2015,2014,2014,2013,2014,2014,2015,2015,2013,
        2014,2014,2013,2014,2015,2015,2015,2013,2014,2015,2013]
df = pd.DataFrame({'categorical':categorical,
                   'year':year})

   categorical  year
 0        cat1  2013
 1        cat1  2014
                 ...
21        cat1  2015
22        cat3  2013

Now get the count of each category per year:

# reset_index() gives a column for counting, after groupby uses year and category
ctdf = (df.reset_index()
          .groupby(['year','categorical'], as_index=False)
          .count()
          # rename isn't strictly necessary here, it's just for readability
          .rename(columns={'index':'ct'})
       )

   year categorical  ct
0  2013        cat1   2
1  2013        cat2   2
2  2013        cat3   3
3  2014        cat1   5
4  2014        cat2   3
5  2014        cat3   1
6  2015        cat1   1
7  2015        cat2   2
8  2015        cat3   4
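
As an aside, the same count table can probably be built without the reset_index() helper column. The following is a minimal sketch, not part of the original answer, which assumes the sample lists above and uses groupby().size(); the name ctdf_alt is only illustrative.

```python
import pandas as pd

categorical = ["cat1","cat1","cat2","cat3","cat2","cat1","cat3","cat2",
               "cat1","cat3","cat3","cat3","cat2","cat1","cat2","cat3",
               "cat2","cat2","cat3","cat1","cat1","cat1","cat3"]
year = [2013,2014,2013,2015,2014,2014,2013,2014,2014,2015,2015,2013,
        2014,2014,2013,2014,2015,2015,2015,2013,2014,2015,2013]
df = pd.DataFrame({'categorical': categorical, 'year': year})

# size() counts the rows in each (year, categorical) group directly,
# so no extra column is needed just to have something to count
ctdf_alt = (df.groupby(['year', 'categorical'])
              .size()
              .reset_index(name='ct'))  # name the count column 'ct'
print(ctdf_alt)
```

This should give the same year / categorical / ct layout as the table above, so the plotting step does not need to change.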

Finally, plot the time series for each category, distinguished by color:

from matplotlib import pyplot as plt
fig, ax = plt.subplots()

# key gives the group name (i.e. category), data gives the actual values
for key, data in ctdf.groupby('categorical'):
    data.plot(x='year', y='ct', ax=ax, label=key)

[Figure: time series plot by category]
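
If a single plot call is preferred over the loop, one alternative is to reshape the counts into wide form first. The sketch below is an assumption-laden variant rather than the answer's method: it uses pd.crosstab to build a year-by-category table and lets DataFrame.plot draw one line per column. The Agg backend is selected only so that the sketch runs without a display, and the axis labels are taken from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import pandas as pd
from matplotlib import pyplot as plt

categorical = ["cat1","cat1","cat2","cat3","cat2","cat1","cat3","cat2",
               "cat1","cat3","cat3","cat3","cat2","cat1","cat2","cat3",
               "cat2","cat2","cat3","cat1","cat1","cat1","cat3"]
year = [2013,2014,2013,2015,2014,2014,2013,2014,2014,2015,2015,2013,
        2014,2014,2013,2014,2015,2015,2015,2013,2014,2015,2013]
df = pd.DataFrame({'categorical': categorical, 'year': year})

# Rows are years, columns are categories, cells are occurrence counts
wide = pd.crosstab(df['year'], df['categorical'])

fig, ax = plt.subplots()
wide.plot(ax=ax)            # draws one line per category column
ax.set_xticks(wide.index)   # show only the whole years on the x axis
ax.set_xlabel("Year")
ax.set_ylabel("Number per category")
```

In an interactive session, dropping the matplotlib.use("Agg") line and calling plt.show() at the end should display the figure.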

Source: a similar question, "python - How to plot by category over time", on Stack Overflow: https://stackoverflow.com/questions/43832311/
