r - Create monthly averages, then plot them with ggplot

Tags: r ggplot2 aggregate

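I'm stuck on a seemingly simple problem. I have a data frame that contains monthly values from several weather stations.

Here is a sample of the raw data, called "cn":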

      months  station temp_davg_c temp_dmax_c temp_dmin_c rain_mm snow_cm precip_mm       date
1    Jan courtney         3.0         5.6         0.3   216.0    15.9     231.8 2010-01-01
2    Feb courtney         3.6         7.1         0.0   134.8     9.3     144.1 2010-02-01
3    Mar courtney         5.7        10.0         1.3   127.0    11.3     138.3 2010-03-01
4    Apr courtney         9.1        14.3         3.9    90.7     0.1      90.7 2010-04-01
5    May courtney        12.5        18.1         6.8    53.0     0.0      53.0 2010-05-01
6    Jun courtney        15.5        21.0         9.9    53.0     0.0      53.0 2010-06-01

With the code below I can plot all stations from the raw data. However, 14 separate lines are not very informative...

ggplot(data = cn, 
            aes(x = factor(months), y = temp_davg_c, colour = station)) +
            geom_line(aes(group = station)) +
            xlab("Months")+ ylab("Temperature [°C]")+
            scale_x_discrete(limits=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"))

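So I would like to get the monthly mean/min/max across the stations. Here is my first problem: when I use aggregate to get the monthly means, how do I store the results in a new data frame that I can then plot, without ending up with a data frame nested inside a data frame?

To get the monthly mean/min/max, I did this: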

stats <- aggregate(cn[,3], list(cn$months), FUN=mean,na.rm=TRUE)
names(stats)[1] <- "months" # rename
names(stats)[2] <- "avg"    #rename
stats$max <- (aggregate(cn[,4], list(cn$months), FUN=max,na.rm=TRUE)[2])
stats$min <- (aggregate(cn[,5], list(cn$months), FUN=min,na.rm=TRUE)[2])

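Assuming that is not a problem, how do I reorder the data frame so that the months are in calendar order? I know I can change the order of a factor's levels like this: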

factor(stats$months, levels=month.name)

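So how do I do this in the data frame I created that holds all the statistics? Since I couldn't figure that out, I ended up using "scale_x_discrete" in the ggplot call, but I would like to know how to do it.

Now to the last question: how would I plot the overall monthly mean/min/max across all stations, so that I get just three lines for temperature?

Assuming the data frame inside a data frame is not a problem, I tried the following, assuming my data frame looks like this: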

   months       avg    max    min
1     Apr  8.561538 14.3  2.6
2     Aug 17.453846 26.1 10.9
3     Dec  3.075000  6.4 -0.8
4     Feb  3.892308  7.8 -0.7
5     Jan  3.269231  6.8 -0.8
6     Jul 17.446154 25.6 10.8
7     Jun 15.069231 21.9  9.0
8     Mar  5.876923 10.6  0.7
9     May 12.076923 18.6  6.0
10    Nov  5.215385  9.0  0.6
11    Oct  9.230769 14.4  3.8
12    Sep 14.100000 22.4  7.4

ggplot(stats,aes(months))+
geom_line(aes(y=avg)) +
  geom_line(aes(y=min)) +
  geom_line(aes(y=max)) +
  xlab("Months")+ ylab("Temperature [°C]")+
  scale_x_discrete(limits=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"))

What am I missing here? Any help is appreciated.

Cheers, Sandra

PS: here is the dput of my cn

structure(list(months = structure(c(5L, 4L, 8L, 1L, 9L, 7L, 6L, 
2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 
10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 
4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 
9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 
2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 
10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 
4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 
9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 
2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 
10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 
4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L), .Label = c("Apr", 
"Aug", "Dec", "Feb", "Jan", "Jul", "Jun", "Mar", "May", "Nov", 
"Oct", "Sep"), class = "factor"), station = structure(c(7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 13L, 
13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 
11L, 11L, 11L, 11L, 11L, 11L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
14L, 14L, 14L, 14L, 14L), .Label = c("albernirob", "blackcreeek", 
"campbellrivairp", "campbellrivsurf", "capemudge", "comoxairp", 
"courtney", "mudbay", "oysterriver", "powriv", "powrivairp", 
"qualicumhatch", "qualicumriverres", "stillwater"), class = "factor"), 
    temp_davg_c = c(3, 3.6, 5.7, 9.1, 12.5, 15.5, 17.9, 17.6, 
    14.2, 9, 5.1, 3.1, 2.8, 3.4, 5.4, 8.5, 11.7, 14.8, 17.1, 
    16.9, 13.6, 8.6, 5, 2.8, 2.4, 3.2, 5.2, 8, 11.6, 14.7, 17.3, 
    17.2, 13.7, 8.6, 4.4, 2.1, 2.6, 3.8, 5.9, 7.4, 11.5, 14.3, 
    16.2, 17.2, 12.7, 8.1, 4.1, NA, 4.1, 4.6, 6.3, 8.8, 12.1, 
    14.9, 17.2, 17.1, 14.2, 9.6, 5.8, 3.8, 3.9, 4.3, 6.1, 8.8, 
    12.4, 15.5, 18, 17.9, 14.5, 9.5, 5.7, 3.5, 3.5, 4, 5.9, 8.6, 
    12.1, 15.1, 17.5, 17.4, 14.1, 9.3, 5.3, 3.1, 3.3, 3.8, 5.6, 
    8.3, 12, 15.1, 17.3, 17.2, 13.6, 8.9, 5.2, 3.2, 3.9, 4.2, 
    5.9, 8.6, 12, 14.9, 17.1, 16.7, 13.6, 9.2, 5.6, 3.5, 2.8, 
    3.7, 5.8, 8.5, 11.9, 14.9, 17.3, 17.4, 14.1, 9.2, 4.9, 2.6, 
    2, 3, 5.7, 8.5, 12.3, 15.5, 18.3, 18.5, 15.3, 9.8, 4.6, 1.8, 
    4.6, 5.1, 7, 9.6, 13, 15.8, 18.4, 18.6, 15.6, 10.8, 6.8, 
    4.3, 3.6, 3.9, 5.9, 8.6, 11.9, 14.9, 17.2, 17.2, 14.1, 9.4, 
    5.3, 3.1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), 
    temp_dmax_c = c(5.6, 7.1, 10, 14.3, 18.1, 21, 23.8, 23.7, 
    20.1, 13, 8, 5.4, 5.8, 7.4, 10, 13.9, 17.3, 20.2, 22.9, 22.8, 
    19.6, 13, 8.4, 5.4, 5.5, 7.2, 9.7, 13.2, 17, 20.1, 23, 23.3, 
    19.8, 13.1, 7.7, 4.9, 5.6, 7.5, 10.6, 12.2, 16.7, 19.5, 21.6, 
    23.2, 18, 12.3, 7.5, NA, 6.6, 7.6, 9.8, 12.9, 16.5, 19.5, 
    22.1, 22, 18.6, 12.8, 8.5, 6.2, 6.4, 7.4, 9.6, 12.9, 16.6, 
    19.8, 22.8, 22.7, 19, 12.9, 8.5, 5.9, 6.2, 7.5, 10.1, 13.5, 
    17.2, 20.3, 23.1, 23.1, 19.5, 13.4, 8.3, 5.6, 6.2, 7.4, 9.8, 
    13.2, 17.1, 20.2, 22.6, 22.5, 18.9, 12.8, 8.3, 5.8, 6.5, 
    7.5, 9.9, 12.9, 16.7, 19.6, 22.3, 22.1, 18.7, 13, 8.5, 5.9, 
    5.5, 7.4, 10.1, 13.5, 17.2, 20.3, 23.1, 23.5, 20, 13.3, 7.8, 
    5, 4.3, 6.6, 10.5, 14.2, 18.6, 21.9, 25.6, 26.1, 22.4, 14.4, 
    7.3, 3.8, 6.8, 7.8, 10.4, 13.5, 17.1, 19.8, 22.7, 22.9, 19.5, 
    13.6, 9, 6.4, 5.8, 6.9, 9.4, 12.8, 16.5, 19.4, 22.1, 22.3, 
    18.7, 12.6, 7.7, 5.3, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA), temp_dmin_c = c(0.3, 0, 1.3, 3.9, 6.8, 9.9, 
    11.9, 11.5, 8.2, 5, 2.1, 0.7, -0.3, -0.6, 0.9, 3.1, 6.1, 
    9.3, 11.3, 10.9, 7.5, 4.2, 1.6, 0.2, -0.8, -0.7, 0.7, 2.8, 
    6.2, 9.3, 11.5, 11.1, 7.6, 4, 1, -0.8, -0.5, 0, 1.3, 2.6, 
    6.2, 9, 10.8, 11.1, 7.4, 3.8, 0.6, NA, 1.6, 1.5, 2.8, 4.7, 
    7.7, 10.3, 12.2, 12.2, 9.7, 6.4, 3.1, 1.4, 1.4, 1.2, 2.5, 
    4.6, 8, 11.1, 13.3, 13, 9.9, 6, 2.9, 0.9, 0.7, 0.5, 1.7, 
    3.7, 6.9, 9.8, 11.8, 11.7, 8.6, 5.3, 2.3, 0.5, 0.3, 0.1, 
    1.5, 3.4, 6.9, 9.8, 11.7, 11.7, 8.2, 5, 2, 0.5, 1.2, 0.8, 
    2, 4.1, 7.3, 10.1, 11.8, 11.3, 8.4, 5.3, 2.7, 0.9, 0.1, 0.1, 
    1.4, 3.5, 6.6, 9.4, 11.5, 11.2, 8.2, 5, 1.9, 0.2, -0.3, -0.6, 
    0.7, 2.7, 6, 9, 10.9, 10.9, 8, 5, 1.8, -0.3, 2.3, 2.4, 3.6, 
    5.6, 8.8, 11.8, 14, 14.3, 11.6, 8, 4.6, 2.2, 1.2, 0.9, 2.3, 
    4.3, 7.3, 10.4, 12.3, 12.1, 9.4, 6.1, 2.8, 0.9, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA), rain_mm = c(216, 134.8, 
    127, 90.7, 53, 53, 29.9, 35.4, 45.7, 146.9, 232.3, 236.4, 
    216, 166.8, 149.3, 105, 72.8, 63.2, 42.3, 43.1, 54, 171.8, 
    256.2, 247.3, 194.6, 135.5, 128.4, 91.6, 68.4, 62.9, 39.4, 
    44.6, 55.2, 161, 222.1, 204.2, 186, 140.2, 120.3, 87.2, 58.2, 
    51.3, 35.1, 39, 52.9, 154.8, 228.4, 218.5, 215.2, 135.1, 
    130.8, 93.6, 70.2, 61.1, 39.5, 45.6, 58.7, 168.6, 241, 220.8, 
    159.1, 107.8, 95.7, 64.4, 45.6, 42.8, 26.7, 29.2, 41.8, 122.7, 
    191.9, 168.9, 256.9, 174.1, 151.6, 98, 56.6, 45.2, 26, 37.6, 
    53.6, 189.7, 285.2, 256.7, 182, 144.2, 139.3, 87.2, 64.6, 
    54.7, 36.4, 39, 48.9, 152.9, 228.4, 215.9, 200.6, 131.1, 
    116.3, 79.4, 51.3, 45.3, 26, 34.6, 46.3, 146.8, 214, 180.7, 
    219.3, 150.4, 141, 101.1, 72.1, 62.8, 41.9, 49.5, 59.3, 180.8, 
    249.9, 234.2, 317, 222.7, 215.6, 143.6, 87.8, 62.2, 31, 46.4, 
    61.4, 218.3, 345.2, 323.2, 132, 88.4, 92.4, 70.8, 70.9, 57.4, 
    36.5, 42.3, 51.4, 117.5, 154.9, 134.5, 145.7, 101.9, 104.2, 
    83.2, 76.6, 67.6, 37.5, 45.3, 54.7, 125.5, 171.6, 146.5, 
    185.2, 125.5, 127.8, 99.6, 92.4, 73.7, 46, 50.7, 64.6, 152.1, 
    212.6, 178.5), snow_cm = c(15.9, 9.3, 11.3, 0.1, 0, 0, 0, 
    0, 0, 0.2, 6, 12.1, 17.3, 10, 6.7, 0.2, 0, 0, 0, 0, 0, 1.1, 
    6.4, 16, 23.3, 14.4, 11.7, 0.5, 0, 0, 0, 0, 0, 1.2, 10.5, 
    22.6, 13.2, 8.4, 7.6, 0, 0, 0, 0, 0, 0, 0.8, 7.3, 14.3, 13.8, 
    6.4, 6.3, 0.2, 0, 0, 0, 0, 0, 0.6, 6, 14.7, 11.9, 6, 9.9, 
    0.2, 0, 0, 0, 0, 0, 0.1, 8.2, 18.7, 12.9, 13.3, 8.2, 0, 0, 
    0, 0, 0, 0, 1.1, 4.8, 15.2, 14.9, 7.8, 4.6, 0, 0, 0, 0, 0, 
    0, 0.9, 4.1, 8.6, 10.4, 8.8, 4.3, 0, 0, 0, 0, 0, 0, 0.4, 
    4.2, 9.2, 14.8, 10.1, 7.1, 0.1, 0, 0, 0, 0, 0, 0.5, 7.2, 
    16.5, 22.6, 16.9, 8.2, 0.6, 0, 0, 0, 0, 0, 1.6, 8, 21.4, 
    6.1, 4.6, 3.8, 0, 0, 0, 0, 0, 0, 0.2, 3.4, 4.2, 13.6, 7.8, 
    6.8, 0.1, 0, 0, 0, 0, 0, 0.3, 6.5, 11.5, 8.1, 4.8, 2.7, 0, 
    0, 0, 0, 0, 0, 0.2, 4.4, 9), precip_mm = c(231.8, 144.1, 
    138.3, 90.7, 53, 53, 29.9, 35.4, 45.7, 147.1, 238.3, 248.5, 
    233.3, 176.8, 155.9, 105.2, 72.8, 63.2, 42.3, 43.1, 54, 172.9, 
    262.6, 263.3, 217.5, 149.5, 140, 92.1, 68.4, 62.9, 39.4, 
    44.6, 55.2, 162.2, 231.9, 225.7, 198.9, 148.6, 127.9, 87.2, 
    58.2, 51.3, 35.1, 39, 52.9, 155.6, 235.7, 232.8, 229.1, 141.4, 
    137.1, 93.8, 70.2, 61.1, 39.5, 45.6, 58.7, 169.2, 246.9, 
    235.5, 171.9, 114.3, 105.7, 64.6, 45.6, 42.8, 26.7, 29.2, 
    41.8, 122.8, 200.5, 187.9, 269.9, 187.4, 159.8, 98, 56.6, 
    45.2, 26, 37.6, 53.6, 190.8, 290, 272, 196.9, 151.9, 143.9, 
    87.2, 64.6, 54.7, 36.4, 39, 48.9, 153.8, 232.6, 224.5, 211, 
    139.9, 120.6, 79.4, 51.3, 45.3, 26, 34.6, 46.3, 147.2, 218.1, 
    189.8, 234.1, 160.4, 148, 101.2, 72.1, 62.8, 41.9, 49.5, 
    59.3, 181.3, 257.1, 250.7, 339.5, 239.6, 223.8, 144.2, 87.8, 
    62.2, 31, 46.4, 61.4, 219.8, 353.2, 344.6, 138.1, 93.1, 96.1, 
    70.8, 70.9, 57.4, 36.5, 42.3, 51.4, 117.7, 158.3, 138.7, 
    158.9, 109.4, 110.7, 83.3, 76.6, 67.6, 37.5, 45.3, 54.7, 
    125.8, 178, 157.8, 193.3, 130.3, 130.6, 99.6, 92.4, 73.7, 
    46, 50.7, 64.6, 152.3, 216.9, 187.5), date = structure(c(14610, 
    14641, 14669, 14700, 14730, 14761, 14791, 14822, 14853, 14883, 
    14914, 14944, 14610, 14641, 14669, 14700, 14730, 14761, 14791, 
    14822, 14853, 14883, 14914, 14944, 14610, 14641, 14669, 14700, 
    14730, 14761, 14791, 14822, 14853, 14883, 14914, 14944, 14610, 
    14641, 14669, 14700, 14730, 14761, 14791, 14822, 14853, 14883, 
    14914, 14944, 14610, 14641, 14669, 14700, 14730, 14761, 14791, 
    14822, 14853, 14883, 14914, 14944, 14610, 14641, 14669, 14700, 
    14730, 14761, 14791, 14822, 14853, 14883, 14914, 14944, 14610, 
    14641, 14669, 14700, 14730, 14761, 14791, 14822, 14853, 14883, 
    14914, 14944, 14610, 14641, 14669, 14700, 14730, 14761, 14791, 
    14822, 14853, 14883, 14914, 14944, 14610, 14641, 14669, 14700, 
    14730, 14761, 14791, 14822, 14853, 14883, 14914, 14944, 14610, 
    14641, 14669, 14700, 14730, 14761, 14791, 14822, 14853, 14883, 
    14914, 14944, 14610, 14641, 14669, 14700, 14730, 14761, 14791, 
    14822, 14853, 14883, 14914, 14944, 14610, 14641, 14669, 14700, 
    14730, 14761, 14791, 14822, 14853, 14883, 14914, 14944, 14610, 
    14641, 14669, 14700, 14730, 14761, 14791, 14822, 14853, 14883, 
    14914, 14944, 14610, 14641, 14669, 14700, 14730, 14761, 14791, 
    14822, 14853, 14883, 14914, 14944), class = "Date")), .Names = c("months", 
"station", "temp_davg_c", "temp_dmax_c", "temp_dmin_c", "rain_mm", 
"snow_cm", "precip_mm", "date"), row.names = c(NA, -168L), class = "data.frame")

Best answer

If you use dplyr and tidyr and pipe the data straight into ggplot(), you can do this in one go:

library(dplyr)
library(tidyr)
library(ggplot2)

correct_order <- c("Jan","Feb","Mar","Apr","May","Jun",
                   "Jul","Aug","Sep","Oct","Nov","Dec")

cn %>% group_by(months) %>%
        summarise(min = min(temp_dmin_c, na.rm = TRUE),
                  max = max(temp_dmax_c, na.rm = TRUE),
                  avg = mean(temp_davg_c,na.rm = TRUE)) %>%
        gather(metric, value, -months) %>%
        ggplot(.,aes(x = months, y = value, 
                     group = metric, color = metric)) + 
        scale_x_discrete(limits=correct_order) + 
        geom_line()
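
A variation on the pipeline above, offered as a minimal sketch rather than a definitive version: tidyr now marks gather() as superseded in favour of pivot_longer() (tidyr 1.0.0 or later), and the month order can be stored in the data itself by setting the factor levels with the built-in month.abb vector instead of relying on scale_x_discrete(). The `toy` data frame and the names `monthly_long` and `p` are hypothetical stand-ins with made-up values; with the real data, `cn` would take the place of `toy`.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data standing in for `cn` (values are made up)
toy <- data.frame(
  months      = rep(c("Jan", "Feb", "Mar"), times = 2),
  station     = rep(c("a", "b"), each = 3),
  temp_davg_c = c(3.0, 3.6, 5.7, 2.8, 3.4, NA),
  temp_dmax_c = c(5.6, 7.1, 10.0, 5.8, 7.4, 10.0),
  temp_dmin_c = c(0.3, 0.0, 1.3, -0.3, -0.6, 0.9)
)

monthly_long <- toy %>%
  # Store calendar order in the factor levels (month.abb is "Jan" to "Dec")
  mutate(months = factor(months, levels = month.abb)) %>%
  group_by(months) %>%
  summarise(min = min(temp_dmin_c, na.rm = TRUE),
            max = max(temp_dmax_c, na.rm = TRUE),
            avg = mean(temp_davg_c, na.rm = TRUE)) %>%
  # One row per month and metric, so ggplot can draw one line per metric
  pivot_longer(cols = -months, names_to = "metric", values_to = "value")

p <- ggplot(monthly_long,
            aes(x = months, y = value, group = metric, colour = metric)) +
  geom_line() +
  labs(x = "Months", y = "Temperature [°C]")
```

Printing `p` draws the plot. Because the order lives in the factor levels, any later plot or table built from `monthly_long` should pick up the same month order.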

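
The question also asks how to stay with aggregate() without ending up with a data frame inside a data frame, how to reorder the months in the data itself, and what the three-line plot is missing. The following is a rough base R sketch of one way to handle these, under a few assumptions. `aggregate(...)[2]` returns a one-column data frame, while `[[2]]` extracts the plain vector. The abbreviated labels match month.abb (month.name holds full names such as "January"). With a discrete x axis, geom_line() needs an explicit group, for example group = 1. The `toy` data frame is a hypothetical stand-in for `cn` with made-up values.

```r
library(ggplot2)

# Hypothetical toy data standing in for `cn` (values are made up)
toy <- data.frame(
  months      = factor(rep(c("Jan", "Feb", "Mar"), times = 2)),
  station     = rep(c("a", "b"), each = 3),
  temp_davg_c = c(3.0, 3.6, 5.7, 2.8, 3.4, NA),
  temp_dmax_c = c(5.6, 7.1, 10.0, 5.8, 7.4, 10.0),
  temp_dmin_c = c(0.3, 0.0, 1.3, -0.3, -0.6, 0.9)
)

by_month <- list(months = toy$months)

# Monthly mean of the daily average temperature
stats <- aggregate(toy["temp_davg_c"], by_month, FUN = mean, na.rm = TRUE)
names(stats)[2] <- "avg"

# [[2]] pulls out the values as a vector; [2] would nest a data frame
stats$max <- aggregate(toy["temp_dmax_c"], by_month, FUN = max, na.rm = TRUE)[[2]]
stats$min <- aggregate(toy["temp_dmin_c"], by_month, FUN = min, na.rm = TRUE)[[2]]

# Put the factor levels in calendar order and sort the rows to match
stats$months <- factor(stats$months, levels = month.abb)
stats <- stats[order(stats$months), ]

# group = 1 tells geom_line() to connect the points across the discrete x axis
p <- ggplot(stats, aes(x = months, group = 1)) +
  geom_line(aes(y = avg)) +
  geom_line(aes(y = min)) +
  geom_line(aes(y = max)) +
  labs(x = "Months", y = "Temperature [°C]")
```

The three aggregate() calls group by the same column, so their rows come back in the same order and the `max` and `min` vectors line up with `avg`.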

Regarding "r - Create monthly averages, then plot them with ggplot", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42613610/
