r - CI for cumulative values

Tags: r ggplot2 tidyverse

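I measured the acetone and acetaldehyde emissions of two soil types, A and B, three times over 94 days.

I have calculated and visualised the cumulative emissions over time, but I am having trouble adding a 95% CI band (+/- 2 SD), or even simple error bars.

This is what I have managed so far: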

library(dplyr)
library(ggplot2)

# Mean emission per soil type, compound and day; summarise() drops the
# last grouping level (days), so cumsum() runs over time within each
# soil_type x compound combination
df_cum <- df %>%
  group_by(soil_type, compound, days) %>%
  summarise(mean = mean(emission)) %>%
  mutate(cum_emission = cumsum(mean))

plot <- ggplot(df_cum, aes(x = days, y = cum_emission, colour = soil_type)) +
  geom_line(size = 1) +
  geom_point() +
  scale_colour_manual(values = c("#00AFBB", "brown")) +
  scale_size_manual(values = c(8.4, 1.7)) +
  labs(x = "Time (days)",
       y = "Cumulated production (umol/g dw soil)",
       title = "Cumulated production") +
  labs(shape = "", color = "") +
  facet_wrap(~compound) +
  theme_bw()

This gives:

[Figure: cumulative production over time for each soil type, one panel per compound]

I would like to add error bars to my plot, or draw CI bands similar to this:

[Figure: example of the desired plot, lines with confidence bands]

The data look like this:

df <- structure(list(days = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 94, 
94, 94, 94, 94, 94, 94, 94, 94, 94, 94, 94, 94, 94, 94, 94, 94, 
94, 94, 94, 94, 94, 94, 94, 94, 94), soil = c(6, 6, 12, 12, 2, 
2, 1, 1, 14, 14, 4, 4, 33, 33, 38, 38, 34, 34, 37, 37, 36, 36, 
13, 13, 32, 32, 5, 5, 3, 3, 35, 35, 6, 6, 12, 12, 2, 2, 1, 1, 
14, 14, 4, 4, 33, 33, 38, 38, 34, 34, 37, 37, 36, 36, 13, 13, 
32, 32, 5, 5, 3, 3, 35, 35, 6, 6, 12, 12, 2, 2, 4, 4, 33, 33, 
38, 38, 34, 34, 37, 37, 36, 36, 13, 13, 5, 5, 3, 3, 35, 35), 
    soil_type = c("B", "B", "A", "A", "B", "B", "B", "B", "A", 
    "A", "B", "B", "B", "B", "B", "B", "A", "A", "B", "B", "A", 
    "A", "A", "A", "B", "B", "A", "A", "B", "B", "B", "B", "B", 
    "B", "A", "A", "B", "B", "B", "B", "A", "A", "B", "B", "B", 
    "B", "B", "B", "A", "A", "B", "B", "A", "A", "A", "A", "B", 
    "B", "A", "A", "B", "B", "B", "B", "B", "B", "A", "A", "B", 
    "B", "B", "B", "B", "B", "B", "B", "A", "A", "B", "B", "A", 
    "A", "A", "A", "A", "A", "B", "B", "B", "B"), compound = c("Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
    "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
    "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde"), emission = c(0.001, 
    0.001, 0.009, 0.004, 0.029, 0.032, 0.066, 0.057, 0.015, 0.015, 
    0, 0.003, 0.015, 0.011, 0.016, 0.005, 0.046, 0.011, 0.004, 
    0.005, 0.015, 0.003, 0.025, 0.012, 0.001, 0.001, 0.004, 0, 
    0.012, 0.002, 0.003, 0.002, 0.006, 0, 0.008, 0.001, 0.061, 
    0.055, 0.076, 0.056, 0.056, 0.074, 0, 0, 0.018, 0.02, 0.015, 
    0.001, 0.064, 0, 0.012, 0.004, 0.009, 0, 0.399, 0.037, 0.002, 
    0.001, 0.116, 0, 0.139, 0, 0.004, 0.001, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0.001, 0, 0, 0, 0, 0, 0, 0, 0.001, 0.005, 
    0.001, 0, 0.002, 0)), row.names = c(NA, -90L), class = c("tbl_df", 
"tbl", "data.frame"))
new_df <- structure(list(daysincubated4 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 4L, 4L, 
4L, 4L, 4L, 4L), soil = c(6L, 6L, 12L, 12L, 2L, 2L, 1L, 1L, 14L, 
14L, 4L, 4L, 13L, 13L, 5L, 5L, 3L, 3L, 6L, 6L, 12L, 12L, 2L, 
2L, 1L, 1L, 14L, 14L, 4L, 4L, 6L, 6L, 12L, 12L, 2L, 2L, 1L, 1L, 
14L, 14L, 4L, 4L, 13L, 13L, 5L, 5L, 3L, 3L, 13L, 13L, 5L, 5L, 
3L, 3L), soil_type = c("SOC<10", "SOC<10", "SOC>10", "SOC>10", 
"SOC<10", "SOC<10", "SOC<10", "SOC<10", "SOC>10", "SOC>10", "SOC<10", 
"SOC<10", "SOC>10", "SOC>10", "SOC>10", "SOC>10", "SOC<10", "SOC<10", 
"SOC<10", "SOC<10", "SOC>10", "SOC>10", "SOC<10", "SOC<10", "SOC<10", 
"SOC<10", "SOC>10", "SOC>10", "SOC<10", "SOC<10", "SOC<10", "SOC<10", 
"SOC>10", "SOC>10", "SOC<10", "SOC<10", "SOC<10", "SOC<10", "SOC>10", 
"SOC>10", "SOC<10", "SOC<10", "SOC>10", "SOC>10", "SOC>10", "SOC>10", 
"SOC<10", "SOC<10", "SOC>10", "SOC>10", "SOC>10", "SOC>10", "SOC<10", 
"SOC<10"), compound = c("Acetone", "Acetaldehyde", "Acetone", 
"Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
"Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
"Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
"Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
"Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
"Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
"Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
"Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
"Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", 
"Acetone", "Acetaldehyde", "Acetone", "Acetaldehyde", "Acetone", 
"Acetaldehyde"), emission = c(0.0011, 0.001, 0.0094, 0.0042, 
0.0289, 0.0318, 0.0656, 0.0571, 0.0152, 0.0146, 0, 0.0033, 0.0247, 
0.0117, 0.0038, 3e-04, 0.0124, 0.0016, 8e-04, 1e-04, 0.0188, 
0.0139, 0.0728, 0.0818, 0.0883, 0.0731, 0.0251, 0.0774, 0, 0, 
0.0061, 2e-04, 0.0084, 7e-04, 0.061, 0.0551, 0.0761, 0.0559, 
0.0563, 0.0742, 0, 0, 0.3989, 0.0367, 0.1163, 4e-04, 0.1386, 
4e-04, 0.0461, 0.0495, 0.089, 0.001, 0.1414, 0.001)), row.names = c(NA, 
-54L), class = c("tbl_df", "tbl", "data.frame"))

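Best answer

If I understand correctly, you essentially need to estimate a confidence interval for the cumulative line of each soil type within each compound. If we look at your data: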

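# Cumulate within each soil x compound series first; cumsum() placed inside
# aes() would run over the whole data frame rather than per panel
df %>% group_by(soil, compound) %>% mutate(c_emission = cumsum(emission)) %>%
  ggplot(aes(x = days, y = c_emission, col = soil_type)) +
  geom_line() + facet_grid(compound ~ soil) + theme_bw()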

[Figure: cumulative emission over time for each soil, with compounds as rows and soils as columns]

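What you want is to estimate the error between lines of the same colour within the same row (compound). To do this, we exclude soils 1 and 14 from your example because they are incomplete: neither has a measurement at day 94. Essentially, we convert all your measurements into cumulative values, grouped by soil x compound.

We can then use stat_summary() to compute the mean and the standard error (SE):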
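As a rough sketch of the two steps described here (finding incomplete series, then cumulating within each series), the following uses a small made-up data frame. The names `toy`, `incomplete` and `toy_cum` are hypothetical and not part of the original data; for the question's `df` one would count by both `soil` and `compound`.

```r
library(dplyr)

# Hypothetical toy data: soil 3 has no day-94 measurement
toy <- data.frame(
  days     = c(0, 0, 0, 10, 10, 10, 94, 94),
  soil     = c(1, 2, 3, 1, 2, 3, 1, 2),
  emission = c(0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.00, 0.01)
)

n_days <- n_distinct(toy$days)

# Soils measured at fewer time points than the full design
incomplete <- toy %>%
  count(soil, name = "n_obs") %>%
  filter(n_obs < n_days) %>%
  pull(soil)

# Drop the incomplete series, then cumulate within each soil in time order
toy_cum <- toy %>%
  filter(!soil %in% incomplete) %>%
  arrange(soil, days) %>%
  group_by(soil) %>%
  mutate(c_emission = cumsum(emission)) %>%
  ungroup()

toy_cum
```

Sorting by time before `cumsum()` matters, because the running total depends on row order. Running a check like this on the real data is a quick way to confirm which series are incomplete before excluding any by hand.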

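# Drop the incomplete soils, cumulate within each soil x compound series,
# then let stat_summary() average across soils at each time point
df %>%
  filter(!soil %in% c(1, 14)) %>%
  group_by(soil, compound) %>%
  mutate(c_emission = cumsum(emission)) %>%
  ggplot(aes(x = days, y = c_emission, col = soil_type, fill = soil_type)) +
  stat_summary(geom = "line", fun = mean) +
  # mean_se() is what stat_summary() falls back to when no function is given
  stat_summary(geom = "ribbon", fun.data = mean_se, alpha = 0.05, linetype = "dotted") +
  facet_wrap(~compound)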

[Figure: mean cumulative emission per soil type with a ± 1 SE ribbon, one panel per compound]
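To see what such a ribbon represents, the per-time-point summary can also be computed by hand. This is only a sketch on made-up numbers; `toy_cum`, `band` and the values in them are hypothetical, with one soil type and three soils.

```r
library(dplyr)

# Hypothetical cumulative emissions for three soils of one soil type
toy_cum <- data.frame(
  days       = rep(c(0, 10, 94), each = 3),
  soil       = rep(1:3, times = 3),
  c_emission = c(0.01, 0.02, 0.03,
                 0.05, 0.07, 0.12,
                 0.05, 0.08, 0.14)
)

# Mean across soils at each time point, with a +/- 1 SE band
band <- toy_cum %>%
  group_by(days) %>%
  summarise(
    n_soils = n(),
    avg     = mean(c_emission),
    se      = sd(c_emission) / sqrt(n_soils),
    ymin    = avg - se,
    ymax    = avg + se
  )

band
```

A table like `band` could be passed to `geom_ribbon(aes(ymin = ymin, ymax = ymax))` directly, which is useful when the summary needs to be inspected or reported alongside the plot.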

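The plot above shows the mean at each time point ± 1 standard error. If you want a 95% confidence interval instead, pass mean_cl_normal as fun.data: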

# Same pipeline, but the ribbon now spans a t-based 95% confidence interval;
# mean_cl_normal() needs the Hmisc package to be installed
df %>%
  filter(!soil %in% c(1, 14)) %>%
  group_by(soil, compound) %>%
  mutate(c_emission = cumsum(emission)) %>%
  ggplot(aes(x = days, y = c_emission, col = soil_type, fill = soil_type)) +
  stat_summary(geom = "line", fun = mean) +
  stat_summary(geom = "ribbon", fun.data = mean_cl_normal, alpha = 0.05, linetype = "dotted") +
  facet_wrap(~compound)

[Figure: mean cumulative emission per soil type with a 95% CI ribbon, one panel per compound]
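The question also asked about plain error bars. One possible variant is sketched below under some assumptions: `mean_ci_t()` is a hypothetical helper, not a ggplot2 function, that computes a t-based 95% interval in base R so that Hmisc is not needed, and `toy_cum` is made-up data. The interval is built from the standard error, so it is not the same as mean ± 2 SD.

```r
library(ggplot2)

# Hypothetical helper: mean with a t-based confidence interval, returned in
# the y / ymin / ymax layout that stat_summary(fun.data = ...) expects
mean_ci_t <- function(x, level = 0.95) {
  n    <- length(x)
  m    <- mean(x)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  data.frame(y = m, ymin = m - half, ymax = m + half)
}

# Hypothetical cumulative emissions: two soil types, three soils each
toy_cum <- data.frame(
  days       = rep(c(0, 10, 94), each = 6),
  soil_type  = rep(rep(c("A", "B"), each = 3), times = 3),
  c_emission = c(0.01, 0.02, 0.03, 0.00, 0.01, 0.02,
                 0.05, 0.07, 0.12, 0.02, 0.03, 0.05,
                 0.05, 0.08, 0.14, 0.02, 0.03, 0.06)
)

p <- ggplot(toy_cum, aes(x = days, y = c_emission, colour = soil_type)) +
  stat_summary(geom = "line", fun = mean) +
  stat_summary(geom = "errorbar", fun.data = mean_ci_t, width = 3)

p
```

Swapping `geom = "errorbar"` for `geom = "ribbon"` (and adding a `fill` aesthetic) should give a band instead of bars with the same interval.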

The confidence intervals are wide because your standard errors are large. They will probably narrow once you have more samples.
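The effect of sample size can be illustrated with a short calculation. This is a sketch only: `ci_half_width()` is a hypothetical helper and the standard deviation of 0.05 is an arbitrary assumed value, not one estimated from the data.

```r
# Half-width of a t-based confidence interval for a mean,
# given a standard deviation and a number of replicates
ci_half_width <- function(sd, n, level = 0.95) {
  qt(1 - (1 - level) / 2, df = n - 1) * sd / sqrt(n)
}

n_soils <- c(3, 5, 10, 30)                     # hypothetical numbers of soils
widths  <- ci_half_width(sd = 0.05, n = n_soils)

data.frame(n_soils, half_width = widths)
```

For a fixed standard deviation the half-width shrinks as n grows, both because the standard error falls with the square root of n and because the t multiplier approaches 1.96.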

A similar question about "r - CI for cumulative values" can be found on Stack Overflow: https://stackoverflow.com/questions/66801918/
