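I'm a beginner. I need to generate a tibble in which each variable is grouped by a factor and described by its mean and standard deviation, separated by "±".
Let's use the iris dataset.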
library(dplyr)
library(tidyr)
iris %>%
  group_by(Species) %>%
  summarise(across(everything(), list(Mean = mean, dev.st = sd))) %>%
  pivot_longer(cols = -Species, names_to = c(".value", "variable"), names_sep = "_")
How should I continue from here? Thanks in advance.
Best answer
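You can use dplyr::summarise or, in dplyr 1.1.0 and later, the newer dplyr::reframe (which generalises dplyr::summarise to results with any number of rows per group), and add a combined summary statistic (comb) to your list of functions. The code below uses reframe: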
library(dplyr)
library(tidyr)
iris %>%
  group_by(Species) %>%
  reframe(across(everything(),
                 list(Mean = ~ as.character(mean(.x)),
                      dev.sd = ~ as.character(sd(.x)),
                      comb = ~ paste(mean(.x), sd(.x), sep = " ± ")))) %>%
  pivot_longer(cols = -Species, names_to = c(".value", "variable"),
               names_sep = "_")
# (from comment) if you only want the combined column, with both
# values formatted to two decimal places, you could adjust:
iris %>%
  group_by(Species) %>%
  reframe(across(everything(),
                 list(comb = ~ paste(sprintf("%.2f", mean(.x)),
                                     sprintf("%.2f", sd(.x)), sep = " ± ")))) %>%
  pivot_longer(cols = -Species, names_to = c(".value", "variable"),
               names_sep = "_")
#' In this case you get exactly the same result if you replace `reframe` with
#' `summarise`. `summarise` itself is not going away: since `dplyr` 1.1.0 only
#' its use for results with more or fewer than one row per group is
#' deprecated in favour of `reframe`.
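The equivalence of the two verbs for this one-row-per-group summary can be checked directly. The following is a minimal sketch, assuming dplyr 1.1.0 or later (where reframe is available); the helper name mean_sd is made up for this illustration and is not part of dplyr.

```r
library(dplyr)

# Hypothetical helper (not from dplyr): "mean ± sd", two decimal places each.
mean_sd <- function(x) {
  paste(sprintf("%.2f", mean(x)), sprintf("%.2f", sd(x)), sep = " ± ")
}

# Same summary computed once with reframe() and once with summarise().
with_reframe <- iris %>%
  group_by(Species) %>%
  reframe(across(everything(), list(comb = mean_sd)))

with_summarise <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), list(comb = mean_sd)))

# Compare the contents as plain data frames.
all.equal(as.data.frame(with_reframe), as.data.frame(with_summarise))
```

Both calls return one row per species, so which verb to use here is mostly a matter of taste.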
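Note that when this is combined with pivot_longer, all the elements that end up in the same column need to be of the same class, which is why they are converted to character. If you keep the result wide, you do not need the as.character() bits in the summary statistics.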
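As an illustration of the wide alternative mentioned above, here is a minimal sketch that skips pivot_longer, so the mean and standard deviation can stay numeric while only the combined column is character. The object name wide and the two-decimal formatting are choices made for this example.

```r
library(dplyr)

# Wide layout: one row per species, three columns per measured variable.
wide <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(),
                   list(Mean = mean,    # stays numeric
                        dev.sd = sd,    # stays numeric
                        comb = ~ paste(sprintf("%.2f", mean(.x)),
                                       sprintf("%.2f", sd(.x)),
                                       sep = " ± "))))

# Columns are named <variable>_<statistic>, e.g. Sepal.Length_Mean.
wide
```

Because no column mixes numbers and text in this layout, the numeric statistics remain usable for further calculations.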
Output (of the first snippet, with the Mean, dev.sd and comb rows)
  Species    variable Sepal.Length              Sepal.Width               Petal.Length              Petal.Width
  <fct>      <chr>    <chr>                     <chr>                     <chr>                     <chr>
1 setosa     Mean     5.006                     3.428                     1.462                     0.246
2 setosa     dev.sd   0.352489687213451         0.379064369096289         0.173663996480184         0.105385589380046
3 setosa     comb     5.006 ± 0.352489687213451 3.428 ± 0.379064369096289 1.462 ± 0.173663996480184 0.246 ± 0.105385589380046
4 versicolor Mean     5.936                     2.77                      4.26                      1.326
5 versicolor dev.sd   0.516171147063863         0.313798323378411         0.469910977239958         0.197752680004544
6 versicolor comb     5.936 ± 0.516171147063863 2.77 ± 0.313798323378411  4.26 ± 0.469910977239958  1.326 ± 0.197752680004544
7 virginica  Mean     6.588                     2.974                     5.552                     2.026
8 virginica  dev.sd   0.635879593274432         0.322496638172637         0.551894695663983         0.274650055636667
9 virginica  comb     6.588 ± 0.635879593274432 2.974 ± 0.322496638172637 5.552 ± 0.551894695663983 2.026 ± 0.274650055636667
Source: "r - Generating a descriptive statistics table separated by "±"", a question on Stack Overflow: https://stackoverflow.com/questions/76055595/