I am looking for a simple way to transpose (rotate) a dplyr tibble summary.
Suppose I am doing something like this,
# install.packages(c("dplyr"), dependencies = TRUE)
library(dplyr)
mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)
  )
#> # A tibble: 2 x 6
#> am n Mean_disp Mean_hp Mean_qsec Mean_drat
#> <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 0 19 290.3789 160.2632 18.18316 3.286316
#> 2 1 13 143.5308 126.8462 17.36000 4.050000
However, what I want is output that looks more or less like this,
#> # A tibble: 5 x 2
#> am <dbl> 0 1
#> n <int> 19 13
#> Mean_disp <dbl> 290.3789 143.5308
#> Mean_hp <dbl> 160.2632 126.8462
#> Mean_qsec <dbl> 18.183158 17.36000
#> Mean_drat <dbl> 3.286316 4.050000
I realize I could use t(), but that converts the tibble to a matrix and messes up the formatting.
Best answer
Maybe gather and then spread again:
library(dplyr)
library(tidyr)
mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)) %>%
  gather(key = key, value = value, -am) %>%
  spread(key = am, value = value)
# # A tibble: 5 x 3
# key `0` `1`
# * <chr> <dbl> <dbl>
# 1 Mean_disp 290.378947 143.5308
# 2 Mean_drat 3.286316 4.0500
# 3 Mean_hp 160.263158 126.8462
# 4 Mean_qsec 18.183158 17.3600
# 5 n 19.000000 13.0000
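In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A minimal sketch of the same idea, assuming a recent tidyr is installed, might look like the following. The names key, value and rotated are only illustrative, carried over from the code above.

```r
library(dplyr)
library(tidyr)

rotated <- mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)
  ) %>%
  # Long format: one row per (am, statistic) pair
  pivot_longer(cols = -am, names_to = "key", values_to = "value") %>%
  # Wide format again, this time with one column per value of am
  pivot_wider(names_from = am, values_from = value)

rotated
```

Unlike spread(), pivot_wider() does not sort the rows by key, so the statistics should stay in the order they were written in summarise().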
Another option: gather before group_by, then take the mean of all selected columns, and then spread again (not sure how to add n() this way, though):
mtcars %>%
  select(am, disp, hp, qsec, drat) %>%
  gather(key = key, value = value, -am) %>%
  group_by(am, key) %>%
  summarise(myMean = mean(value)) %>%
  spread(key = am, value = myMean)
# # A tibble: 4 x 3
# key `0` `1`
# * <chr> <dbl> <dbl>
# 1 disp 290.378947 143.5308
# 2 drat 3.286316 4.0500
# 3 hp 160.263158 126.8462
# 4 qsec 18.183158 17.3600
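One possible way to get n() into this second approach is to compute the group counts separately, shape them like the long table of means, and stack the two before spreading. This is only a sketch, not something from the original answer. The helper names means, counts and with_n are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Per-group means in long format, as in the second approach
means <- mtcars %>%
  select(am, disp, hp, qsec, drat) %>%
  gather(key = key, value = value, -am) %>%
  group_by(am, key) %>%
  summarise(value = mean(value)) %>%
  ungroup()

# Per-group row counts, reshaped to the same three columns
# (count() names its result column n)
counts <- mtcars %>%
  count(am) %>%
  transmute(am, key = "n", value = as.numeric(n))

# Stack means and counts, then spread am into columns
with_n <- bind_rows(means, counts) %>%
  spread(key = am, value = value)

with_n
```

The counts are converted with as.numeric() so that the value column has a single type. The trade-off is the same as in the first approach: n is shown as a double rather than an integer.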
Source: "r - Simple way to rotate dplyr's tibble summary" on Stack Overflow: https://stackoverflow.com/questions/48282782/