I have a data frame:
df_1 <- data.frame(
  x = replicate(4, runif(30, 20, 100)),
  y = sample(1:3, 30, replace = TRUE)
)
The following code works:
library(tidyverse)
df_1 %>%
  select(-y) %>%
  rowwise() %>%
  mutate(var = sum(c(x.1, x.3)))
However, the following attempts (summing over all variables) do not work:
Using . :
df_1 %>%
  select(-y) %>%
  rowwise() %>%
  mutate(var = sum(.))
Using select_if:
df_1 %>%
  select(-y) %>%
  rowwise() %>%
  mutate(var = sum(select_if(., is.numeric)))
Both approaches return:
Source: local data frame [30 x 5]
Groups: <by row>
# A tibble: 30 x 5
x.1 x.2 x.3 x.4 var
<dbl> <dbl> <dbl> <dbl> <dbl>
1 32.7 42.7 50.1 20.8 7091.
2 75.9 71.3 83.6 77.6 7091.
3 49.6 28.7 97.0 59.7 7091.
4 47.4 96.1 31.9 79.7 7091.
5 54.2 47.1 81.7 41.6 7091.
6 27.9 58.1 97.4 25.9 7091.
7 61.8 78.3 52.6 67.7 7091.
8 85.4 51.3 38.8 82.0 7091.
9 27.9 72.6 68.9 25.2 7091.
10 87.2 42.1 27.6 73.9 7091.
# ... with 20 more rows
7091 is not the correct row sum; the same value is repeated in every row.
How can I fix this?
Best answer
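This can be done with purrr::pmap, which passes a list of arguments to a function that accepts "dots". Because most functions such as mean and sd work on a single vector, you need to pair the call with a domain lifter (here lift_vd, which turns a function that takes a vector into one that takes dots):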
df_1 %>% select(-y) %>% mutate( var = pmap(., lift_vd(mean)) )
# x.1 x.2 x.3 x.4 var
# 1 70.12072 62.99024 54.00672 86.81358 68.48282
# 2 49.40462 47.00752 21.99248 78.87789 49.32063
df_1 %>% select(-y) %>% mutate( var = pmap(., lift_vd(sd)) )
# x.1 x.2 x.3 x.4 var
# 1 70.12072 62.99024 54.00672 86.81358 13.88555
# 2 49.40462 47.00752 21.99248 78.87789 23.27958
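As a hedged alternative to the domain lifter: the lift_* helpers were deprecated in purrr 1.0.0, and pmap returns a list column rather than a numeric one. A minimal sketch, assuming a current purrr and dplyr, wraps the vector function in an anonymous function that collects the dots with c(...) and uses pmap_dbl so the result is a plain double column. The seed and the names row_mean and row_sd are illustrative only and not part of the original answer.

```r
library(dplyr)
library(purrr)

set.seed(1)  # hypothetical seed, only to make this sketch reproducible
df_1 <- data.frame(
  x = replicate(4, runif(30, 20, 100)),
  y = sample(1:3, 30, replace = TRUE)
)

res <- df_1 %>%
  select(-y) %>%
  mutate(
    # each row arrives as separate arguments; c(...) gathers them into one vector
    row_mean = pmap_dbl(., function(...) mean(c(...))),
    row_sd   = pmap_dbl(., function(...) sd(c(...)))
  )

head(res, 2)
```

Because pmap_dbl insists on a length-one double per row, it fails loudly if the function returns anything else, which a list column would silently accept.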
The sum function accepts dots directly, so you do not need to lift its domain:
df_1 %>% select(-y) %>% mutate( var = pmap(., sum) )
# x.1 x.2 x.3 x.4 var
# 1 70.12072 62.99024 54.00672 86.81358 273.9313
# 2 49.40462 47.00752 21.99248 78.87789 197.2825
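The difference between a function that accepts dots and one that expects a single vector can be seen in base R alone. This small sketch (variable names are illustrative) shows why sum can be handed to pmap as-is while mean cannot: mean treats only its first argument as data, and the remaining values are matched to its other parameters.

```r
s  <- sum(1, 2, 3)       # every argument is summed: 6
m1 <- mean(1, 2, 3)      # only the first argument is data: 1
m2 <- mean(c(1, 2, 3))   # values combined into one vector first: 2

print(c(sum = s, mean_dots = m1, mean_vector = m2))
```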
All of this fits into standard dplyr data processing, so the three calls can be combined as separate arguments to mutate:
df_1 %>%
  select(-y) %>%
  mutate(v1 = pmap(., lift_vd(mean)),
         v2 = pmap(., lift_vd(sd)),
         v3 = pmap(., sum))
# x.1 x.2 x.3 x.4 v1 v2 v3
# 1 70.12072 62.99024 54.00672 86.81358 68.48282 13.88555 273.9313
# 2 49.40462 47.00752 21.99248 78.87789 49.32063 23.27958 197.2825
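For completeness, a sketch of why the attempts in the question return the same number in every row, together with two alternatives that stay close to the original rowwise code. Inside a magrittr pipe, . refers to the entire data frame on the left-hand side, regardless of rowwise(), so sum(.) is the grand total of all cells. The example assumes dplyr 1.0.0 or later for c_across; the seed and the names grand_total, res and res_base are illustrative only.

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only to make this sketch reproducible
df_1 <- data.frame(
  x = replicate(4, runif(30, 20, 100)),
  y = sample(1:3, 30, replace = TRUE)
)

# What sum(.) computes in the question: the total over every cell
grand_total <- df_1 %>% select(-y) %>% as.matrix() %>% sum()

# Row-wise sum with c_across(), which respects rowwise() grouping
res <- df_1 %>%
  select(-y) %>%
  rowwise() %>%
  mutate(var = sum(c_across(everything()))) %>%
  ungroup()

# Vectorised base R equivalent, usually faster for plain sums
res_base <- df_1 %>%
  select(-y) %>%
  mutate(var = rowSums(.))
```

The per-row sums add up to grand_total, which matches the single repeated value seen in the question's output.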
Regarding r - applying `dplyr::rowwise` across all variables, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55922514/