r - How to use purrr to iterate over every combination of covariates and outcomes in an lm regression

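A common situation for me is that I need to run essentially the same regression model for a series of different outcomes, and, for sensitivity analyses, I also need to iterate over different sets of covariates at the same time.

I'm still new to R. With the purrr code below I can iterate over outcomes and covariates, but it walks the two lists in parallel, pair by pair, whereas I need it to go through every combination of the two lists.

What are the options for iterating over all combinations of outcomes and covariates?

Also, does anyone know why the code below does not work with `map2`? I get the error message `Error in as_mapper(.f, ...) : argument ".f" is missing, with no default`.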

library(dplyr)
library(purrr)

dataset <- tibble(
    y1=rnorm(n=100),
    y2=rnorm(n=100),
    x1=rnorm(n=100),
    x2=rnorm(n=100))


outcomes <- dataset %>%
    select(y1,y2)

covars <- dataset %>%
    select(x1,x2)

paramlist <- list(covars, outcomes)

paramlist %>%
    pmap(~lm(.y ~ .x,data=dataset))

Best answer

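There are many ways to do this in the wider tidyverse. I'm a fan of dplyr::rowwise for this kind of computation. We can work with the column names instead of the actual data and then use tidyr::expand_grid to create a matrix-like tibble that contains every combination of outcomes and covariates. Then we can use dplyr::rowwise and call lm inside list(), together with reformulate, which accepts strings as input (term labels first, response second). To get the results, we can use broom::tidy.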
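As a quick illustration of the two building blocks mentioned above, here is a minimal sketch. The column names are the ones from the question; nothing else is assumed. It shows that reformulate() builds a formula from strings, with the term labels as the first argument and the response as the second, and that expand_grid() returns one row per combination.

```r
library(tidyr)

# reformulate(termlabels, response): strings in, formula out
f_single <- reformulate("x1", response = "y1")           # y1 ~ x1
f_multi  <- reformulate(c("x1", "x2"), response = "y1")  # y1 ~ x1 + x2

# every combination of the two character vectors, one row each
grid <- expand_grid(outcomes = c("y1", "y2"), covars = c("x1", "x2"))

print(f_single)
print(f_multi)
print(grid)
```

Because the response is the second argument, it is easy to swap the two by accident; naming the `response` argument makes the intent explicit.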

library(dplyr)
library(purrr)
library(tidyr)

dataset <- tibble(
  y1=rnorm(n=100),
  y2=rnorm(n=100),
  x1=rnorm(n=100),
  x2=rnorm(n=100))

outcomes <- dataset %>%
  select(y1,y2) %>% colnames

covars <- dataset %>%
  select(x1,x2) %>% colnames

paramlist <- expand_grid(outcomes, covars)

paramlist %>%
  rowwise %>% 
  # reformulate() takes the term labels first and the response second
  mutate(mod = list(lm(reformulate(covars, response = outcomes), data = dataset)),
         res = list(broom::tidy(mod)))

#> # A tibble: 4 x 4
#> # Rowwise: 
#>   outcomes covars mod    res             
#>   <chr>    <chr>  <list> <list>          
#> 1 y1       x1     <lm>   <tibble [2 x 5]>
#> 2 y1       x2     <lm>   <tibble [2 x 5]>
#> 3 y2       x1     <lm>   <tibble [2 x 5]>
#> 4 y2       x2     <lm>   <tibble [2 x 5]>

Created on 2021-09-06 by the reprex package (v2.0.1)
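The `res` column above is a list-column of tidy tibbles. One possible way to turn it into a single flat table of coefficients is tidyr::unnest. This is a sketch, not part of the original answer: the `set.seed()` call and the name `results` are assumptions added for illustration, and it assumes the broom package is installed.

```r
library(dplyr)
library(purrr)
library(tidyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
dataset <- tibble(
  y1 = rnorm(n = 100),
  y2 = rnorm(n = 100),
  x1 = rnorm(n = 100),
  x2 = rnorm(n = 100))

results <- expand_grid(outcomes = c("y1", "y2"), covars = c("x1", "x2")) %>%
  mutate(
    # one model per row: outcome ~ covariate
    mod = map2(outcomes, covars,
               ~ lm(reformulate(.y, response = .x), data = dataset)),
    res = map(mod, broom::tidy)
  ) %>%
  select(-mod) %>%   # drop the model objects before flattening
  unnest(res)        # one row per model term

print(results)
```

Each of the four models contributes an intercept row and a slope row, so the flat table can be filtered on `term` to compare the slopes across outcomes.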

We can do the same thing with {purrr} instead of dplyr::rowwise:

paramlist %>% 
  mutate(mod = map2(outcomes, covars, ~ lm(reformulate(.y, .x), data = dataset)),
         res = map(mod, broom::tidy)) 

#> # A tibble: 4 x 4
#>   outcomes covars mod    res             
#>   <chr>    <chr>  <list> <list>          
#> 1 y1       x1     <lm>   <tibble [2 x 5]>
#> 2 y1       x2     <lm>   <tibble [2 x 5]>
#> 3 y2       x1     <lm>   <tibble [2 x 5]>
#> 4 y2       x2     <lm>   <tibble [2 x 5]>

Created on 2021-09-06 by the reprex package (v2.0.1)
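On the `map2` error from the question: the likely cause is that piping a list into `map2()` passes the whole list as `.x`, so the lambda lands in the `.y` slot and `.f` is never supplied. The sketch below reproduces this with plain strings instead of models. The names `params`, `err`, `paired` and `paired_p` are hypothetical, and the exact wording of the error may differ between purrr versions.

```r
library(dplyr)  # loaded for %>%
library(purrr)

params <- list(c("x1", "x2"), c("y1", "y2"))

# Equivalent to map2(params, ~ ...): params is .x, the lambda is .y, .f is missing
err <- tryCatch(
  params %>% map2(~ paste(.y, "~", .x)),
  error = function(e) e
)
print(inherits(err, "error"))

# map2() wants the two vectors as separate arguments
paired <- map2(params[[1]], params[[2]], ~ paste(.y, "~", .x))

# pmap() takes the list itself; .x and .y refer to its first two elements
paired_p <- pmap(params, ~ paste(.y, "~", .x))

print(paired)
```

Both working calls still iterate in parallel (first with first, second with second), which is why the combinations have to be built beforehand, for example with expand_grid.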

Another pure {purrr} solution is to use a nested map call. Because the result is nested, we need to flatten it before we can use map(summary) on it.

# outcomes and covars are the same strings as above

outcomes %>% 
  map(~ map(covars, function(.y) lm(reformulate(.y, .x), data = dataset))) %>% 
  flatten %>% 
  map(summary)

#> [[1]]
#> 
#> Call:
#> lm(formula = reformulate(.y, .x), data = dataset)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -2.20892 -0.56744 -0.08498  0.55445  2.10146 
#> 
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.0009328  0.0923062  -0.010    0.992
#> x1          -0.0809739  0.0932059  -0.869    0.387
#> 
#> Residual standard error: 0.9173 on 98 degrees of freedom
#> Multiple R-squared:  0.007643,   Adjusted R-squared:  -0.002483 
#> F-statistic: 0.7548 on 1 and 98 DF,  p-value: 0.3871
#> 
#> 
#> [[2]]
#> 
#> Call:
#> lm(formula = reformulate(.y, .x), data = dataset)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -2.11442 -0.59186 -0.08153  0.61642  2.10575 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.02048    0.09461  -0.216    0.829
#> x2          -0.05159    0.10805  -0.477    0.634
#> 
#> Residual standard error: 0.9197 on 98 degrees of freedom
#> Multiple R-squared:  0.002321,   Adjusted R-squared:  -0.007859 
#> F-statistic: 0.228 on 1 and 98 DF,  p-value: 0.6341
#> 
#> 
#> [[3]]
#> 
#> Call:
#> lm(formula = reformulate(.y, .x), data = dataset)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -2.3535 -0.7389 -0.2023  0.6236  3.8627 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.08178    0.10659  -0.767    0.445
#> x1          -0.08476    0.10763  -0.788    0.433
#> 
#> Residual standard error: 1.059 on 98 degrees of freedom
#> Multiple R-squared:  0.006289,   Adjusted R-squared:  -0.003851 
#> F-statistic: 0.6202 on 1 and 98 DF,  p-value: 0.4329
#> 
#> 
#> [[4]]
#> 
#> Call:
#> lm(formula = reformulate(.y, .x), data = dataset)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -2.4867 -0.7020 -0.1935  0.5869  3.7574 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.06575    0.10875  -0.605    0.547
#> x2           0.12388    0.12420   0.997    0.321
#> 
#> Residual standard error: 1.057 on 98 degrees of freedom
#> Multiple R-squared:  0.01005,    Adjusted R-squared:  -5.162e-05 
#> F-statistic: 0.9949 on 1 and 98 DF,  p-value: 0.321

Created on 2021-09-06 by the reprex package (v2.0.1)
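The question also mentions iterating over different sets of covariates for sensitivity analyses. Since reformulate() accepts a character vector of term labels, one way to extend the approach is to keep the sets in a named list and expand over the set names. This is only a sketch: the set names `base` and `full`, the objects `covar_sets` and `models`, and the `set.seed()` call are assumptions made for illustration.

```r
library(dplyr)
library(purrr)
library(tidyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
dataset <- tibble(
  y1 = rnorm(n = 100),
  y2 = rnorm(n = 100),
  x1 = rnorm(n = 100),
  x2 = rnorm(n = 100))

# hypothetical covariate sets, one character vector per set
covar_sets <- list(base = "x1", full = c("x1", "x2"))

models <- expand_grid(outcome = c("y1", "y2"), set = names(covar_sets)) %>%
  mutate(
    # look up the covariates of the set by name and build outcome ~ x1 + ...
    mod = map2(outcome, set,
               ~ lm(reformulate(covar_sets[[.y]], response = .x), data = dataset))
  )

print(models)
```

Keeping the set name as an ordinary character column makes it easy to see which specification each model belongs to when the results are summarised later.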

Regarding "r - How to use purrr to iterate over every combination of covariates and outcomes in an lm regression", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69077597/
