r - Splitting GLM variable names and values out of the term column

Tags: r dplyr glm broom

I am trying to split the term column into two columns: the variable used in the regression and the value of its category.

  library(MASS)
#> Warning: package 'MASS' was built under R version 3.5.1
  library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.5.1
#> 
#> Attaching package: 'dplyr'
#> The following object is masked from 'package:MASS':
#> 
#>     select
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
  library(broom)
#> Warning: package 'broom' was built under R version 3.5.1
as_tibble(Titanic) %>%  dplyr::mutate(y_n = if_else(Survived == "Yes", 1, 0)) %>% 
  glm(y_n ~ Class + n + Age + Sex, data = .) %>%  broom::tidy() %>%  print(n = 10)
#> # A tibble: 7 x 5
#>   term        estimate std.error statistic p.value
#>   <chr>          <dbl>     <dbl>     <dbl>   <dbl>
#> 1 (Intercept)  0.567    0.245       2.31    0.0294
#> 2 Class2nd    -0.00528  0.276      -0.0192  0.985 
#> 3 Class3rd     0.0503   0.279       0.180   0.858 
#> 4 ClassCrew    0.0740   0.283       0.262   0.796 
#> 5 n           -0.00106  0.000907   -1.16    0.255 
#> 6 AgeChild    -0.131    0.225      -0.582   0.566 
#> 7 SexMale      0.0833   0.208       0.401   0.692

Created on 2018-11-02 by the reprex package (v0.2.1)

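I need something like the output shown in the image attached to the original post, with the term column split into a variable column and a category column.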

Best answer

Perhaps the following is sufficient:

df <- as_tibble(Titanic) %>%  dplyr::mutate(y_n = if_else(Survived == "Yes", 1, 0))
m <- glm(y_n ~ Class + n + Age + Sex, data = df)

(trm <- attr(m$terms, "term.labels")) # Getting original variables
# [1] "Class" "n"     "Age"   "Sex"  
(asgn <- attr(model.matrix(m$formula, data = df), "assign")) # See ?model.matrix
# [1] 0 1 1 1 2 3 4 

library(stringr) # str_replace() comes from stringr, which is not loaded above
cbind(Term = trm[asgn[-1]], 
      Category = str_replace(names(coef(m)[-1]), trm[asgn[-1]], ""))
#      Term    Category
# [1,] "Class" "2nd"   
# [2,] "Class" "3rd"   
# [3,] "Class" "Crew"  
# [4,] "n"     ""      
# [5,] "Age"   "Child" 
# [6,] "Sex"   "Male" 

The intercept row is missing, but you can add it back if you want, in the case where asgn[1] == 0.
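As a rough sketch of how the intercept row could be kept, the variant below uses base R only. It is not part of the original answer: the names `variable`, `category`, and `out` are made up for illustration, the data is rebuilt with `as.data.frame()` instead of `as_tibble()`, and `substring()` stands in for `str_replace()`. It relies on the fact that the "assign" attribute is 0 for the intercept column.

```r
# Rebuild the data with base R (the original post uses as_tibble() and mutate()).
# stringsAsFactors = FALSE keeps the columns as character, as as_tibble() does.
df <- as.data.frame(Titanic, responseName = "n", stringsAsFactors = FALSE)
df$y_n <- ifelse(df$Survived == "Yes", 1, 0)
m <- glm(y_n ~ Class + n + Age + Sex, data = df)

trm  <- attr(terms(m), "term.labels")       # original variables
asgn <- attr(model.matrix(m), "assign")     # 0 marks the intercept column

# Map each coefficient to its variable; the intercept (asgn == 0) gets NA.
variable   <- c(NA_character_, trm)[asgn + 1]
coef_names <- names(coef(m))

# Drop the variable name from the front of each coefficient name.
category <- ifelse(asgn == 0, NA_character_,
                   substring(coef_names, nchar(variable) + 1))

out <- data.frame(term = coef_names, variable = variable,
                  category = category, estimate = unname(coef(m)),
                  stringsAsFactors = FALSE)
print(out)
```

Because `out` keeps the original `term` column, it could be joined back to the `broom::tidy()` output on `term` if the other statistics are needed. Numeric predictors such as `n` end up with an empty string in `category`.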

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/53127335/
