r - Perform lm with multiple outcomes

Tags: r dplyr

I have the following data frame:

> df <- dput(df2)
structure(list(Economy = c("FRANCE", "FRANCE", "SPAIN", "SPAIN", 
"GREECE", "GREECE", "ITALY", "ITALY", "PORTUGAL", "PORTUGAL"), 
    ConditionA = c(9, 12, 12, 12, 12, 12, 13, 13, 12, 13), ConditionB = c(16, 
    16, 18, 21, 27, 27, 30, 36, 36, 36), ConditionC = c(27, 29, 
    31, 34, 41, 48, 52, 56, 56, 56), ConditionD = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_)), row.names = c(NA, 10L), class = "data.frame")


> df2
    Economy ConditionA ConditionB ConditionC ConditionD
1    FRANCE          9         16         27         NA
2    FRANCE         12         16         29         NA
3     SPAIN         12         18         31         NA
4     SPAIN         12         21         34         NA
5    GREECE         12         27         41         NA
6    GREECE         12         27         48         NA
7     ITALY         13         30         52         NA
8     ITALY         13         36         56         NA
9  PORTUGAL         12         36         56         NA
10 PORTUGAL         13         36         56         NA

I want to run a linear regression for each condition by country.

    df %>% 
      dplyr::select(-Economy) %>%  # drop the predictor, keep only the outcome columns
      map(~lm(.x ~ Economy , data =df, na.action = "na.omit"))  %>%
      map(summary)
    Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
      contrasts can be applied only to factors with 2 or more levels

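The original data frame has 188 conditions. What is going wrong?

Best answer

The error comes from ConditionD, which is entirely NA: once the missing rows are dropped, no observations remain, so Economy has no levels left to build contrasts from. Besides fixing ConditionD, you can also regress all conditions at once: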

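# keep only the Condition columns
M = df2[,grep("Condition",colnames(df2))]
# drop columns that are entirely NA (here ConditionD) and convert to a matrix
M = as.matrix(M[,colSums(!is.na(M))>0])

# a matrix response fits one regression per column (an "mlm" object)
fit = lm(M ~ Economy,data=df2)

# coefficient table for each response
lapply(summary(fit),coefficients)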

$`Response ConditionA`
                Estimate Std. Error  t value     Pr(>|t|)
(Intercept)         10.5  0.7071068 14.84924 2.505578e-05
EconomyGREECE        1.5  1.0000000  1.50000 1.939037e-01
EconomyITALY         2.5  1.0000000  2.50000 5.449010e-02
EconomyPORTUGAL      2.0  1.0000000  2.00000 1.019395e-01
EconomySPAIN         1.5  1.0000000  1.50000 1.939037e-01

$`Response ConditionB`
                Estimate Std. Error   t value     Pr(>|t|)
(Intercept)         16.0    1.50000 10.666667 0.0001253456
EconomyGREECE       11.0    2.12132  5.185450 0.0035093242
EconomyITALY        17.0    2.12132  8.013877 0.0004889171
EconomyPORTUGAL     20.0    2.12132  9.428090 0.0002265750
EconomySPAIN         3.5    2.12132  1.649916 0.1598731108

$`Response ConditionC`
                Estimate Std. Error   t value     Pr(>|t|)
(Intercept)         28.0   1.974842 14.178351 3.142696e-05
EconomyGREECE       16.5   2.792848  5.907948 1.978175e-03
EconomyITALY        26.0   2.792848  9.309493 2.406736e-04
EconomyPORTUGAL     28.0   2.792848 10.025608 1.688635e-04
EconomySPAIN         4.5   2.792848  1.611258 1.680400e-01
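If you would rather keep one separate lm per condition, as in the question, "fixing ConditionD" could look roughly like the sketch below. It uses base R only (lapply and reformulate instead of purrr::map), rebuilds the example data inline, and the names cond_cols, usable and fits are made up for illustration. Note that it only filters out columns that are entirely NA; a column with values for a single country would presumably still trigger the same contrasts error.

```r
# Rebuild the example data from the question
df2 <- data.frame(
  Economy    = rep(c("FRANCE", "SPAIN", "GREECE", "ITALY", "PORTUGAL"), each = 2),
  ConditionA = c(9, 12, 12, 12, 12, 12, 13, 13, 12, 13),
  ConditionB = c(16, 16, 18, 21, 27, 27, 30, 36, 36, 36),
  ConditionC = c(27, 29, 31, 34, 41, 48, 52, 56, 56, 56),
  ConditionD = NA_real_
)

# Names of all condition columns
cond_cols <- grep("Condition", names(df2), value = TRUE)

# Keep only the conditions with at least one non-missing value
usable <- cond_cols[colSums(!is.na(df2[cond_cols])) > 0]

# Fit one model per remaining condition;
# reformulate() builds formulas such as ConditionA ~ Economy
fits <- lapply(setNames(usable, usable), function(y) {
  lm(reformulate("Economy", response = y), data = df2)
})

# Coefficient table for each condition
lapply(fits, function(f) coef(summary(f)))
```

The result is a named list with one lm object per condition, so the usual tools (summary, coef, anova) can be applied to each element separately.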

I think tidy() from broom now works with this mlm object (the multi-response fit above):

library(broom)
tidy(fit)
# A tibble: 15 x 6
   response   term            estimate std.error statistic   p.value
   <chr>      <chr>              <dbl>     <dbl>     <dbl>     <dbl>
 1 ConditionA (Intercept)        10.5      0.707     14.8  0.0000251
 2 ConditionA EconomyGREECE       1.5      1.         1.5  0.194    
 3 ConditionA EconomyITALY        2.5      1.         2.50 0.0545   
 4 ConditionA EconomyPORTUGAL     2.       1.         2    0.102    
 5 ConditionA EconomySPAIN        1.5      1.         1.5  0.194    
 6 ConditionB (Intercept)        16.0      1.50      10.7  0.000125 
 7 ConditionB EconomyGREECE      11.       2.12       5.19 0.00351  
 8 ConditionB EconomyITALY       17.       2.12       8.01 0.000489 
 9 ConditionB EconomyPORTUGAL    20.       2.12       9.43 0.000227 
10 ConditionB EconomySPAIN        3.5      2.12       1.65 0.160    
11 ConditionC (Intercept)        28        1.97      14.2  0.0000314
12 ConditionC EconomyGREECE      16.5      2.79       5.91 0.00198  
13 ConditionC EconomyITALY       26.0      2.79       9.31 0.000241 
14 ConditionC EconomyPORTUGAL    28.0      2.79      10.0  0.000169 
15 ConditionC EconomySPAIN        4.50     2.79       1.61 0.168    
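If broom is not available, a similar long table can be assembled in base R from the per-response coefficient matrices. This is only a sketch: it is not how tidy() is implemented, the column names stay as lm reports them (Estimate, Std. Error, ...), and the names coefs and long are made up for illustration.

```r
# Rebuild the example data and the multi-response fit
df2 <- data.frame(
  Economy    = rep(c("FRANCE", "SPAIN", "GREECE", "ITALY", "PORTUGAL"), each = 2),
  ConditionA = c(9, 12, 12, 12, 12, 12, 13, 13, 12, 13),
  ConditionB = c(16, 16, 18, 21, 27, 27, 30, 36, 36, 36),
  ConditionC = c(27, 29, 31, 34, 41, 48, 52, 56, 56, 56)
)
M <- as.matrix(df2[c("ConditionA", "ConditionB", "ConditionC")])
fit <- lm(M ~ Economy, data = df2)

# One coefficient matrix per response, named "Response ConditionA", ...
coefs <- lapply(summary(fit), coefficients)

# Stack the matrices into one data frame with response and term columns
long <- do.call(rbind, lapply(names(coefs), function(nm) {
  data.frame(
    response = sub("^Response ", "", nm),  # strip the "Response " prefix
    term     = rownames(coefs[[nm]]),
    coefs[[nm]],
    row.names   = NULL,
    check.names = FALSE                    # keep names like "Std. Error"
  )
}))

long
```

With 188 conditions, a single long table like this is easier to filter or sort (for example by p-value) than a list of 188 separate matrices.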

Regarding r - Perform lm with multiple outcomes, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61371558/
