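r - Problems running a paired t-test within a nested dplyr dataset

I've been through the vignette on row-wise operations in the new dplyr v1.0.0 and am intrigued by the possibilities of the nest_by function for modelling within different silos of a dataset.

However, I'm having trouble getting a repeated-measures analysis to work.

Here is an example to illustrate when it does work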

df1 <- data.frame(group = factor(rep(LETTERS[1:3],10)),
                  pred = factor(rep(letters[1:2],each=5,length.out=30)),
                  out = rnorm(30))

Now create the nesting according to the group variable.

library(dplyr)
nest1 <- df1 %>% nest_by(group)
nest1

We can view this new special nested data frame

# A tibble: 3 x 2
# Rowwise:  group
#   group               data
#   <fct> <list<tbl_df[,2]>>
# 1 A               [10 x 2]
# 2 B               [10 x 2]
# 3 C               [10 x 2]

Now we can perform operations on it, such as a linear regression of out on pred within each level of the original group variable.

mods <- nest1 %>% mutate(mod = list(lm(out ~ pred, data = data)))

In this new object we have added a new column, containing the lm() objects, to the original nested dataset

mods

#   # A tibble: 3 x 3
#   # Rowwise:  group
#   group               data mod   
#   <fct> <list<tbl_df[,2]>> <list>
#   1 A               [10 x 2] <lm>  
#   2 B               [10 x 2] <lm>  
#   3 C               [10 x 2] <lm>

And we can view the results of these models

library(broom)
mods %>% summarise(broom::tidy(mod))
#   A tibble: 6 x 6
#   Groups:   group [3]
#   group term        estimate std.error statistic  p.value
#   <fct> <chr>          <dbl>     <dbl>     <dbl>  <dbl>
# 1 A     (Intercept)   0.0684     0.295     0.232  0.823 
# 2 A     predb        -0.231      0.418    -0.553  0.595 
# 3 B     (Intercept)  -0.159      0.447    -0.356  0.731 
# 4 B     predb         0.332      0.633     0.524  0.615 
# 5 C     (Intercept)  -0.385      0.245    -1.57   0.154 
# 6 C     predb         0.891      0.346     2.58   0.0329

Now I'd like to be able to do the same thing, but with a repeated-measures t-test.

# dataset with grouping factor and two columns, each representing a measure at one of two timepoints
df2 <- data.frame(group = factor(rep(letters[1:3],10)),
                  t1 = rnorm(30),
                  t2 = rnorm(30))

# nest by grouping factor
nest2 <- df2 %>% nest_by(group)
nest2

# A tibble: 3 x 2
# Rowwise:  group
#   group               data
#   <fct> <list<tbl_df[,2]>>
# 1 a               [10 x 2]
# 2 b               [10 x 2]
# 3 c               [10 x 2]

Now, when I try to perform a paired t-test at each level of the new nested dataset, using a process similar to the one for the linear model...

mods2 <- nest2 %>% mutate(t = list(t.test(t1, t2, data = data)))

...I get the following error message

Error: Problem with `mutate()` input `t`.
x object 't1' not found
i Input `t` is `list(t.test(t1, t2, data = data))`.
i The error occured in row 1.
Run `rlang::last_error()` to see where the error occurred.

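Can anyone help me?

Best answer

The data argument is only used by the formula method of t.test. The default S3 method takes x and y as arguments and does not look them up in data, which is why t1 is not found. We can wrap the call in with()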

library(dplyr)
library(purrr)
nest2 %>%
      mutate(t = list(with(data, t.test(t1, t2))))
# A tibble: 3 x 3
# Rowwise:  group
#  group               data t      
#  <fct> <list<tbl_df[,2]>> <list> 
#1 a               [10 × 2] <htest>
#2 b               [10 × 2] <htest>
#3 c               [10 × 2] <htest>
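
Note that the call above runs t.test with its defaults, which gives Welch's two-sample test rather than a paired one. Since the question asks for a repeated-measures test, here is a minimal sketch that adds paired = TRUE. It assumes that each row of df2 holds the t1 and t2 measurements of the same subject; the seed and the name paired_mods are illustrative and not part of the original post.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
df2 <- data.frame(group = factor(rep(letters[1:3], 10)),
                  t1 = rnorm(30),
                  t2 = rnorm(30))

paired_mods <- df2 %>%
  nest_by(group) %>%
  # paired = TRUE tests the mean of the within-row differences t1 - t2
  mutate(t = list(with(data, t.test(t1, t2, paired = TRUE))))

# Each element of the list column is an htest object
paired_mods$t[[1]]$method
```

With 10 rows per group, each paired test has 9 degrees of freedom, which is a quick check that the pairing took effect.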

Or use the extractors ($, [[)

nest2 %>% 
    mutate(t = list(t.test(data$t1, data$t2)))
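
To view the test results in one table, as the question does for the lm models, the htest objects can probably be passed to broom::tidy in the same way. The following is a sketch under stated assumptions, not part of the original answer: the data are simulated with an arbitrary seed, the name res is illustrative, and paired = TRUE is added because the question asks for a paired test.

```r
library(dplyr)
library(broom)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
df2 <- data.frame(group = factor(rep(letters[1:3], 10)),
                  t1 = rnorm(30),
                  t2 = rnorm(30))

res <- df2 %>%
  nest_by(group) %>%
  # in a rowwise data frame, data refers to the nested tibble of the current row
  mutate(t = list(t.test(data$t1, data$t2, paired = TRUE))) %>%
  # tidy() returns a one-row data frame per htest, so this yields one row per group
  summarise(tidy(t))

res
```

The result has one row per level of group, with columns such as estimate, statistic and p.value.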

For r - Problems running a paired t-test within a nested dplyr dataset, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63310058/