r - How to find the tail rows of a data frame that satisfy set conditions?

Tags: r dplyr

An example of my data has the following structure:

Individ <- data.frame(Participant = c("Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Jane", "Jane", "Jane", "Jane", 
                                      "Jane", "Jane", "Jane", "Jane", "Jane", "Jane", "Jane", "Jane", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill"),  
                      Time = c(1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6),
                      Condition = c("Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Expr", "Expr", "Expr", "Expr", "Expr", "Expr", 
                                    "Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Expr", "Expr", "Expr", "Expr", "Expr", "Expr"),
                      Location = c("Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
                                   "Away", "Away", "Away", "Away", "Away", "Away"),
                      Power = c(400, 250, 180, 500, 300, 450, 600, 512, 300, 500, 450, 200, 402, 210, 130, 520, 310, 451, 608, 582, 390, 570, 456, 205))

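I want to find the last row for each Participant where Condition equals Placebo and Location equals Home. I will use it to check Power at the final time point, and then to examine the 10 rows before it, so getting the row number is important.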

I know I can find the last row for each participant with the following:

library(plyr)  # ddply() comes from the plyr package
ddply(Individ,.(Participant, Time, Condition),function(x) tail(x,1))

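However, my actual data frame is 4 million rows long, with more than 50 participants, and Power was collected at different time points. Is there a way to do this quickly, without a high computational cost?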

Cheers!

Best answer

Using data.table, we can convert the 'data.frame' to a 'data.table' (setDT(Individ)), group by 'Participant', put the logical condition in 'i' (Condition == 'Placebo' & Location == 'Home'), and subset the last observation of each group (tail(.SD, 1L) or .SD[.N]).

library(data.table)
setDT(Individ)[Condition=='Placebo' & Location=='Home', 
                             tail(.SD, 1L) ,.(Participant)]
#   Participant Time Condition Location Power
#1:        Bill    6   Placebo     Home   450
#2:        Jane    6   Placebo     Home   451
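
Since the question is also tagged dplyr, here is a rough dplyr equivalent of the same filter-then-group-then-take-last idea. It is only a sketch and not part of the original answer: the data frame is rebuilt with rep() so the block runs on its own, and the name last_rows is made up for illustration. No claim is made about how its speed compares with data.table on 4 million rows.

```r
library(dplyr)

# Same example data as in the question, rebuilt compactly with rep()
Individ <- data.frame(
  Participant = rep(c("Bill", "Jane", "Bill"), times = c(6, 12, 6)),
  Time        = rep(1:6, times = 4),
  Condition   = rep(c("Placebo", "Expr", "Placebo", "Expr"), each = 6),
  Location    = rep(c("Home", "Away"), times = c(18, 6)),
  Power       = c(400, 250, 180, 500, 300, 450, 600, 512, 300, 500, 450, 200,
                  402, 210, 130, 520, 310, 451, 608, 582, 390, 570, 456, 205)
)

# last_rows is a hypothetical name: one row per Participant,
# the last row (in data order) that satisfies both conditions
last_rows <- Individ %>%
  filter(Condition == "Placebo", Location == "Home") %>%
  group_by(Participant) %>%
  slice(n()) %>%       # keep the last row of each group
  ungroup()

print(last_rows)
```

Adding mutate(rn = row_number()) before the filter() step would carry the original row number along as well.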

If we need the row number, we can get it with .I

setDT(Individ)[Condition=='Placebo' & Location=='Home',
        c(rn = .I[.N],tail(.SD, 1L)) ,.(Participant)]
#    Participant rn Time Condition Location Power
#1:        Bill  6    6   Placebo     Home   450
#2:        Jane 18    6   Placebo     Home   451
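
The question mentions wanting to look at the 10 rows before each of these last rows. One possible way to use the row numbers for that is sketched below. It is an illustration under assumptions, not part of the original answer: "10 rows before" is taken to mean the 10 rows immediately above in the original table, whatever their Condition or Participant; the row number is stored in a column rn up front instead of being read from .I inside the grouped call; and the names DT, last_rows and windows are made up for the example.

```r
library(data.table)

# Same example data as in the question, rebuilt compactly with rep()
DT <- data.table(
  Participant = rep(c("Bill", "Jane", "Bill"), times = c(6, 12, 6)),
  Time        = rep(1:6, times = 4),
  Condition   = rep(c("Placebo", "Expr", "Placebo", "Expr"), each = 6),
  Location    = rep(c("Home", "Away"), times = c(18, 6)),
  Power       = c(400, 250, 180, 500, 300, 450, 600, 512, 300, 500, 450, 200,
                  402, 210, 130, 520, 310, 451, 608, 582, 390, 570, 456, 205)
)

# Store the original row number as a column so it survives the subset
DT[, rn := .I]

# Last matching row per Participant, using the .SD[.N] form
last_rows <- DT[Condition == "Placebo" & Location == "Home",
                .SD[.N], by = Participant]

# For each last row, take that row plus up to 10 rows directly above it
# (assumption: "before" means position in the table, clamped at row 1)
windows <- lapply(last_rows$rn, function(r) DT[max(1L, r - 10L):r])
names(windows) <- last_rows$Participant

print(windows$Jane)
```

In this small example Bill's last matching row is row 6, so only 5 earlier rows exist and his window has 6 rows. Jane's window covers rows 8 to 18 and therefore includes some of her Expr rows. If only rows from the same Participant and Condition are wanted, the window would need to be taken inside the filtered, grouped subset instead.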

Regarding "r - How to find the tail rows of a data frame that satisfy set conditions?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35403375/
