r - Creating a variable conditional on the sum of a variable by group

I have a data.table as follows:

panelID = c(1:50)   
year    = c(2001:2010)
country = c("NLD", "BEL", "GER")
urban   = c("A", "B", "C")
indust  = c("D", "E", "F")
sizes   = c(1, 2, 3, 4, 5)
n <- 2

library(data.table)

set.seed(123)
DT <- data.table(
    panelID = rep(sample(panelID), each = n),
    country = rep(sample(country, length(panelID), replace = T), each = n),
    year    = c(replicate(length(panelID), sample(year, n))),
    some_NA = sample(0:5, 6),                                             
    some_NA_factor = sample(0:5, 6),
    industry       = rep(sample(indust, length(panelID), replace = T), each = n),
    urbanisation   = rep(sample(urban, length(panelID), replace = T), each = n),
    size      = rep(sample(sizes, length(panelID), replace = T), each = n),
    norm      = round(runif(100)/10, 2),
    sales     = round(rnorm(10, 10, 10), 2),
    Happiness = sample(10, 10),
    Sex       = round(rnorm(10, 0.75, 0.3), 2),
    Age       = sample(100, 100),
    Educ      = round(rnorm(10, 0.75, 0.3), 2)
)        
DT [, uniqueID := .I]  # Creates a unique ID     
DT[DT == 0] <- NA 
DT$sales[DT$sales< 0] <- NA 
DT <- as.data.frame(DT)

What I want is the number of panelIDs for which the sum of size equals 8. So I thought I would do this:

DT[sum(size)==8, condition:=1, by=panelID]

What am I doing wrong here?

Best answer

Using data.table:

DT[,conditional:=ifelse(sum(size)==8,1,0),by=panelID][]
# To count the rows flagged TRUE (1), save the above as res:
# nrow(res[res[, conditional == 1], "panelID"])
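
A possible explanation of why the attempt in the question fails, sketched on a small made-up table (`toy` and its values are illustrative assumptions, not the question's data). First, the question's last line converts `DT` to a data.frame, so `:=` and `by` are unavailable until it is converted back, for example with `setDT()`. Second, data.table subsets rows with `i` before it groups with `by`, so a `sum(size)` written in `i` is the total over all rows rather than a per-group sum. Moving the test into `j` makes it run once per `panelID`.

```r
library(data.table)

# Toy data (hypothetical, for illustration only), starting as a data.frame
toy <- data.frame(panelID = c(1, 1, 2, 2, 3, 3),
                  size    = c(4, 4, 3, 3, 5, 3))

# Convert to a data.table by reference so that := and by work again
setDT(toy)

# Without by, sum(size) is a single number for the whole table
total <- toy[, sum(size)]

# In j together with by, sum(size) is evaluated once per panelID
per_group <- toy[, .(total_size = sum(size)), by = panelID]
print(per_group)
```

Here the overall total is 22, while the per-group sums are 8, 6 and 8, which is why the condition has to be evaluated inside `j`.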

Or simply, as @chinsoon12 suggested:

DT[, conditional := +(sum(size)==8), panelID]
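
As a brief sketch of what the shorter form does: the unary `+` coerces the logical result of `sum(size) == 8` to an integer 1 or 0, much like `as.integer()`, whereas `ifelse(..., 1, 0)` produces doubles. The table `toy` and the column names `flag_ifelse`, `flag_plus` and `flag_int` below are made up for illustration.

```r
library(data.table)

# Toy data (hypothetical): panelID 1 sums to 8, panelID 2 sums to 6
toy <- data.table(panelID = c(1, 1, 2, 2),
                  size    = c(4, 4, 3, 3))

# Three equivalent ways to flag groups whose size sums to 8
toy[, flag_ifelse := ifelse(sum(size) == 8, 1, 0), by = panelID]  # double 1/0
toy[, flag_plus   := +(sum(size) == 8),            by = panelID]  # integer 1/0
toy[, flag_int    := as.integer(sum(size) == 8),   by = panelID]  # integer 1/0

print(toy)
```

The three columns hold the same values; the only difference is the storage type, which rarely matters for a simple indicator.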

Result (head):

 panelID country year some_NA some_NA_factor industry urbanisation size norm sales
1:      31     GER 2010       4              1        F            C    4 0.09  5.63
2:      31     GER 2005       2             NA        F            C    4 0.03 13.31
3:      15     NLD 2005      NA              4        D            C    3 0.05    NA
4:      15     NLD 2008       1              5        D            C    3 0.01 12.12
5:      14     BEL 2003       5              3        E            B    1 0.09 22.37
6:      14     BEL 2002       3              2        E            B    1 0.04 30.38
   Happiness  Sex Age Educ uniqueID conditional
1:         7 0.69  62 0.25        1           1
2:         3 1.00  10 1.31        2           1
3:        10 0.66  59 0.73        3           0
4:         9 0.85  49 0.88        4           0
5:         2 0.34   7 0.90        5           0
6:         5 0.84  61 1.11        6           0
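
Since the question asks for the number of panelIDs rather than the number of rows, one way to get that count is sketched below. It assumes a made-up table `toy`; the names `n_ids`, `n_ids_direct` and `hit` are illustrative. Because each panelID occupies several rows, counting flagged rows would overstate the answer, so the sketch counts distinct IDs with data.table's `uniqueN()`.

```r
library(data.table)

# Toy data (hypothetical): panelIDs 1 and 3 sum to 8, panelID 2 sums to 6
toy <- data.table(panelID = c(1, 1, 2, 2, 3, 3),
                  size    = c(4, 4, 3, 3, 5, 3))

# Flag every row of a group whose size sums to 8
toy[, conditional := +(sum(size) == 8), by = panelID]

# Count distinct panelIDs among the flagged rows
n_ids <- uniqueN(toy[conditional == 1, panelID])

# Alternative without a helper column: one logical per group, then sum
n_ids_direct <- toy[, .(hit = sum(size) == 8), by = panelID][, sum(hit)]

print(n_ids)
print(n_ids_direct)
```

Both approaches give 2 for this toy table, while the number of flagged rows is 4.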

Regarding "r - Creating a variable conditional on the sum of a variable by group", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60993550/
