r - Conditionally use variables in data.table

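This is a small challenge within a larger project, so I'll try to keep it simple.

I'm trying to conditionally add columns to a data.table and then conditionally process them.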

library(data.table)

x <- TRUE
y <- data.table(a = 1:10, b = c(rep(1,5), rep(2,5)))

y[  # filter some rows
  a != 1
][  # conditionally add two calculated columns
  ,
  if(x){
    `:=` (
      c = a*b,
      d = 1/b
    )
  }
][  # process columns and group
  ,
  list(
    a = sum(a),
    b = sum(b),
    if(x) c = sum(c)  # only add c if it's created above
  ),
  by = if(x) list(b, d) else list(b)  # only group by d if it's created above
]

This is the output (the error refers to the second set of []):

Error in eval(expr, envir, enclos) : object 'd' not found
In addition: Warning message:
In deconstruct_and_eval(m, envir, enclos) :
  Caught and removed `{` wrapped around := in j. := and `:=`(...) are 
                defined for use in j, once only and in particular ways. See help(":=").

Of course, the error is a symptom of the warning. How can I get this to work?

As @Michal pointed out, moving the if() statement outside the data.table call is one option:

if(x) {
  y[
   ...
  ]
} else {
  y[
   ...
  ]
}

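I was hoping there is a way to do this without repeating the whole code, to keep everything simple.

Best answer

I can't think of a way to do this inside the j-expression, because := is evaluated there (it really only works when it sits at the root of the expression tree), but you can put the condition in the i-expression as a workaround: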

x = FALSE
y[a != 1][x, `:=`(c = a * b, d = 1/b)][]
#    a b
#1:  2 1
#2:  3 1
#3:  4 1
#4:  5 1
#5:  6 2
#6:  7 2
#7:  8 2
#8:  9 2
#9: 10 2

x = TRUE
y[a != 1][x, `:=`(c = a * b, d = 1/b)][]
#    a b  c   d
#1:  2 1  2 1.0
#2:  3 1  3 1.0
#3:  4 1  4 1.0
#4:  5 1  5 1.0
#5:  6 2 12 0.5
#6:  7 2 14 0.5
#7:  8 2 16 0.5
#8:  9 2 18 0.5
#9: 10 2 20 0.5

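Because c(1) is the same as c(1, NULL), c() can be used to return a complete vector when you are not sure how many elements will make it up.

Conditionally include columns in j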
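To make that point concrete, here is a minimal base R sketch. It relies on the fact that an `if` with a false condition and no `else` branch evaluates to NULL, which c() then drops. The helper names `group_cols` and `agg_cols` are made up for illustration and do not come from the original answer.

```r
# An `if` without `else` yields NULL when the condition is FALSE
is.null(if (FALSE) "d")          # TRUE

# c() silently drops NULL elements
identical(c(1), c(1, NULL))      # TRUE

# Hypothetical helper: build the grouping columns conditionally
group_cols <- function(x) c("b", if (x) "d")
group_cols(FALSE)                # "b"
group_cols(TRUE)                 # "b" "d"

# The same idea works for lists, which is what j needs
# (the values 1 and 2 are placeholders)
agg_cols <- function(x) c(list(a = 1), if (x) list(c = 2))
names(agg_cols(FALSE))           # "a"
names(agg_cols(TRUE))            # "a" "c"
```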

y[
  ,
  c(
    list(
      a = sum(a), 
      b = sum(b)
    ), 
    if(x) list(c = sum(c))
  )
]

And conditionally include columns in by

y[
  ,
  ...,
  by = c("b", if(x) "d")
]

by won't accept a vector of lists, but it will accept a vector of column names.
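Putting the pieces together, the following is one possible sketch of the whole pipeline, not necessarily how the answerers would write it. The wrapper `summarise_y` and the intermediate `z` are hypothetical names introduced here. The sketch makes two assumptions: it wraps only the := step in a plain if() around the whole `z[...]` call (so := stays at the root of j), and it leaves `b = sum(b)` out of j, because b is also a grouping column and would otherwise appear twice in the result.

```r
library(data.table)

# Hypothetical wrapper: x switches the extra columns on or off
summarise_y <- function(x) {
  y <- data.table(a = 1:10, b = c(rep(1, 5), rep(2, 5)))

  # Filter some rows; subsetting returns a new data.table
  z <- y[a != 1]

  # Conditionally add the two calculated columns by reference.
  # The if() wraps the whole call, so := remains at the root of j.
  if (x) z[, `:=`(c = a * b, d = 1 / b)]

  # Aggregate: c(...) drops the NULL produced when x is FALSE,
  # both in j (a list) and in by (a character vector)
  z[
    ,
    c(list(a = sum(a)), if (x) list(c = sum(c))),
    by = c("b", if (x) "d")
  ]
}

summarise_y(FALSE)  # columns: b, a
summarise_y(TRUE)   # columns: b, d, a, c
```

Only the conditional step is guarded, so the filtering and aggregation code is written once rather than duplicated across an if/else.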

Regarding "r - Conditionally use variables in data.table", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33834022/
