r - Run a dplyr operation only if a column exists

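Drawing on the discussion of conditional dplyr evaluation, I would like to conditionally execute a step in a pipeline depending on whether the referenced column exists in the data frame that is passed in.

Example

The results generated by 1) and 2) should be identical.

Existing column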

# 1)
mtcars %>% 
  filter(am == 1) %>%
  filter(cyl == 4)

# 2)
mtcars %>%
  filter(am == 1) %>%
  {
    if("cyl" %in% names(.)) filter(cyl == 4) else .
  }

Unavailable column

# 1)
mtcars %>% 
  filter(am == 1)

# 2)    
mtcars %>%
  filter(am == 1) %>%
  {
    if("absent_column" %in% names(.)) filter(absent_column == 4) else .
  }

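Problem

For an available column, the object that is passed on does not correspond to the initial data frame. The original code returns the error message:

Error in filter(cyl == 4) : object 'cyl' not found

I tried an alternative syntax (with no luck):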

mtcars %>%
  filter(am == 1) %>%
  {
    if("cyl" %in% names(.)) filter(.$cyl == 4) else .
  }

Error in UseMethod("filter_") : 
  no applicable method for 'filter_' applied to an object of class "logical"
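
Both attempts above appear to fail for the same reason: inside a `{ }` block, the magrittr pipe does not insert `.` as the first argument of nested calls, so `filter()` receives `cyl == 4` (or `.$cyl == 4`) as its data argument. A minimal sketch of one possible fix, assuming dplyr is attached, is to pass `.` to `filter()` explicitly:

```r
library(dplyr)

# Existing column: the data frame is passed explicitly as `.`,
# so `cyl == 4` is treated as the filter condition.
result_present <- mtcars %>%
  filter(am == 1) %>%
  {
    if ("cyl" %in% names(.)) filter(., cyl == 4) else .
  }

# Absent column: the else branch returns the data unchanged.
result_absent <- mtcars %>%
  filter(am == 1) %>%
  {
    if ("absent_column" %in% names(.)) filter(., absent_column == 4) else .
  }
```

Under these assumptions, `result_present` should match variant 1) for the existing column and `result_absent` should match variant 1) for the unavailable column.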

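Follow-up

I would like to extend this question to cover the evaluation of the right-hand side of the == call inside filter. For example, the following syntax attempts to filter on the first available value.

mtcars %>%
  filter({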
    if ("does_not_ex" %in% names(.))
      does_not_ex
    else
      NULL
  } == {
    if ("does_not_ex" %in% names(.))
      unique(.[['does_not_ex']])
    else
      NULL
  })

As expected, the call evaluates to an error message:

Error in filter_impl(.data, quo) : Result must have length 32, not 0

When applied to an existing column:

mtcars %>%
  filter({
    if ("mpg" %in% names(.))
      mpg
    else
      NULL
  } == {
    if ("mpg" %in% names(.))
      unique(.[['mpg']])
    else
      NULL
  })

It works, with a warning message:

  mpg cyl disp  hp drat   wt  qsec vs am gear carb
1  21   6  160 110  3.9 2.62 16.46  0  1    4    4

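Warning message:
In { : longer object length is not a multiple of shorter object length

Follow-up question

Is there a neat way to extend the existing syntax so that the right-hand side of the filter call is evaluated conditionally, ideally while staying within the dplyr workflow?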

Best answer

With across(), available in dplyr 1.0.0 and later, you can now use any_of when filtering. Compare with the original, where all columns are present:

mtcars %>% 
  filter(am == 1) %>% 
  filter(cyl == 4)

With cyl removed, an error is raised:

mtcars %>% 
  select(!cyl) %>% 
  filter(am == 1) %>% 
  filter(cyl == 4)

Using any_of (note that you have to write "cyl" rather than cyl):

mtcars %>% 
  select(!cyl) %>% 
  filter(am == 1) %>% 
  filter(across(any_of("cyl"), ~.x == 4))
#N.B. this is equivalent to just filtering by `am == 1`.
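
For the follow-up about the right-hand side of `==`, the length error arises because `NULL == NULL` evaluates to `logical(0)`, which has length 0 rather than one value per row. One way to sidestep this, sketched below, is to decide whether to filter at all before calling `filter()`. The helper name `filter_first_value` is hypothetical, and the sketch assumes that "first available value" means the first element of the column:

```r
library(dplyr)

# Hypothetical helper: keep the rows where `col` equals its first value,
# or return the data unchanged when `col` does not exist.
filter_first_value <- function(df, col) {
  if (!col %in% names(df)) {
    return(df)
  }
  # `.data[[col]]` looks the column up by its name given as a string.
  filter(df, .data[[col]] == first(.data[[col]]))
}

with_col <- mtcars %>% filter_first_value("mpg")
without_col <- mtcars %>% filter_first_value("does_not_ex")
```

Because the comparison is against a single value, no recycling takes place and the "longer object length" warning does not occur. The call still fits into a pipe, so the surrounding dplyr workflow is unchanged.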

Regarding "r - Run a dplyr operation only if a column exists", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45146688/
