r - How can I shorten this long dplyr syntax?

In a tibble, I would like to correct some of the values taken by the variables nbeta_dep01, nbeta_dep02, ...

Below is a reproducible example of what I am doing.

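I would like to know whether there is a way to shorten the syntax (in my example, I copy and paste one correction instruction for each nbeta_depXX variable).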

suppressMessages(library(dplyr))

test <- tribble(
  ~ent, ~dep_impl, ~nbeta_dep01, ~nbeta_dep02, ~nbeta_dep03, ~nbeta_dep04, ~nbeta_dep05,
  "a",  "01",  0,   0,   0,   0,   0,  
  "b",  "03",  2,   0,   3,   0,   1,
  "c",  "05",  0,   0,   0,   1,   0,
  "d",  "02",  0,   0,   0,   0,   0
)

test %>% 
  rowwise() %>% 
  mutate(
    nbeta_dep01 = ifelse(
      nbeta_dep01==0 & nbeta_dep02==0 & nbeta_dep03==0 & nbeta_dep04==0 & nbeta_dep05==0 & dep_impl=="01",
      1,
      nbeta_dep01),
    nbeta_dep02 = ifelse(
      nbeta_dep01==0 & nbeta_dep02==0 & nbeta_dep03==0 & nbeta_dep04==0 & nbeta_dep05==0 & dep_impl=="02",
      1,
      nbeta_dep02),
    nbeta_dep03 = ifelse(
      nbeta_dep01==0 & nbeta_dep02==0 & nbeta_dep03==0 & nbeta_dep04==0 & nbeta_dep05==0 & dep_impl=="03",
      1,
      nbeta_dep03),
    nbeta_dep04 = ifelse(
      nbeta_dep01==0 & nbeta_dep02==0 & nbeta_dep03==0 & nbeta_dep04==0 & nbeta_dep05==0 & dep_impl=="04",
      1,
      nbeta_dep04),
  )
#> # A tibble: 4 x 7
#> # Rowwise: 
#>   ent   dep_impl nbeta_dep01 nbeta_dep02 nbeta_dep03 nbeta_dep04 nbeta_dep05
#>   <chr> <chr>          <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
#> 1 a     01                 1           0           0           0           0
#> 2 b     03                 2           0           3           0           1
#> 3 c     05                 0           0           0           1           0
#> 4 d     02                 0           1           0           0           0
Created on 2021-10-25 by the reprex package (v2.0.1)

Best answer

You could use

library(dplyr)
library(stringr)

test %>% 
  mutate(across(matches("dep\\d+$"), 
       ~ifelse(rowSums(across(nbeta_dep01:nbeta_dep05)) == 0 & dep_impl == str_extract(cur_column(), "\\d+$"),
               1,
               .x)))

# A tibble: 4 x 7
  ent   dep_impl nbeta_dep01 nbeta_dep02 nbeta_dep03 nbeta_dep04 nbeta_dep05
  <chr> <chr>          <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
1 a     01                 1           0           0           0           0
2 b     03                 2           0           3           0           1
3 c     05                 0           0           0           1           0
4 d     02                 0           1           0           0           0

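We use a regular expression to identify the columns to change: "dep\\d+$" matches every column whose name ends in "dep" followed by one or more digits. These columns are passed to across().
The ifelse() condition is simplified: because all nbeta_dep columns must be 0, we sum them row by row with rowSums(), selecting them with a nested across(), and test whether the sum is 0 (this assumes the counts are never negative). In addition, we check whether the digits at the end of the current column name, obtained with cur_column() and str_extract(), match the value in the dep_impl column.
If both conditions hold, we return 1; otherwise we return .x, the value already present in the current column and row.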
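
To make the explanation above more concrete, here is a minimal sketch that evaluates the building blocks on their own, outside of mutate(). The helper names dep_cols, suffix and all_zero are hypothetical and only used for illustration; the data is the question's test tibble.

```r
library(dplyr)
library(stringr)

# Same data as in the question
test <- tribble(
  ~ent, ~dep_impl, ~nbeta_dep01, ~nbeta_dep02, ~nbeta_dep03, ~nbeta_dep04, ~nbeta_dep05,
  "a",  "01",  0,   0,   0,   0,   0,
  "b",  "03",  2,   0,   3,   0,   1,
  "c",  "05",  0,   0,   0,   1,   0,
  "d",  "02",  0,   0,   0,   0,   0
)

# 1. Column names that the regular expression selects.
#    "dep_impl" is not selected because "dep" is not followed by digits at the end.
dep_cols <- str_subset(names(test), "dep\\d+$")
dep_cols
#> "nbeta_dep01" "nbeta_dep02" "nbeta_dep03" "nbeta_dep04" "nbeta_dep05"

# 2. Trailing digits of each column name; inside across(), cur_column()
#    supplies one of these names at a time.
suffix <- str_extract(dep_cols, "\\d+$")
suffix
#> "01" "02" "03" "04" "05"

# 3. Rows in which every nbeta_dep column is 0 (valid for non-negative counts).
all_zero <- rowSums(test[dep_cols]) == 0
all_zero
#> TRUE FALSE FALSE TRUE
```

Only rows "a" and "d" pass the all-zero check, and their dep_impl values ("01" and "02") pick out nbeta_dep01 and nbeta_dep02 respectively, which is why exactly those two cells become 1 in the output.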

Regarding "r - How can I shorten this long dplyr syntax?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69704385/
