r - Is it possible to use tidyselect helpers with the cols_only() function?

I have a .csv file like this (except that the real .csv file has many more columns):

library(tidyverse)

tibble(id1 = c("a", "b"),
       id2 = c("c", "d"),
       data1 = c(1, 2),
       data2 = c(3, 4),
       data1s = c(5, 6), 
       data2s = c(7, 8)) %>% 
  write_csv("df.csv")

I only want id1, id2, data1, and data2.

I can do this:

df <- read_csv("df.csv", 
               col_names = TRUE,
               cols_only(id1 = col_character(),
                         id2 =  col_character(),
                         data1 = col_integer(),
                         data2 = col_integer()))

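However, as mentioned above, my real dataset has many more columns, so I would like to use tidyselect helpers to read only the specified columns and enforce the specified types.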

I tried this:

df2 <- read_csv("df.csv",
         col_names = TRUE,
         cols_only(starts_with("id") = col_character(),
                   starts_with("data") & !ends_with("s") =  col_integer()))

But the error message indicates a syntax problem. Is it possible to use tidyselect helpers like this?

Best answer

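My suggestion is somewhat roundabout, but it lets you customize the read specification based on "rules" rather than by listing every column explicitly.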

library(tidyverse)

tibble(id1 = c("a", "b"),
       id2 = c("c", "d"),
       data1 = c(1, 2),
       data2 = c(3, 4),
       data1s = c(5, 6), 
       data2s = c(7, 8)) %>% 
  write_csv("df.csv")

# Read a single row to build a spec with a minimal read; this is really just to get the column names
df_spec <- spec(read_csv("df.csv",
                         col_names = TRUE,
                         n_max = 1))

# Alter the spec with base R functions such as startsWith() / endsWith()
df_spec$cols <- imap(df_spec$cols, ~ {
  if (startsWith(.y, "id")) {
    col_character()
  } else if (startsWith(.y, "data") && !endsWith(.y, "s")) {
    col_integer()
  } else {
    col_skip()
  }
})

df <- read_csv("df.csv",
               col_types = df_spec$cols)
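
If you would rather write the rules with tidyselect helpers themselves, the same idea can be sketched with tidyselect::eval_select(), which resolves a selection expression against a data frame. This is only a sketch and not part of the original answer: it assumes the tidyselect package is installed, the helper name pick() is hypothetical, and it still needs a one-row read to learn the column names.

```r
library(readr)
library(tidyselect)

# Recreate the example file so the sketch is self-contained.
write_csv(
  data.frame(id1 = c("a", "b"), id2 = c("c", "d"),
             data1 = c(1, 2), data2 = c(3, 4),
             data1s = c(5, 6), data2s = c(7, 8)),
  "df.csv"
)

# Read a single row just to obtain the column names.
header <- read_csv("df.csv", n_max = 1, show_col_types = FALSE)

# Hypothetical helper: names of the columns matched by a tidyselect expression.
pick <- function(expr) names(eval_select(expr, header))

id_cols   <- pick(quote(starts_with("id")))
data_cols <- pick(quote(starts_with("data") & !ends_with("s")))

# Build cols_only(id1 = col_character(), ..., data2 = col_integer())
# from the matched names; every other column is skipped.
col_spec <- do.call(cols_only, c(
  setNames(rep(list(col_character()), length(id_cols)), id_cols),
  setNames(rep(list(col_integer()), length(data_cols)), data_cols)
))

df <- read_csv("df.csv", col_types = col_spec)
```

readr 2.0 and later also offer a col_select argument in read_csv() that accepts tidyselect expressions directly, but it only chooses columns and does not set their types, so the spec-building step above is still needed if the types must be enforced.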

Regarding "r - Is it possible to use tidyselect helpers with the cols_only() function?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73642144/
