r - Unable to access string methods in dbplyr

Tags: r dbplyr

I am trying to use str_detect, str_replace, and str_replace_all in dbplyr with Oracle as the backend database, but these functions do not seem to be available.

This is the error:

db_tbl %>% mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% show_query()

Error: str_detect() is not available in this SQL variant

I have reinstalled all the packages, but it still does not work.
However, as far as I can see, this was implemented in dbplyr 1.2.0 (see here).

I also tried grepl, which translates to:
db_tbl %>% mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% show_query()

<SQL>
Named arguments ignored for SQL grepl
SELECT grepl("COMMENTS", '[^[:alnum:]]' AS "pattern") AS "COMMENTS_NEW"
FROM ("schema".table)

This also returns an error. Here is the traceback:

20. stop(structure(list(message = "<SQL> 'SELECT * FROM (SELECT \"COMMENTS\", \"TYPE_28\", grepl(\"COMMENTS\", '[^[:alnum:]]' AS \"pattern\") AS \"COMMENTS_NEW\"\nFROM (\"schema\".table) ) \"zzz3\" WHERE ROWNUM <= 6.0'\n nanodbc/nanodbc.cpp:1587: HY000: [Oracle][ODBC][Ora]ORA-00907: missing right parenthesis\n ", call = NULL, cppstack = NULL), class = c("odbc::odbc_error", "C++Error", "error", "condition")))
19. new_result(connection@ptr, statement)
18. OdbcResult(connection = conn, statement = statement)
17. dbSendQuery(con, sql)
16. dbSendQuery(con, sql)
15. db_collect.DBIConnection(x$src$con, sql, n = n, warn_incomplete = warn_incomplete)
14. db_collect(x$src$con, sql, n = n, warn_incomplete = warn_incomplete)
13. collect.tbl_sql(x, n = n)
12. collect(x, n = n)
11. as.data.frame(collect(x, n = n))
10. as.data.frame.tbl_sql(head(x, n + 1))
9. as.data.frame(head(x, n + 1))
8. trunc_mat(x, n = n, width = width, n_extra = n_extra)
7. format.tbl(x, ..., n = n, width = width, n_extra = n_extra)
6. format(x, ..., n = n, width = width, n_extra = n_extra)
5. paste0(..., "\n")
4. cat(paste0(..., "\n"), sep = "")
3. cat_line(format(x, ..., n = n, width = width, n_extra = n_extra))
2. print.tbl_sql(x)
1. function (x, ...) UseMethod("print")(x)


Here is my session:

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] dbplot_0.3.2           pool_0.1.4.2           dbplyr_1.4.2           DBI_1.0.0              odbc_1.1.6             data.table_1.11.8     
 [7] qdap_2.3.0             RColorBrewer_1.1-2     qdapTools_1.3.3        qdapRegex_0.7.2        qdapDictionaries_1.0.7 textclean_0.9.3       
[13] drlib_0.1.0            lubridate_1.7.4        ggrepel_0.8.0          fpp2_2.3               expsmooth_2.3          fma_2.3               
[19] forecast_8.5           recipes_0.1.5          textSummary_0.1.0      scales_1.0.0           janitor_1.1.1          forcats_0.3.0         
[25] stringr_1.4.0          dplyr_0.8.1            purrr_0.2.5            readr_1.2.1            tidyr_0.8.2            tibble_2.1.1          
[31] ggplot2_3.2.0          tidyverse_1.2.1       

loaded via a namespace (and not attached):
  [1] openNLPdata_1.5.3-4 colorspace_1.4-1    class_7.3-14        rprojroot_1.3-2     fs_1.2.6            base64enc_0.1-3    
  [7] rstudioapi_0.8      remotes_2.0.2       bit64_0.9-7         prodlim_2018.04.18  fansi_0.4.0         xml2_1.2.0         
 [13] splines_3.5.0       knitr_1.20          pkgload_1.0.2       jsonlite_1.6        venneuler_1.1-0     rJava_0.9-10       
 [19] broom_0.5.1         compiler_3.5.0      httr_1.3.1          backports_1.1.2     assertthat_0.2.1    Matrix_1.2-14      
 [25] lazyeval_0.2.2      cli_1.1.0           later_0.8.0         prettyunits_1.0.2   tools_3.5.0         igraph_1.2.2       
 [31] NLP_0.2-0           gtable_0.3.0        glue_1.3.1          reshape2_1.4.3      Rcpp_1.0.1          slam_0.1-43        
 [37] cellranger_1.1.0    fracdiff_1.4-2      urca_1.3-0          gdata_2.18.0        nlme_3.1-137        lmtest_0.9-36      
 [43] timeDate_3043.102   gower_0.1.2         gender_0.5.2        ps_1.2.1            xlsxjars_0.6.1      testthat_2.0.1     
 [49] rvest_0.3.2         devtools_2.0.1      gtools_3.8.1        XML_3.98-1.16       xlsx_0.6.1          MASS_7.3-49        
 [55] zoo_1.8-5           ipred_0.9-8         hms_0.4.2           parallel_3.5.0      yaml_2.2.0          quantmod_0.4-14    
 [61] curl_3.3            memoise_1.1.0       gridExtra_2.3       rpart_4.1-13        stringi_1.4.3       desc_1.2.0         
 [67] tseries_0.10-46     plotrix_3.7-4       TTR_0.23-4          pkgbuild_1.0.2      openNLP_0.2-6       lava_1.6.4         
 [73] chron_2.3-53        rlang_0.4.0         pkgconfig_2.0.2     bitops_1.0-6        lattice_0.20-35     processx_3.2.0     
 [79] bit_1.1-14          tidyselect_0.2.5    plyr_1.8.4          magrittr_1.5        R6_2.4.0            generics_0.0.2     
 [85] pillar_1.3.1        haven_2.0.0         withr_2.1.2         xts_0.11-2          survival_2.41-3     RCurl_1.95-4.11    
 [91] nnet_7.3-12         modelr_0.1.2        crayon_1.3.4        utf8_1.1.4          wordcloud_2.6       usethis_1.4.0      
 [97] grid_3.5.0          readxl_1.1.0        callr_3.0.0         blob_1.1.1          reports_0.1.4       digest_0.6.18      
[103] tm_0.7-5            munsell_0.5.0       sessioninfo_1.1.1   quadprog_1.5-5     

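Best answer

This is not a real answer, but a simple workaround.

The problem is that dbplyr cannot generate an adequate SQL clause (SQL has no function named str_detect or grepl), so it throws in the towel (and an error).

In both expressions you get an error because dbplyr can translate neither stringr::str_detect() nor base::grepl() into a valid SQL expression. One way to get almost what you want is to call collect() before mutate(). The following calls all fail, because the string function is evaluated while the table is still remote: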

db_tbl %>% 
  mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% 
  show_query()
db_tbl %>% 
  mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% 
  collect()
db_tbl %>% 
  mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% 
  show_query()
db_tbl %>% 
  mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% 
  collect()

However, if you put collect() first...
# The stray third argument "" is dropped here: str_detect() takes a pattern, not a replacement.
db_tbl %>% 
  collect() %>%
  mutate(COMMENTS_NEW = str_detect(COMMENTS, "[^[:alnum:]///' ]"))
db_tbl %>% 
  collect() %>%
  mutate(COMMENTS_NEW = grepl(COMMENTS, pattern = '[^[:alnum:]]'))

...your remote table becomes a local table, and you can apply str_detect() to it without trouble.
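Keep in mind that collect() downloads whatever the query returns at that point, so on a large table it can help to narrow the query before pulling it into R. The sketch below illustrates the idea under some assumptions: the in-memory SQLite table created with memdb_frame() (which needs the RSQLite package) is only a stand-in for the Oracle table, and the ID column and the sample rows are made up for illustration.

```r
library(dplyr)
library(dbplyr)
library(stringr)

# Hypothetical stand-in for the remote Oracle table (in-memory SQLite, needs RSQLite).
# The ID column and the sample values are invented for this example.
db_tbl <- memdb_frame(
  ID       = 1:3,
  COMMENTS = c("all good", "needs review!", "ok"),
  TYPE_28  = c("a", "b", "a")
)

local_tbl <- db_tbl %>%
  select(ID, COMMENTS) %>%  # still translated to SQL: only these columns are fetched
  collect() %>%             # from here on the data is a local tibble
  mutate(COMMENTS_NEW = str_detect(COMMENTS, "[^[:alnum:] ]"))

local_tbl
```

Everything before collect() runs in the database and everything after it runs in R, so any select() or filter() steps that dbplyr can translate are best placed before it.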

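As a side note, show_query() no longer makes sense after collect(), since the data is already local and there is no SQL query left to show.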
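If pulling the table into R is too expensive, another option may be to keep the work in the database. dbplyr passes function calls it does not recognise through to the generated SQL unchanged, so Oracle's own regular-expression functions can be written directly in the pipeline. The following is a minimal sketch rather than a tested solution: it uses lazy_frame() with simulate_oracle() so that it only generates SQL and needs no connection, the lf table is a stand-in for db_tbl, and the pattern is an assumption about what the question intended. REGEXP_REPLACE and REGEXP_LIKE are Oracle SQL functions, not R functions.

```r
library(dplyr)
library(dbplyr)

# Stand-in for db_tbl: a lazy table that only generates Oracle-flavoured SQL.
lf <- lazy_frame(COMMENTS = "some text", con = simulate_oracle())

# REGEXP_REPLACE is unknown to dbplyr, so it is emitted as-is in the SELECT list.
cleaned <- lf %>%
  mutate(COMMENTS_NEW = REGEXP_REPLACE(COMMENTS, "[^[:alnum:] ]", ""))

# REGEXP_LIKE is a condition in Oracle, so it is used in filter() (the WHERE clause).
flagged <- lf %>%
  filter(REGEXP_LIKE(COMMENTS, "[^[:alnum:] ]"))

show_query(cleaned)
show_query(flagged)
```

Because the arguments are passed positionally, no named argument ends up in the SQL, which avoids the AS "pattern" fragment that triggered ORA-00907 in the grepl attempt. The generated query should still be checked with show_query() against the real connection before relying on it.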

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/57773578/