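r - Error when building a function to apply Jenks natural breaks to my df columns

Tags: r function dplyr classification tidyverse

I am building a function that classifies data with BAMMtools::getJenksBreaks, and I want to apply it to many columns of my data frame. The function currently looks like this; it takes the column twice, once as a bare column name and once as a string: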

natural_breaks <- function(var, var_name) {
  var_breaks <- getJenksBreaks(base$var, k = 6)
  base <- base %>% 
    mutate("jenks_{var_name}" := case_when(var >= var_breaks[[1]] & var < var_breaks[[2]] ~ 1,
                                          var >= var_breaks[[2]] & var < var_breaks[[3]] ~ 2,
                                          var >= var_breaks[[3]] & var < var_breaks[[4]] ~ 3,
                                          var >= var_breaks[[4]] & var < var_breaks[[5]] ~ 4,
                                          var >= var_breaks[[5]] & var <= var_breaks[[6]] ~ 5))}

Unfortunately, when I test it, I get the following error:

Error: Problem with `mutate()` column `jenks_{var_name}`.
i `jenks_{var_name} = case_when(...)`.
x object 'cob_veg' not found

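I have three questions:

  • how to fix it;
  • how to save custom var_breaks objects named according to var or var_name;
  • how to later apply it only to specific columns of my data frame.

An example of the data frame is here: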

structure(list(cob_veg = c(0.86803147357506, 0.98597641788095, 
0.945304875294837, 0.959523866756455, 0.935088200938743, 0.955440071601913, 
0.780679878678589, 0.919061105749216, 0.754083690513821, 0.934638627368186, 
0.779504910469382, 0.932277169486821, 0.983666420737883, 0.969713065983134, 
0.570338049877308, 0.495524555659863, 0.638310245129726, 0.602560510080645, 
0.617785971417673, 0.546486582257036, 0.652756280297962, 0.845794912256059, 
0.860218927651217, 0.819250832367754, 0.878546966532212, 0.551594778584805, 
0.758434691089889, 0.643282494368923, 0.560988091494091, 0.630783245885837, 
0.390769402641844, 0.332421789733322, 0.132023929202626, 0.556019037194128, 
0.46900246096093, 0.654427041913071, 0.617484420038141, 0.264128687792105, 
0.37575984808437, 0.365848359971157, 0.367679682805015, 0.475147381473783, 
0.452224985040756, 0.169207032481509, 0.353810795376943, 0.244133134529422, 
0.343513908469291, 0.25343574732654, 0.465607919266776, 0.859254452617688, 
0.746039265099324, 0.200336275850391, 0.243934238080241, 0.161816742042312, 
0.192100452686832, 0.146047891493549, 0.185522248243521, 0.160885189854072, 
0.371260987816518, 0.596011452471618, 0.607939714389832, 0.582555506807736, 
0.411112999530761, 0.361981098565565, 0.44975008603926, 0.525150178038468, 
0.918128951197249, 0.559847757559022, 0.489984093017057, 0.521531521302737
), perc_usonat_apps = c(0.910200507534977, 0.991310264651337, 
0.990143087982175, 0.956236321704356, 0.959601690409758, 0.974772892917237, 
0.822648680592049, 0.978075628742136, 0.832034767628519, 0.967016851661925, 
0.836218849557041, 0.985103946810543, 0.933429138518985, 0.983499592446542, 
0.66132738832019, 0.588155046046737, 0.693938098572592, 0.722439144480843, 
0.655940658430979, 0.600827161082373, 0.68212195383472, 0.844365474743003, 
0.92925550865909, 0.751330431694333, 0.876136523118295, 0.530249841126731, 
0.741879811498003, 0.571757068800593, 0.508808389211847, 0.567307119497008, 
0.409787813217999, 0.337337435606789, 0.177280655911278, 0.540658093057356, 
0.483555499384771, 0.66203979962686, 0.541514881797238, 0.271047680639661, 
0.388127397267596, 0.348244505340865, 0.383153552695151, 0.410596274192153, 
0.454632302077619, 0.292281014838088, 0.403561577704534, 0.316380462359112, 
0.433421947597835, 0.320347734795976, 0.570434885562273, 0.905674567963832, 
0.738427042829166, 0.358799511159138, 0.435418632993027, 0.295484119265782, 
0.278005865102639, 0.314160999098798, 0.332986934479202, 0.326470971412635, 
0.503609748775341, 0.723882898890287, 0.745560426065341, 0.699198910239965, 
0.519508693857776, 0.466151407380638, 0.560914375310719, 0.591003074957121, 
0.889125746100216, 0.597221446183249, 0.568494315135315, 0.598214051715306
), erosao = c(0.281683063077801, 0.413077052212588, 0.239627807914039, 
0.446506926524651, 0.301132316630354, 0.234425909168424, 0.212642324970679, 
0.288834336987724, 0.261656313369531, 0.353375610212599, 0.312044950637136, 
0.276546984825996, 0.192396819630045, 0.176274043795615, 0.238457559260873, 
0.173051998384093, 0.290174096400216, 0.276213105709424, 0.207362518220635, 
0.203924857380293, 0.209070187490531, 0.188727285753061, 0.183283583795479, 
0.218105912638741, 0.116881993690697, 0.336937422277897, 0.26118317573317, 
0.379659011386182, 0.25180969630994, 0.368804147089116, 0.219280171411813, 
0.417073536525508, 0.303602581137798, 0.19855070054074, 0.269987408545423, 
0.119316814396703, 0.173289625687129, 0.334832844631985, 0.368770836563608, 
0.290468489928059, 0.203277757055697, 0.356586893882311, 0.518587800092028, 
0.505504655736967, 0.674924265431065, 0.718263665824309, 0.66065930517235, 
0.642373517412848, 0.535269517528344, 0.596488715811097, 0.725963746257344, 
0.244844846721053, 0.225104590539911, 0.259928107835643, 0.307226734945412, 
0.265653538111086, 0.282013668233342, 0.188069235825211, 0.424136757281733, 
0.190076913589606, 0.178295187946561, 0.203361993249055, 0.322632404617998, 
0.452358410930949, 0.22093094757563, 0.379344783768208, 0.612939094657735, 
0.581448918472422, 0.358680868765925, 0.136563757192298)), row.names = c(NA, 
-70L), class = c("tbl_df", "tbl", "data.frame"))

Thanks a lot in advance!

Best answer

If we pass the column name as a string, we can use .data; if we pass an unquoted variable, we can use {{}}.

# var: column name as a string, used to look up the values
# var_name: the same column, unquoted, used only to build the new column name
natural_breaks <- function(data, var, var_name) {
  breaks <- getJenksBreaks(data[[var]], k = 6)
  data %>%
    mutate("jenks_{{var_name}}" := case_when(
      .data[[var]] >= breaks[[1]] & .data[[var]] < breaks[[2]] ~ 1,
      .data[[var]] >= breaks[[2]] & .data[[var]] < breaks[[3]] ~ 2,
      .data[[var]] >= breaks[[3]] & .data[[var]] < breaks[[4]] ~ 3,
      .data[[var]] >= breaks[[4]] & .data[[var]] < breaks[[5]] ~ 4,
      .data[[var]] >= breaks[[5]] & .data[[var]] <= breaks[[6]] ~ 5
    ))
}
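
The function above covers the string case with .data. For the other case mentioned, passing an unquoted column, a minimal sketch might look like the following. The name natural_breaks_nse is hypothetical and not part of the original answer. The idea is that the question's version fails because base$var looks for a column literally called "var", and a plain var inside mutate() is evaluated as an ordinary object named cob_veg, which does not exist; embracing the argument with {{}} tells dplyr to treat it as a column instead.

```r
library(dplyr)
library(BAMMtools)

# Hypothetical variant: `var` is passed unquoted, e.g. natural_breaks_nse(base, cob_veg)
natural_breaks_nse <- function(data, var) {
  # pull() accepts the embraced column and returns its values as a plain vector
  breaks <- getJenksBreaks(pull(data, {{ var }}), k = 6)
  data %>%
    # "jenks_{{var}}" builds the new column name from the expression passed as `var`
    mutate("jenks_{{var}}" := case_when(
      {{ var }} >= breaks[[1]] & {{ var }} < breaks[[2]] ~ 1,
      {{ var }} >= breaks[[2]] & {{ var }} < breaks[[3]] ~ 2,
      {{ var }} >= breaks[[3]] & {{ var }} < breaks[[4]] ~ 3,
      {{ var }} >= breaks[[4]] & {{ var }} < breaks[[5]] ~ 4,
      {{ var }} >= breaks[[5]] & {{ var }} <= breaks[[6]] ~ 5
    ))
}
```

With this form a single argument is enough, since the same unquoted name supplies both the values and the suffix of the new column.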

-testing

library(dplyr) 
library(BAMMtools)
natural_breaks(base, "cob_veg", cob_veg) 

-output

# A tibble: 70 x 4
   cob_veg perc_usonat_apps erosao jenks_cob_veg
     <dbl>            <dbl>  <dbl>         <dbl>
 1   0.868            0.910  0.282             5
 2   0.986            0.991  0.413             5
 3   0.945            0.990  0.240             5
 4   0.960            0.956  0.447             5
 5   0.935            0.960  0.301             5
 6   0.955            0.975  0.234             5
 7   0.781            0.823  0.213             4
 8   0.919            0.978  0.289             5
 9   0.754            0.832  0.262             4
10   0.935            0.967  0.353             5
# … with 60 more rows
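
The other two points in the question (keeping the breaks for each column and classifying only selected columns) are not covered above. One possible sketch, assuming the breaks can live in a named list and that an integer class column is acceptable: classify_columns is a hypothetical helper, and it swaps the case_when() chain for base R's findInterval(), which assigns the same classes 1 to 5 when the breaks run from the column minimum to its maximum.

```r
library(dplyr)
library(BAMMtools)

# Hypothetical helper: classify several columns and keep the breaks used for each
classify_columns <- function(data, cols, k = 6) {
  # One break vector per column, stored in a list named after the columns
  breaks_list <- lapply(data[cols], getJenksBreaks, k = k)

  classed <- data %>%
    mutate(across(
      all_of(cols),
      # findInterval() returns i when breaks[i] <= x < breaks[i + 1];
      # rightmost.closed = TRUE places the maximum in the last class
      ~ findInterval(.x, breaks_list[[cur_column()]], rightmost.closed = TRUE),
      .names = "jenks_{.col}"
    ))

  list(data = classed, breaks = breaks_list)
}
```

A call such as res <- classify_columns(base, c("cob_veg", "erosao")) would then leave perc_usonat_apps unclassified, with the classified data in res$data and the breaks for one column available as res$breaks$cob_veg.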

For the original discussion of this question, see Stack Overflow: https://stackoverflow.com/questions/69050922/
