r - dplyr: error message inside a function

Tags: r dplyr

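I wrote a function that subtracts some data by id. It worked fine until dplyr was updated. Originally the function did not accept a column name as an argument. I followed the "Programming with dplyr" vignette to adapt the function so that it accepts column names, but now I get a new error message.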

testdf <- structure(list(
  date = c("2016-04-04", "2016-04-04", "2016-04-04",
           "2016-04-04", "2016-04-04", "2016-04-04"),
  sensorheight = c(1L, 16L, 1L, 16L, 1L, 16L),
  farm = c("McDonald", "McDonald", "McDonald",
           "McDonald", "McDonald", "McDonald"),
  location = c("4", "4", "5", "5", "Outside", "Outside"),
  Temp = c(122.8875, 117.225, 102.0375, 98.3625, 88.5125, 94.7)),
  .Names = c("date", "sensorheight", "farm", "location", "Temp"),
  row.names = c(NA, 6L), class = "data.frame")


DailyInOutDiff <- function (df, variable) {

  DailyInOutDiff04 <- df %>%
    filter(location %in% c(4, 'Outside')) %>% 
    group_by(date, sensorheight, farm) %>%
    arrange(sensorheight, farm, location) %>%
    summarise(Diff = if(n()==1) NA else !!variable[location=="4"] - !!variable[location=='Outside'], 
              location = "4")  %>%
    select(1, 2, 3, 5, 4)

  DailyInOutDiff05 <- df %>%
    filter(location %in% c(5, 'Outside')) %>% 
    group_by(date, sensorheight, farm) %>%
    arrange(sensorheight, farm, location) %>%
    summarise(Diff = if(n()==1) NA else !!variable[location=="5"] - !!variable[location=='Outside'], 
              location = "5")  %>%
    select(1, 2, 3, 5, 4)

  temp.list <- list(DailyInOutDiff04, DailyInOutDiff05)
  final.df = bind_rows(temp.list)
  return(final.df)
}

test <- DailyInOutDiff(testdf, quo(Temp))

I would like to know what the error message means and how to fix it.

 Error in location == "4" : 
    comparison (1) is possible only for atomic and list types 

Best answer

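I think the precedence of ! is causing the problem. Because [ binds more tightly than !, the expression !!variable[location == "4"] is parsed as !!(variable[location == "4"]), so the subsetting is applied to the quosure before it is unquoted. When this happens, it looks like UQ should be used instead of !!.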
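To make the precedence point concrete, here is a small sketch that inspects how R parses the expression. It uses only base R (`quote()` and list-style indexing into the call object), so nothing dplyr-specific is assumed. The names `expr`, `operand`, and `fixed` are just illustrative.

```r
# Base R only: quote() captures the expression without evaluating it,
# so no dplyr or rlang is needed to see how the parser groups the operators.
expr <- quote(!!variable[location == "4"])

# The outermost call is `!`, applied to another `!` call ...
outer_fn <- as.character(expr[[1]])
inner    <- expr[[2]]
inner_fn <- as.character(inner[[1]])

# ... and the operand of the inner `!` is the whole subsetting call,
# i.e. `[` has already grabbed `variable` before `!!` gets to see it.
operand    <- inner[[2]]
operand_fn <- as.character(operand[[1]])
print(operand)

# With explicit parentheses, `[` becomes the outermost call instead,
# and `!!variable` is only its first argument.
fixed    <- quote((!!variable)[location == "4"])
fixed_fn <- as.character(fixed[[1]])
print(c(outer_fn, inner_fn, operand_fn, fixed_fn))
```

In the first expression the subsetting sits underneath both `!` calls, which is the grouping described above. In the parenthesized version the subsetting is outermost.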

With UQ, the first part of the function would look like this:

DailyInOutDiff <- function (df, variable) {

    variable = enquo(variable)

    df %>%
        filter(location %in% c(4, 'Outside')) %>% 
        group_by(date, sensorheight, farm) %>%
        arrange(sensorheight, farm, location) %>%
        summarise(Diff = if(n()==1) NA else UQ(variable)[location == "4"] - 
                    UQ(variable)[location == "Outside"], 
                location = "4")

}

This now runs without error.

DailyInOutDiff(testdf, Temp)

        date sensorheight     farm   Diff location
       <chr>        <int>    <chr>  <dbl>    <chr>
1 2016-04-04            1 McDonald 34.375        4
2 2016-04-04           16 McDonald 22.525        4

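I think using UQ is probably the best way to do this. Another option is to call the extraction bracket in its function form, which also sidesteps the precedence problem.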

For example, the code

!!variable[location == "4"]

can be rewritten as

`[`(!!variable, location == "4")

With these changes, the first part of the function looks like this:

DailyInOutDiff <- function (df, variable) {

    variable = enquo(variable)

    df %>%
        filter(location %in% c(4, 'Outside')) %>% 
        group_by(date, sensorheight, farm) %>%
        arrange(sensorheight, farm, location) %>%
        summarise(Diff = if(n()==1) NA else `[`(!!variable, location == "4") - 
                    `[`(!!variable, location == "Outside"), 
                location = "4")

}

This also runs without error:

DailyInOutDiff(testdf, Temp)

        date sensorheight     farm   Diff location
       <chr>        <int>    <chr>  <dbl>    <chr>
1 2016-04-04            1 McDonald 34.375        4
2 2016-04-04           16 McDonald 22.525        4
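A third way around the precedence issue is to put parentheses around the unquote, which may be worth considering because, as far as I know, `UQ()` has been deprecated in later rlang releases. The sketch below is not from the original question or answer. It assumes a dplyr/rlang version that supports `enquo()` and `!!`, and it introduces a hypothetical helper `in_out_diff()` with an extra `loc` argument so that both halves of the original function share one code path. It also uses `NA_real_` instead of `NA` to keep the `Diff` column's type consistent.

```r
library(dplyr)

# Same data as in the question, rebuilt with data.frame() for brevity.
testdf <- data.frame(
  date = "2016-04-04",
  sensorheight = c(1L, 16L, 1L, 16L, 1L, 16L),
  farm = "McDonald",
  location = c("4", "4", "5", "5", "Outside", "Outside"),
  Temp = c(122.8875, 117.225, 102.0375, 98.3625, 88.5125, 94.7),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not in the original post): inside-minus-outside
# difference of `variable` for a single inside location `loc`.
in_out_diff <- function(df, variable, loc) {
  variable <- enquo(variable)

  df %>%
    filter(location %in% c(loc, "Outside")) %>%
    group_by(date, sensorheight, farm) %>%
    # The parentheses force `!!variable` to be unquoted first,
    # so `[` subsets the column rather than the quosure.
    summarise(
      Diff = if (n() == 1) NA_real_ else
        (!!variable)[location == loc] - (!!variable)[location == "Outside"],
      location = loc
    )
}

# Apply the helper to both inside locations and stack the results,
# mirroring the bind_rows() step of the original function.
result <- bind_rows(
  in_out_diff(testdf, Temp, "4"),
  in_out_diff(testdf, Temp, "5")
)
print(result)
```

For location 4 this should reproduce the Diff values shown above (34.375 and 22.525). The rows for location 5 come from the same calculation applied to the second half of the data.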

Source: the original question, "r - dplyr: error message inside a function", on Stack Overflow: https://stackoverflow.com/questions/44531973/
