I have a data frame that looks like this:
df <- data.frame(
  age = rep(c("40-44", "45-49", "50-54", "55-59", "60-64"), 4),
  dep = rep(c("Dep1", "Dep2", "Dep3", "Dep4", "Dep5"), 4),
  ethnic = rep(c(rep("M", 5), rep("NM", 5)), 2),
  gender = c(rep("M", 10), rep("F", 10))
)
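I'm trying to generate descriptive statistics for many similar data frames, all from different sources, so that I can compare them.
I'm running the following code, wrapped as a function I can apply to multiple datasets, to get counts and proportions by sex, dep, ethnicity, age, and sex + ethnicity: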
Dems_fun <- function(data, sex, eth, dep, age) {
  Fun <- function(data, ...) {
    group_var <- quos(...)
    data %>%
      group_by(!!!group_var) %>%
      summarise(n = n()) %>%
      mutate(freq = n / sum(n)) %>%
      unite(dem, !!!group_var, sep = "_", remove = TRUE)
  }
  Sex <- Fun(data, sex)
  Sex_eth <- Fun(data, sex, eth)
  Eth <- Fun(data, eth)
  Dep <- Fun(data, dep)
  Age <- Fun(data, age)
  Dems <- rbind(Sex, Sex_eth, Eth, Dep, Age)
  colnames(Dems) <- c("Category", "count", "percentage")
  return(Dems)
}
When I run this function
test <- Dems_fun(df, gender, ethnic, dep, age)
I get the following error message:
Error in grouped_df_impl(data, unname(vars), drop) : Column `sex` is unknown
Can anyone tell me where I'm going wrong?
I looked at a similar question, "Error with using enquo for creating function with ddplyr", but I can't tell whether the same mistake applies to my example.
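Best answer
The only thing you're missing is that you need to enquo the column names passed to the function, and then unquote them (!!) later when you use them as function arguments. Without that step, the inner Fun sees the argument name sex rather than the column you supplied (gender), which is why the error reports that column sex is unknown. So you would do age_var <- enquo(age) and then use !!age_var to unquote it when you call Fun.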
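To make the difference concrete, here is a minimal sketch of what the inner helper receives with and without enquo. The function names label_first, outer_plain, and outer_enquo are hypothetical and exist only for this illustration; as_label() is used just to print the captured expression, assuming a reasonably recent rlang.

```r
library(rlang)

# Hypothetical helper: report the first expression captured through the dots.
label_first <- function(...) {
  as_label(quos(...)[[1]])
}

# Without enquo(): the helper captures the argument *name*, not the column.
outer_plain <- function(col) {
  label_first(col)
}

# With enquo() and !!: the expression the user typed is forwarded.
outer_enquo <- function(col) {
  col_var <- enquo(col)
  label_first(!!col_var)
}

outer_plain(gender)  # "col"
outer_enquo(gender)  # "gender"
```

The first call mirrors the situation in the question, where group_by ends up looking for a column literally named sex.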
library(tidyverse)
df <- data.frame(
  age = rep(c("40-44", "45-49", "50-54", "55-59", "60-64"), 4),
  dep = rep(c("Dep1", "Dep2", "Dep3", "Dep4", "Dep5"), 4),
  ethnic = rep(c(rep("M", 5), rep("NM", 5)), 2),
  gender = c(rep("M", 10), rep("F", 10))
)
Dems_fun <- function(data, sex, eth, dep, age) {
  # enquo all these variables
  sex_var <- enquo(sex)
  eth_var <- enquo(eth)
  dep_var <- enquo(dep)
  age_var <- enquo(age)
  Fun <- function(data, ...) {
    group_var <- quos(...)
    data %>%
      group_by(!!!group_var) %>%
      summarise(n = n()) %>%
      mutate(freq = n / sum(n)) %>%
      unite(dem, !!!group_var, sep = "_", remove = TRUE)
  }
  # unquote all these variables
  Sex <- Fun(data, !!sex_var)
  Sex_eth <- Fun(data, !!sex_var, !!eth_var)
  Eth <- Fun(data, !!eth_var)
  Dep <- Fun(data, !!dep_var)
  Age <- Fun(data, !!age_var)
  Dems <- rbind(Sex, Sex_eth, Eth, Dep, Age)
  colnames(Dems) <- c("Category", "count", "percentage")
  return(Dems)
}
Dems_fun(df, gender, ethnic, dep, age)
#> # A tibble: 18 x 3
#>    Category count percentage
#>    <chr>    <int>      <dbl>
#>  1 F           10        0.5
#>  2 M           10        0.5
#>  3 F_M          5        0.5
#>  4 F_NM         5        0.5
#>  5 M_M          5        0.5
#>  6 M_NM         5        0.5
#>  7 M           10        0.5
#>  8 NM          10        0.5
#>  9 Dep1         4        0.2
#> 10 Dep2         4        0.2
#> 11 Dep3         4        0.2
#> 12 Dep4         4        0.2
#> 13 Dep5         4        0.2
#> 14 40-44        4        0.2
#> 15 45-49        4        0.2
#> 16 50-54        4        0.2
#> 17 55-59        4        0.2
#> 18 60-64        4        0.2
Created on 2018-05-30 by the reprex package (v0.2.0).
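As a side note that is not part of the original answer: rlang 0.4.0 and later also provide the embrace operator {{ }}, which combines the enquo and !! steps for a single argument. The sketch below is only an illustration for the one-variable case; count_by is a hypothetical helper name, and it assumes a dplyr version recent enough to support {{ }}.

```r
library(dplyr)

df <- data.frame(
  age = rep(c("40-44", "45-49", "50-54", "55-59", "60-64"), 4),
  dep = rep(c("Dep1", "Dep2", "Dep3", "Dep4", "Dep5"), 4),
  ethnic = rep(c(rep("M", 5), rep("NM", 5)), 2),
  gender = c(rep("M", 10), rep("F", 10))
)

# Hypothetical helper: counts and proportions for one grouping column.
# {{ var }} captures the column the caller typed and unquotes it in place.
count_by <- function(data, var) {
  data %>%
    group_by({{ var }}) %>%
    summarise(n = n()) %>%
    mutate(freq = n / sum(n))
}

res <- count_by(df, gender)
print(res)
```

With the example data this gives one row per gender, each with a count of 10 and a proportion of 0.5, matching the first two rows of the answer's output.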
Source: the Stack Overflow question "r - Error in grouped_df_impl(data, unname(vars), drop): Column is unknown": https://stackoverflow.com/questions/50614988/