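I created a custom function in R to prepare data for plotting. I pass a data frame and two of its columns to my function and then use dplyr. The function needs to group by a categorical variable (age.group in this example) and, while the data is still grouped, create a binned version of a continuous variable (to.be.binned) and get the count for that group. I'm trying to use mutate for both tasks.
The code in this function works outside of a function, but here I am passing the data frame and the variables into the function (using curly-curly braces, since it's dplyr).
I get the following error:
Error: Column `"age.group"` can't be modified because it's a grouping variable
I don't think my code does anything to modify this variable. I need the count by group so that I can get the percentage within each group, so I can't ungroup first (which was the advice given to others who hit the same error).
Any suggestions would be much appreciated!
Reprex: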
library(tidyverse)
simple.df <- data.frame(
age.group = c("18-30","Under 18","Over 30",
"Over 30","Over 30","Under 18","18-30","18-30",
"Over 30","Under 18","18-30","18-30","18-30","18-30",
"Under 18","18-30","Under 18","18-30","Under 18",
"Under 18","Under 18","Over 30","Over 30","Over 30",
"Over 30","Over 30","18-30","Under 18","Over 30",
"Under 18"),
to.be.binned = c(98.415794,32.35116,73.29943,
81.92012,99.61144,29.665798,97.652885,94.94358,
77.798035,24.110243,99.110245,98.415794,99.80469,94.24913,
79.665794,98.415794,72.02691,96.332466,94.94358,
97.02691,97.02691,92.860245,98.415794,97.02691,
90.082466,99.110245,99.80469,98.415794,99.55236,99.110245)
)
bin_by_group <- function(df, my.grouping, bin.this) {
  bw = 25
  new.df <- df %>%
    group_by({{my.grouping}}) %>%
    mutate(this.binned = cut(as.numeric({{bin.this}}),
                             breaks = seq(0, 100, bw),
                             labels = seq(0 + bw, 100, bw)-(bw/2)),
           n = n()) %>%
    group_by({{my.grouping}}, this.binned) %>%
    summarise(p = n()/n[1]) %>%
    ungroup() %>%
    mutate(this.binned = as.numeric(as.character(this.binned)))
  return(new.df)
}
test.df <- bin_by_group(simple.df, "age.group", "to.be.binned")
#> Warning in cut(as.numeric(~"to.be.binned"), breaks = seq(0, 100, bw), labels =
#> seq(0 + : NAs introduced by coercion
#> Error: Column `"age.group"` can't be modified because it's a grouping variable
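Best answer
We just need to pass the arguments unquoted. {{ }} expects bare column names, because {{ }} is equivalent to enquo() + !!. When a string is passed instead, it is treated as a literal value rather than a column, which is why as.numeric("to.be.binned") produced NAs in the warning above.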
bin_by_group(simple.df, age.group, to.be.binned)
-output
# A tibble: 7 x 3
# age.group this.binned p
# <chr> <dbl> <dbl>
#1 18-30 87.5 1
#2 Over 30 62.5 0.1
#3 Over 30 87.5 0.9
#4 Under 18 12.5 0.1
#5 Under 18 37.5 0.2
#6 Under 18 62.5 0.1
#7 Under 18 87.5 0.6
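To see the difference between a bare name and a string in isolation, here is a minimal sketch. The data frame `toy` and the helpers `n_groups_embrace()` and `n_groups_enquo()` are hypothetical names made up for illustration; they are not part of the question's code.

```r
library(dplyr)

# Hypothetical toy data, for illustration only.
toy <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))

# Embracing the argument with {{ }} ...
n_groups_embrace <- function(df, grp) {
  df %>% group_by({{ grp }}) %>% n_groups()
}

# ... is shorthand for capturing it with enquo() and unquoting it with !!.
n_groups_enquo <- function(df, grp) {
  grp <- enquo(grp)
  df %>% group_by(!!grp) %>% n_groups()
}

n_groups_embrace(toy, g)    # bare name: groups by the existing column g
n_groups_enquo(toy, g)      # same grouping as the embraced version
n_groups_embrace(toy, "g")  # string: groups by a constant, so a single group
```

With the bare name, both helpers report two groups. With the string, every row falls into one group, because dplyr groups by the constant value "g" instead of the column g.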
If we want to be able to pass either quoted or unquoted values, convert them to symbols with ensym() and then evaluate them with !!
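bin_by_group <- function(df, my.grouping, bin.this) {
  bw = 25
  # ensym() accepts a bare name or a string and returns a symbol
  my.grouping <- ensym(my.grouping)
  bin.this <- ensym(bin.this)
  new.df <- df %>%
    group_by(!! my.grouping) %>%
    # bin the continuous variable and store the group size
    mutate(this.binned = cut(as.numeric(!!bin.this),
                             breaks = seq(0, 100, bw),
                             labels = seq(0 + bw, 100, bw)-(bw/2)),
           n = n()) %>%
    group_by(!! my.grouping, this.binned) %>%
    # proportion of each bin within its group; .groups = 'drop' already ungroups
    summarise(p = n()/n[1], .groups = 'drop') %>%
    # convert the bin labels (factor) back to numeric bin midpoints
    mutate(this.binned = as.numeric(as.character(this.binned)))
  return(new.df)
}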
-testing
bin_by_group(simple.df, "age.group", "to.be.binned")
# A tibble: 7 x 3
age.group this.binned p
<chr> <dbl> <dbl>
1 18-30 87.5 1
2 Over 30 62.5 0.1
3 Over 30 87.5 0.9
4 Under 18 12.5 0.1
5 Under 18 37.5 0.2
6 Under 18 62.5 0.1
7 Under 18 87.5 0.6
bin_by_group(simple.df, age.group, to.be.binned)
# A tibble: 7 x 3
age.group this.binned p
<chr> <dbl> <dbl>
1 18-30 87.5 1
2 Over 30 62.5 0.1
3 Over 30 87.5 0.9
4 Under 18 12.5 0.1
5 Under 18 37.5 0.2
6 Under 18 62.5 0.1
7 Under 18 87.5 0.6
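If the function only ever needs to accept column names as strings, another option (not from the answer above, and assuming dplyr 1.0.0 or later) is to select the grouping column with across(all_of()) and look up the value column with the .data pronoun. The helper `mean_by()` and the data frame `toy` below are hypothetical names used only for this sketch.

```r
library(dplyr)

# Hypothetical toy data, for illustration only.
toy <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))

# Hypothetical helper: grp and val are column names given as strings.
mean_by <- function(df, grp, val) {
  df %>%
    # all_of() selects the column whose name is stored in grp
    group_by(across(all_of(grp))) %>%
    # .data[[val]] looks up the column whose name is stored in val
    summarise(avg = mean(.data[[val]]), .groups = "drop")
}

res <- mean_by(toy, "g", "x")
res
```

This variant does not accept bare column names, so ensym() remains the better fit when both calling styles have to work.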
Source: a similar question on Stack Overflow, "r - Using a data frame and columns as arguments in a custom function in R with dplyr's mutate on grouped data": https://stackoverflow.com/questions/66163728/