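I have a question about how to reclassify a variable depending on whether its categories meet a given condition. That is, if a category does not meet the criterion, it is assigned to another category that does.
My data has the following form: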
data = data.frame(firm_size = c("Micro", "Small", "Medium", "Big"),
                  employees = c(5, 10, 100, 1000))
> data
  firm_size employees
1     Micro         5
2     Small        10
3    Medium       100
4       Big      1000
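So, if my condition is that firms with 10 or fewer employees (Small, with exactly 10, is included) must be grouped together and merged with the other categories that meet the same condition, the result should be: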
> new_data
    firm_size employees
1 Micro-Small        15
3      Medium       100
4         Big      1000
What I want is to write a function that generalizes this process, so that it also works if, for example, my data is:
> data
  firm_size employees
1     Micro         5
2     Small         8
3    Medium         9
4       Big      1000
> new_data
           firm_size employees
1 Micro-Small-Medium        22
4                Big      1000
I think this can be done with tools from the tidyverse.
Thanks in advance.
Best answer
Here is an approach using tally:
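library(dplyr)
size <- 10
data %>%
  # Sort by descending employees so rows above the threshold come first;
  # the group labels built below are matched to rows by position.
  arrange(desc(employees)) %>%
  group_by(firm_size = c(as.character(firm_size[employees > size]),
                         rep(paste(firm_size[employees <= size], collapse = "-"),
                             sum(employees <= size)))) %>%
  # Sum employees within each (possibly merged) group
  tally(employees, name = "employees")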
## A tibble: 3 x 2
#  firm_size   employees
#  <chr>           <dbl>
#1 Big              1000
#2 Medium            100
#3 Small-Micro        15
For your second data set:
data2 %>%
  # Rows above the threshold must come first (labels are assigned by position)
  arrange(desc(employees)) %>%
  group_by(firm_size = c(as.character(firm_size[employees > size]),
                         rep(paste(firm_size[employees <= size], collapse = "-"),
                             sum(employees <= size)))) %>%
  tally(employees, name = "employees")
## A tibble: 2 x 2
#  firm_size          employees
#  <chr>                  <int>
#1 Big                     1000
#2 Medium-Small-Micro        22
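Since the question asks for a reusable function, the pipeline above could be wrapped roughly as follows. This is a minimal sketch and not part of the original answer: the name merge_small_firms and its arguments are hypothetical, and it assumes dplyr 1.0 or later (for the .groups argument of summarise). It labels rows with if_else() instead of by position, so it does not depend on how the rows are sorted, and the merged label follows the original row order.

```r
library(dplyr)

# Hypothetical helper (name and arguments are illustrative assumptions).
# Merges every category with `employees <= size` into one combined category
# and sums the employees per resulting category.
merge_small_firms <- function(df, size = 10) {
  df %>%
    mutate(
      firm_size = as.character(firm_size),   # works for factor or character input
      small     = employees <= size,         # rows at or below the threshold
      firm_size = if_else(small,
                          paste(firm_size[small], collapse = "-"),
                          firm_size)
    ) %>%
    group_by(firm_size) %>%
    summarise(employees = sum(employees), .groups = "drop")
}

# The two example data sets from the question
firms  <- data.frame(firm_size = c("Micro", "Small", "Medium", "Big"),
                     employees = c(5, 10, 100, 1000))
firms2 <- data.frame(firm_size = c("Micro", "Small", "Medium", "Big"),
                     employees = c(5, 8, 9, 1000))

res1 <- merge_small_firms(firms, size = 10)
res2 <- merge_small_firms(firms2, size = 10)
```

With the question's data, res1 should contain Micro-Small with 15 employees next to Medium and Big, and res2 should contain Micro-Small-Medium with 22 employees next to Big.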
Data
data <- structure(list(firm_size = structure(c(3L, 4L, 2L, 1L), .Label = c("Big",
"Medium", "Micro", "Small"), class = "factor"), employees = c(5,
10, 100, 1000)), class = "data.frame", row.names = c(NA, -4L))
data2 <- structure(list(firm_size = structure(c(3L, 4L, 2L, 1L), .Label = c("Big",
"Medium", "Micro", "Small"), class = "factor"), employees = c(5L,
8L, 9L, 1000L)), class = "data.frame", row.names = c("1", "2",
"3", "4"))
Regarding "r - Grouping categories according to whether they meet a set of conditions with tidyverse", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62317089/