I have the following data frame:
df <- data.frame(
  Code = c("a", "a", "a", "a", "a", "b", "b", "b", "b", "b"),
  Inst = c("Yes", "No", "No", "No", "No", "No", "No", "No", "No", "No"),
  Date = c(
    "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05",
    "2021-01-06", "2021-01-06", "2021-01-06", "2021-01-09", "2021-01-10"
  )
)
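I want to apply dplyr::group_by to the variable Code and filter for the specific value "Yes" and the minimum Date, but I want to keep all observations for groups that do not contain a Yes value. I tried filter(any(Inst == "Yes")), but this does not work.
I want a result like this: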
Code Inst Date
a Yes 2021-01-01
b No 2021-01-06
b No 2021-01-06
b No 2021-01-06
Best answer
If there can be multiple Yes values:
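library(dplyr)
# Groups with no "Yes" keep all their rows; other groups keep only their "Yes" rows
df %>%
  group_by(Code) %>%
  slice(if (all(Inst != "Yes")) 1:n() else which(Inst == "Yes"))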
Code Inst
<chr> <chr>
1 a Yes
2 b No
3 b No
4 b No
5 b No
6 b No
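The same row selection can probably also be written with filter() instead of slice(). The following is only a sketch, assuming dplyr is installed; the object name kept is made up for illustration. The length-one result of all() is recycled across the group, so a group without any "Yes" keeps every row, while other groups keep only their "Yes" rows.

```r
library(dplyr)

df <- data.frame(
  Code = c("a", "a", "a", "a", "a", "b", "b", "b", "b", "b"),
  Inst = c("Yes", "No", "No", "No", "No", "No", "No", "No", "No", "No"),
  Date = c(
    "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05",
    "2021-01-06", "2021-01-06", "2021-01-06", "2021-01-09", "2021-01-10"
  )
)

# `kept` is a hypothetical name used only in this sketch.
# all(Inst != "Yes") is a single TRUE/FALSE per group and is recycled:
#   TRUE  -> every row of the group passes
#   FALSE -> only rows where Inst == "Yes" pass
kept <- df %>%
  group_by(Code) %>%
  filter(all(Inst != "Yes") | Inst == "Yes") %>%
  ungroup()
```

Whether slice() or filter() reads better is mostly a matter of taste; filter() avoids building row indices by hand.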
For the updated question:
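df %>%
  # Parse Date so that min() compares dates rather than strings
  mutate(Date = as.Date(Date, format = "%Y-%m-%d")) %>%
  group_by(Code) %>%
  slice(if (all(Inst != "Yes")) 1:n() else which(Inst == "Yes")) %>%
  # Within each group, keep only the rows with the earliest remaining date
  filter(Date == min(Date))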
Code Inst Date
<chr> <chr> <date>
1 a Yes 2021-01-01
2 b No 2021-01-06
3 b No 2021-01-06
4 b No 2021-01-06
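For comparison, here is a small sketch of why the attempt from the question, filter(any(Inst == "Yes")), does not give this result. It assumes dplyr is installed; the object name attempt is made up for illustration. any() returns one value per group, so each group is kept or dropped as a whole.

```r
library(dplyr)

df <- data.frame(
  Code = c("a", "a", "a", "a", "a", "b", "b", "b", "b", "b"),
  Inst = c("Yes", "No", "No", "No", "No", "No", "No", "No", "No", "No"),
  Date = c(
    "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05",
    "2021-01-06", "2021-01-06", "2021-01-06", "2021-01-09", "2021-01-10"
  )
)

# `attempt` is a hypothetical name used only in this sketch.
# any(Inst == "Yes") is a single TRUE/FALSE per group:
#   group "a" contains a "Yes", so all five of its rows are kept
#   group "b" contains none, so all of its rows are dropped
attempt <- df %>%
  group_by(Code) %>%
  filter(any(Inst == "Yes")) %>%
  ungroup()
```

This is the opposite of what was asked for: the group without a "Yes" disappears, and the group with a "Yes" keeps its "No" rows as well.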
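Regarding "r - group_by and keep all groups that do not contain a specific value, filtering where the value is present", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67431632/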