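Following up on my previous question here, and for further statistical analysis, I would like to know whether it is possible to remove the common peaks that are present in >= 3 of the data frames: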
a <- data.frame(ID = c("1", "2", "3", "4", "5"), peak = c("peak1", "peak2", "peak3", "peak4", "peak10"))
b <- data.frame(ID = c("1", "2", "3", "4"), peak = c("peak1", "peak3", "peak20", "peak21"))
c <- data.frame(ID = c("1", "2", "3"), peak = c("peak1", "peak5", "peak3"))
d <- data.frame(ID = c("1", "2", "3", "4", "5", "6"), peak = c("peak1", "peak3", "peak7", "peak8", "peak11", "peak12"))
e <- data.frame(ID = c("1", "2", "3"), peak = c("peak1", "peak3", "peak9"))
I would like to remove the common peaks that are present in >= 3 of the data frames, giving the desired output:
# IDs are those of the rows that remain once peak1 and peak3 are removed
a <- data.frame(ID = c("2", "4", "5"), peak = c("peak2", "peak4", "peak10"))
b <- data.frame(ID = c("3", "4"), peak = c("peak20", "peak21"))
c <- data.frame(ID = c("2"), peak = c("peak5"))
d <- data.frame(ID = c("3", "4", "5", "6"), peak = c("peak7", "peak8", "peak11", "peak12"))
e <- data.frame(ID = c("3"), peak = c("peak9"))
Best answer
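If the 'peak' values are unique within each dataset, bind the datasets together into a single one (bind_rows), get the count of 'peak', filter the rows where 'n' is less than 3, and pull those 'peak' elements: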
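library(dplyr)

# Stack all data frames, count how often each peak occurs overall,
# and keep the names of the peaks that occur fewer than 3 times
to_keep <- bind_rows(a, b, c, d, e, .id = 'grp') %>%
  count(peak) %>%
  filter(n < 3) %>%
  pull(peak)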
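The count above relies on each peak appearing at most once per data frame. As a minimal sketch for the case where that might not hold, the number of data frames containing each peak can be counted in base R by taking the unique peaks per frame first. The names `frames`, `n_frames` and `to_drop` are illustrative and not part of the original answer.

```r
# Example data from the question
a <- data.frame(ID = c("1", "2", "3", "4", "5"), peak = c("peak1", "peak2", "peak3", "peak4", "peak10"))
b <- data.frame(ID = c("1", "2", "3", "4"), peak = c("peak1", "peak3", "peak20", "peak21"))
c <- data.frame(ID = c("1", "2", "3"), peak = c("peak1", "peak5", "peak3"))
d <- data.frame(ID = c("1", "2", "3", "4", "5", "6"), peak = c("peak1", "peak3", "peak7", "peak8", "peak11", "peak12"))
e <- data.frame(ID = c("1", "2", "3"), peak = c("peak1", "peak3", "peak9"))

frames <- list(a = a, b = b, c = c, d = d, e = e)

# For each peak, count the number of data frames it appears in.
# unique() ensures a peak repeated inside one frame is counted only once.
per_frame <- lapply(frames, function(df) unique(as.character(df$peak)))
n_frames <- table(unlist(per_frame))

to_keep <- names(n_frames)[n_frames < 3]   # peaks found in fewer than 3 frames
to_drop <- names(n_frames)[n_frames >= 3]  # peaks found in 3 or more frames
```

With the question's data, only peak1 and peak3 should end up in `to_drop`, since every other peak occurs in a single data frame.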
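Now we update the objects in the global environment (not recommended): subset each object to the rows whose peak is in 'to_keep', then write the result back with assign: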
for(obj in letters[1:5]) {
  # letters[1:5] gives "a" to "e"; overwrite each object with its filtered version
  assign(obj, subset(get(obj), peak %in% to_keep))
}
Or keep the objects in a list and subset them from there:
library(purrr)

# lst() builds a named list (a, b, c, d, e); map() filters each element
lst1 <- lst(a, b, c, d, e) %>%
  map(~ .x %>%
        filter(peak %in% to_keep))
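The list approach can also be wrapped in a small reusable function. The following is only a sketch in base R: `drop_common_peaks` and its `min_frames` argument are hypothetical names, not something from the original answer, and it assumes every data frame has a 'peak' column.

```r
# Hypothetical helper: drop every row whose peak occurs in at least
# `min_frames` of the data frames in the (named) list `frames`.
drop_common_peaks <- function(frames, min_frames = 3) {
  # Unique peaks per frame, so within-frame duplicates are counted once
  per_frame <- lapply(frames, function(df) unique(as.character(df$peak)))
  n_frames <- table(unlist(per_frame))
  common <- names(n_frames)[n_frames >= min_frames]
  # Keep only the rows whose peak is not among the common ones
  lapply(frames, function(df) df[!df$peak %in% common, , drop = FALSE])
}

# Small demonstration using three of the question's data frames
b <- data.frame(ID = c("1", "2", "3", "4"), peak = c("peak1", "peak3", "peak20", "peak21"))
c <- data.frame(ID = c("1", "2", "3"), peak = c("peak1", "peak5", "peak3"))
e <- data.frame(ID = c("1", "2", "3"), peak = c("peak1", "peak3", "peak9"))

res <- drop_common_peaks(list(b = b, c = c, e = e), min_frames = 3)
res$b
```

The result is a named list, so the filtered data frames stay together (`res$b`, `res$c`, `res$e`) and nothing in the global environment is overwritten.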
Regarding r - how to remove common elements if they are present in >= 3 data frames (out of 5), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67200590/