Here is my example data frame:
example = data.frame(group = c("A", "B", "A", "A"), word = c("car", "sun ,sun, house", "car, house", "tree"))
I want to keep only the words that are unique both within each group and between groups,
so I would like to get this:
group  word
A      car, tree
B      sun
I used aggregate and got this:
aggregate(word ~ group , data = example, FUN = paste0)
  group                   word
1     A car, car, house, tree
2     B        sun ,sun, house
Now I only need to select the unique values, but even that does not work:
for (i in 1:nrow(cluster)) {cluster[i, ][["word"]] = lapply(unlist(cluster[i, ][["word"]]), unique)}
which fails with
Error in `[[<-.data.frame`(`*tmp*`, "word", value = list("car", "car, house", :
replacement has 3 rows, data has 1
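The error message suggests that `word` is a list column after `aggregate`, and that the loop tries to put a three-element list into a single row. As a rough sketch only, assuming `cluster` is the result of the `aggregate` call above (the question does not define it), each cell can be split into single words and deduplicated in one `lapply`. Note that this handles duplicates within a group only, so "house" still shows up in both groups:

```r
example <- data.frame(
  group = c("A", "B", "A", "A"),
  word  = c("car", "sun ,sun, house", "car, house", "tree"),
  stringsAsFactors = FALSE
)

# Assumption: `cluster` is the aggregate() result shown in the question
cluster <- aggregate(word ~ group, data = example, FUN = paste0)

# Split every cell on commas/spaces, then drop repeats inside each group
cluster$word <- lapply(
  cluster$word,
  function(x) unique(unlist(strsplit(x, "[, ]+")))
)
```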
Best answer
A base R option using aggregate + subset + ave, as shown below:
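with(
  # split every cell into single words, collected per group (list column)
  aggregate(
    word ~ .,
    example,
    function(x) {
      unlist(strsplit(x, "[, ]+"))
    }
  ),
  # keep words that occur in exactly one group, then recombine per group
  aggregate(
    . ~ ind,
    subset(
      # one row per distinct (word, group) pair
      unique(stack(setNames(word, group))),
      ave(seq_along(ind), values, FUN = length) == 1
    ),
    c
  )
)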
which gives
  ind    values
1   A car, tree
2   B       sun
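For readers who find the nested call hard to follow, here is the same idea unrolled into separate steps. This is only a sketch, not the answerer's code: the names `tokens`, `long`, `n_groups` and `result` are made up for illustration, and it assumes words are separated by commas and/or spaces with no leading separator in a cell.

```r
example <- data.frame(
  group = c("A", "B", "A", "A"),
  word  = c("car", "sun ,sun, house", "car, house", "tree"),
  stringsAsFactors = FALSE
)

# Step 1: split each cell into single words and repeat the group label per word
tokens <- strsplit(example$word, "[, ]+")
long <- data.frame(
  group = rep(example$group, lengths(tokens)),
  word  = unlist(tokens),
  stringsAsFactors = FALSE
)

# Step 2: drop repeated (group, word) pairs, i.e. duplicates within a group
long <- unique(long)

# Step 3: after step 2 each word appears once per group, so its row count
# equals the number of groups it occurs in; keep words seen in one group only
n_groups <- ave(seq_along(long$word), long$word, FUN = length)
long <- long[n_groups == 1, ]

# Step 4: paste the remaining words back together, one row per group
result <- aggregate(word ~ group, data = long, FUN = paste, collapse = ", ")
result
```

Unlike the answer above, `result` holds plain character strings in `word` rather than a list column, which may be easier to write out or join on.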
Regarding r - unique words by group, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73700720/