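r - Aggregating strings using c() in dplyr summarise or aggregate

I want to use c() as the aggregation function in dplyr to aggregate some strings. I first tried the following: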

> InsectSprays$spray = as.character(InsectSprays$spray)
> dt = tbl_df(InsectSprays)
> dt %>% group_by(count) %>% summarize(c(spray))
Error: expecting a single value

However, using the c() function in aggregate() works:

> da = aggregate(spray ~ count, InsectSprays, c)
> head(da)
  count                  spray
1     0                   C, C
2     1       C, C, C, C, E, E
3     2             C, C, D, E
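
To see why the aggregate() call succeeds, it can help to inspect what it returns. The following is a minimal sketch using base R only; `sprays` is a name introduced here for a copy of the dataset, not something from the original post. It suggests that the spray column comes back as a list holding one character vector per count value, rather than as an ordinary character column:

```r
# Work on a copy so the built-in InsectSprays dataset is left untouched.
# `sprays` is a hypothetical name used only for this illustration.
sprays <- InsectSprays
sprays$spray <- as.character(sprays$spray)

da <- aggregate(spray ~ count, sprays, c)

# The groups have different sizes, so aggregate() cannot simplify the
# result to a vector or matrix and keeps a list column instead.
is.list(da$spray)

# First element: the sprays observed for the smallest count value.
da$spray[[1]]

# Number of strings collected for the first three count values.
lengths(da$spray)[1:3]
```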

Searching Stack Overflow hinted that using paste() with the collapse argument instead of c() solves the problem:

dt %>% group_by(count) %>% summarize(s=paste(spray, collapse=","))

or

dt %>% group_by(count) %>% summarize(paste( c(spray), collapse=","))
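
A plausible reading of the "expecting a single value" error is that summarize() wants exactly one value per group. The sketch below, in base R only, compares the two functions on a hypothetical group of three strings (`group_values` is a made-up name): c() keeps every element, while paste() with collapse folds them into one string.

```r
# A hypothetical group of spray labels, standing in for one count value.
group_values <- c("C", "D", "E")

# c() returns the vector unchanged, so the group still has three values.
length(c(group_values))

# paste() with collapse joins everything into a single string.
joined <- paste(group_values, collapse = ",")
length(joined)
joined
```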

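My question is: why does the c() function work in aggregate() but not in dplyr's summarise()?

Best answer

If you look closely, you will find that c() actually does work (to a degree) when we use do(). But as far as I understand, dplyr currently does not allow lists to be printed in this way: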

> InsectSprays$spray = as.character(InsectSprays$spray)
> dt = tbl_df(InsectSprays)
> doC <- dt %>% group_by(count) %>% do(s = c(.$spray))
> head(doC)
Source: local data frame [6 x 2]

  count        s
1     0 <chr[2]>
2     1 <chr[6]>
3     2 <chr[4]>
4     3 <chr[8]>
5     4 <chr[4]>
6     5 <chr[7]>

> head(doC)[[2]]
[[1]]
[1] "C" "C"

[[2]]
[1] "C" "C" "C" "C" "E" "E"

[[3]]
[1] "C" "C" "D" "E"

[[4]]
[1] "C" "C" "D" "D" "E" "E" "E" "E"

[[5]]
[1] "C" "D" "D" "E"

[[6]]
[1] "D" "D" "D" "D" "D" "E" "E"
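
As a follow-up to the do() approach above, more recent dplyr releases also accept a list column directly in summarise() when the per-group vector is wrapped in list(). This is a sketch under the assumption that a reasonably recent dplyr is installed; it is not part of the original answer, and `sprays` and `res` are names introduced here for illustration:

```r
library(dplyr)

# Work on a copy so the built-in InsectSprays dataset is left untouched.
sprays <- InsectSprays
sprays$spray <- as.character(sprays$spray)

# list() wraps each group's character vector into a single list element,
# so summarise() receives exactly one value per group.
res <- sprays %>%
  group_by(count) %>%
  summarise(s = list(spray))

# Each element of the list column is the character vector for one count.
res$s[[1]]
```

Whether this or paste(..., collapse = ",") is preferable depends on whether the individual strings are needed later or only a display string.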

Regarding "r - Aggregating strings using c() in dplyr summarise or aggregate", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27288833/
