r - Counting and grouping with dplyr

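My goal is simply to count the number of records per hour for each day. I think there should be a simple solution using the dplyr or data.table packages:

My dataset is very simple: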

> head(test)
        id       date hour
1 14869663 2018-01-24   17
2 14869664 2018-01-24   17
3 14869665 2018-01-24   17
4 14869666 2018-01-24   17
5 14869667 2018-01-24   17
6 14869668 2018-01-24   17

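I just need to group by two variables (date and hour) and count; id is irrelevant. However, neither of these two dplyr approaches produces the expected result: the output is a data frame with the same number of rows as the input data, which contains millions of records. What am I doing wrong here?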

test %>% group_by(date, hour) %>% mutate(count = n())
test %>% add_count(date, hour)

The output should look like this:

> head(output)
  n_records       date hour
1       700 2018-01-24    0
2       750 2018-01-24    1
3       730 2018-01-24    2
4       700 2018-01-24    3
5       721 2018-01-24    4
6       753 2018-01-24    5

and so on.

Any suggestions?

Best answer

This seems to do the trick:

library(dplyr)
starwars %>% 
    group_by(gender, species) %>%
    count()

It appears (h/t to Frank) that the count function can take the grouping fields directly:

starwars %>% count(gender, species)
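
Applied to the data in the question, the same idea might look like the sketch below. The toy `test` data frame is made up here to mimic the structure shown in the question, and `n_records` is simply the column name from the desired output. The sketch assumes a reasonably recent dplyr (1.0.0 or later, for the `.groups` argument). The key difference is that mutate() and add_count() append a column while keeping every input row, whereas count() (or group_by() followed by summarise()) collapses the data to one row per group.

```r
library(dplyr)

# Hypothetical toy data mimicking the structure of `test` in the question
test <- data.frame(
  id   = 1:6,
  date = as.Date(c("2018-01-24", "2018-01-24", "2018-01-24",
                   "2018-01-24", "2018-01-25", "2018-01-25")),
  hour = c(17, 17, 17, 18, 0, 0)
)

# add_count() (like group_by() + mutate()) keeps every input row
# and only appends the per-group count as a new column
kept <- test %>% add_count(date, hour)

# count() collapses to one row per (date, hour) group;
# `name` sets the name of the count column
output <- test %>% count(date, hour, name = "n_records")

# Equivalent spelled-out form using summarise()
output2 <- test %>%
  group_by(date, hour) %>%
  summarise(n_records = n(), .groups = "drop")
```

Here `kept` has as many rows as `test`, while `output` and `output2` have one row per date and hour combination.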

Regarding r - counting and grouping with dplyr, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48454334/