r - How to sum the values of a numeric variable by a string variable

Consider the following data frame:

df <- data.frame(numeric=c(1,2,3,4,5,6,7,8,9,10), string=c("a", "a", "b", "b", "c", "d", "d", "e", "d", "f"))
print(df)
   numeric string
1        1      a
2        2      a
3        3      b
4        4      b
5        5      c
6        6      d
7        7      d
8        8      e
9        9      d
10      10      f

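It has one numeric variable and one string variable. Now I want to create another data frame in which the string variable lists only the unique values "a", "b", "c", "d", "e", "f", and the numeric variable is the sum of the corresponding values in the previous data frame, producing this data frame: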

print(new_df)
  numeric string
1        3      a
2        7      b
3        5      c
4       22      d
5        8      e
6       10      f

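This could be done with a for loop, but that is rather inefficient on large datasets, so I would prefer another option. I tried the dplyr package, but did not get the expected result: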

library(dplyr)
df %>% group_by(string) %>% summarize(result = sum(numeric))
  result
1     55

Best answer

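This is probably a function-masking issue with plyr (the summarise/mutate functions also exist in plyr). If plyr is attached after dplyr, its summarise masks dplyr's and ignores the grouping, which would explain the single total of 55. We can explicitly specify summarise from dplyr: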

library(dplyr)
df %>% 
    group_by(string) %>%
    dplyr::summarise(numeric = sum(numeric))
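
As a cross-check that does not depend on which package's summarise is currently visible, the same grouped sum can be sketched in base R. This is only an alternative illustration, not part of the original answer; it uses `aggregate()` from the stats package and reuses the `df` and `new_df` names from the question.

```r
# Same data frame as in the question.
df <- data.frame(
  numeric = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  string  = c("a", "a", "b", "b", "c", "d", "d", "e", "d", "f")
)

# Sum `numeric` within each distinct value of `string`, using base R only.
new_df <- aggregate(numeric ~ string, data = df, FUN = sum)

# aggregate() puts the grouping column first; reorder to match the question.
new_df <- new_df[, c("numeric", "string")]
print(new_df)
```

If the dplyr pipeline still returns a single row, running `find("summarise")` should list every attached package that defines the function, in search-path order; the first entry is the one an unqualified call uses.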

For "r - How to sum the values of a numeric variable by a string variable", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56028131/
