This just popped into my mind.
Let's take this example from a recent question:
Data:
df1<-
structure(list(Year = c(2015L, 2015L, 2015L, 2015L, 2016L, 2016L,
2016L, 2016L), Category = c("a", "1", "2", "3", "1", "2", "3",
"1"), Value = c(2L, 3L, 2L, 1L, 7L, 2L, 1L, 1L)), row.names = c(NA,
-8L), class = "data.frame")
Code:
aggregate( Value ~ Year + c(MY_NAME = c("OneTwo", "three")[Category %in% 1:2 + 1]), data=df1, FUN=sum )
Current output (note the long, ugly name of the new variable):
# Year c(MY_NAME = c("OneTwo", "three")[Category %in% 1:2 + 1]) Value
#1 2015 OneTwo 3
#2 2016 OneTwo 1
#3 2015 three 5
#4 2016 three 10
Desired output:
# Year MY_NAME Value
#1 2015 OneTwo 3
#2 2016 OneTwo 1
#3 2015 three 5
#4 2016 three 10
Note: the goal is to add code to the one-liner in the Code: section that sets the name of the new variable directly.

Best answer:
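Instead of c, we need cbind. cbind returns a one-column matrix whose column name is "MY_NAME", whereas c returns a named vector whose element names are made unique from "MY_NAME" (MY_NAME1, MY_NAME2, ...; compare make.unique):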
aggregate( Value ~ Year +
cbind(MY_NAME = c("OneTwo", "three")[Category %in% 1:2 + 1]), data=df1, FUN=sum )
# Year MY_NAME Value
#1 2015 OneTwo 3
#2 2016 OneTwo 1
#3 2015 three 5
#4 2016 three 10
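To see the difference the answer describes, one can inspect the two objects directly. This is only a minimal sketch; the vector x is a hypothetical stand-in for the recoded Category values and does not appear in the original question.

```r
# Hypothetical stand-in for c("OneTwo", "three")[Category %in% 1:2 + 1]
x <- c("OneTwo", "three", "three")

# c() with a tagged argument returns a plain vector; the tag is spread
# over the elements as suffixed names, so there is no single name left
v <- c(MY_NAME = x)
names(v)     # "MY_NAME1" "MY_NAME2" "MY_NAME3"

# cbind() with a tagged argument returns a one-column matrix whose
# column carries the tag as its name
m <- cbind(MY_NAME = x)
dim(m)       # 3 1
colnames(m)  # "MY_NAME"
```

The one-column matrix keeps "MY_NAME" as a column name, which is the name the answer relies on appearing in the aggregate output.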
?aggregate mentions the use of cbind in the formula method:
formula - a formula, such as y ~ x or cbind(y1, y2) ~ x1 + x2, where the y variables are numeric data to be split into groups according to the grouping x variables (usually factors).
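The documented left-hand-side form, cbind(y1, y2) ~ x, can be sketched as follows. The data frame df2 and its Value2 column are assumptions made up for this illustration; they are not part of the original question.

```r
# Hypothetical data: two numeric columns to aggregate per Year
df2 <- data.frame(
  Year   = c(2015L, 2015L, 2016L, 2016L),
  Value  = c(2L, 3L, 7L, 2L),
  Value2 = c(10, 20, 30, 40)
)

# cbind() on the left-hand side aggregates both columns in one call;
# the result columns take the names of the cbind() arguments
res <- aggregate(cbind(Value, Value2) ~ Year, data = df2, FUN = sum)
res
```

Here the result has the columns Year, Value and Value2, with one row per year.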
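An option with the tidyverse would be:
library(dplyr)
df1 %>%
  # name the grouping variable directly inside group_by()
  group_by(Year, MY_NAME = c("OneTwo", "three")[Category %in% 1:2 + 1]) %>%
  # .groups = "drop" (dplyr >= 1.0.0) returns an ungrouped result
  summarise(Value = sum(Value), .groups = "drop")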
About "r - Set the name of a variable defined in a formula": we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52992152/