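I am trying to merge two datasets. In the past I have used merge() with by set to the variable I wanted to merge on. Now, however, I want to merge on two variables. My first dataset looks like this: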
Year  Winning_Tm  Losing_Tm
2011  Texas       Washington
2012  Alabama     South Carolina
2013  Tennessee   Texas
Then I have another dataset that contains each team's ranking for each year (this is greatly simplified). Like this:
Year  Team            Rank
2011  Texas           32
2011  Washington      34
2012  South Carolina  45
2012  Alabama         12
2013  Texas           6
2013  Tennessee       51
I would like to merge them so that I end up with a dataset that looks like this:
Year  Winning_Tm  Winning_TM_rank  Losing_Tm       Losing_Tm_rank
2011  Texas       32               Washington      34
2012  Alabama     12               South Carolina  45
2013  Tennessee   51               Texas           6
I am hoping there is a simple way to do this, but it may be more complicated. Thanks!
Best answer
I reproduced your data (next time, try to include the dput output of it):
# One row per game: the year plus the winning and losing team
A <- data.frame(
  Year = c(2011, 2012, 2013),
  Winning_Tm = c("Texas", "Alabama", "Tennessee"),
  Losing_Tm = c("Washington", "South Carolina", "Texas"),
  stringsAsFactors = FALSE
)
# One row per team and year; Year is numeric here as well so both merge keys share a type
B <- data.frame(
  Year = c(2011, 2011, 2012, 2012, 2013, 2013),
  Team = c("Texas", "Washington", "South Carolina", "Alabama", "Texas", "Tennessee"),
  Rank = c(32, 34, 45, 12, 6, 51),
  stringsAsFactors = FALSE
)
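As a brief illustration of the dput suggestion, here is a minimal sketch that prints a data frame as R code and rebuilds it from that text. The data frame repeats the first dataset from the question; the names txt and A_copy are illustrative only and do not come from the original answer.

```r
# Rebuild the small example data frame (same values as the first dataset).
A <- data.frame(
  Year = c(2011, 2012, 2013),
  Winning_Tm = c("Texas", "Alabama", "Tennessee"),
  Losing_Tm = c("Washington", "South Carolina", "Texas"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; that printed text is
# what you would paste into a question.
dput(A)

# Round trip: capture the printed code as text, then evaluate it to get a copy.
txt <- paste(capture.output(dput(A)), collapse = "\n")
A_copy <- eval(parse(text = txt))

# Compare the copy with the original.
all.equal(A, A_copy)
```

Anyone answering can then run the pasted text directly instead of retyping the table by hand.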
You can melt the first data frame using the reshape2 package:
library(reshape2)
A <- melt(A, id.vars = "Year")
names(A)[3] <- "Team"
Now it looks like this:
> A
  Year   variable           Team
1 2011 Winning_Tm          Texas
2 2012 Winning_Tm        Alabama
3 2013 Winning_Tm      Tennessee
4 2011  Losing_Tm     Washington
5 2012  Losing_Tm South Carolina
6 2013  Losing_Tm          Texas
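If reshape2 is not installed, the same long layout can be approximated in base R. The following is a sketch under that assumption, not part of the original answer: games and games_long are hypothetical names, and here variable ends up as a character column, whereas melt returns a factor.

```r
# Illustrative copy of the original wide data (one row per game).
games <- data.frame(
  Year = c(2011, 2012, 2013),
  Winning_Tm = c("Texas", "Alabama", "Tennessee"),
  Losing_Tm = c("Washington", "South Carolina", "Texas"),
  stringsAsFactors = FALSE
)

# Stack the two team columns on top of each other and record which
# column each row came from.
games_long <- data.frame(
  Year = rep(games$Year, times = 2),
  variable = rep(c("Winning_Tm", "Losing_Tm"), each = nrow(games)),
  Team = c(games$Winning_Tm, games$Losing_Tm),
  stringsAsFactors = FALSE
)

games_long
```

The result has one row per team per game, which is the shape needed to match the rankings on both Year and Team.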
Then you can merge the two datasets on the two columns of interest:
AB <- merge(A, B, by=c("Year","Team"))
Which looks like this:
> AB
  Year           Team   variable Rank
1 2011          Texas Winning_Tm   32
2 2011     Washington  Losing_Tm   34
3 2012        Alabama Winning_Tm   12
4 2012 South Carolina  Losing_Tm   45
5 2013      Tennessee Winning_Tm   51
6 2013          Texas  Losing_Tm    6
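By default merge keeps only rows whose keys match in both data frames, so a team with no ranking for a given year would silently drop out. A small sketch of one way to check for that, assuming the long-format data shown above: the names games_long, ranks, checked and unmatched are illustrative, while all.x is a standard merge argument.

```r
# Long-format games (one row per team per game) and the rankings table.
games_long <- data.frame(
  Year = c(2011, 2012, 2013, 2011, 2012, 2013),
  variable = rep(c("Winning_Tm", "Losing_Tm"), each = 3),
  Team = c("Texas", "Alabama", "Tennessee",
           "Washington", "South Carolina", "Texas"),
  stringsAsFactors = FALSE
)
ranks <- data.frame(
  Year = c(2011, 2011, 2012, 2012, 2013, 2013),
  Team = c("Texas", "Washington", "South Carolina",
           "Alabama", "Texas", "Tennessee"),
  Rank = c(32, 34, 45, 12, 6, 51),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of games_long; a team without a ranking
# for that year would get Rank = NA instead of disappearing.
checked <- merge(games_long, ranks, by = c("Year", "Team"), all.x = TRUE)

# Rows that found no matching ranking.
unmatched <- checked[is.na(checked$Rank), ]
nrow(unmatched)
```

With the example data every team has a ranking, so unmatched is empty and the default inner merge loses nothing.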
Then, using the reshape command from base R, you can convert AB to wide format:
reshape(AB, idvar = "Year", timevar = "variable", direction = "wide")
Result:
  Year Team.Winning_Tm Rank.Winning_Tm Team.Losing_Tm Rank.Losing_Tm
1 2011           Texas              32     Washington             34
3 2012         Alabama              12 South Carolina             45
5 2013       Tennessee              51          Texas              6
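An alternative that avoids reshaping is to merge the rankings in twice, once for the winning team and once for the losing team, using merge's by.x and by.y arguments to pair differently named columns. This is a sketch of that variant rather than the answer's own method; the names games, ranks, win and both are illustrative, and the rank column names are modeled on the layout requested in the question.

```r
games <- data.frame(
  Year = c(2011, 2012, 2013),
  Winning_Tm = c("Texas", "Alabama", "Tennessee"),
  Losing_Tm = c("Washington", "South Carolina", "Texas"),
  stringsAsFactors = FALSE
)
ranks <- data.frame(
  Year = c(2011, 2011, 2012, 2012, 2013, 2013),
  Team = c("Texas", "Washington", "South Carolina",
           "Alabama", "Texas", "Tennessee"),
  Rank = c(32, 34, 45, 12, 6, 51),
  stringsAsFactors = FALSE
)

# First pass: match (Year, Winning_Tm) in games against (Year, Team) in ranks.
win <- merge(games, ranks,
             by.x = c("Year", "Winning_Tm"), by.y = c("Year", "Team"))
names(win)[names(win) == "Rank"] <- "Winning_Tm_rank"

# Second pass: match (Year, Losing_Tm) against the same rankings table.
both <- merge(win, ranks,
              by.x = c("Year", "Losing_Tm"), by.y = c("Year", "Team"))
names(both)[names(both) == "Rank"] <- "Losing_Tm_rank"

# Put the columns in the order requested in the question.
both <- both[, c("Year", "Winning_Tm", "Winning_Tm_rank",
                 "Losing_Tm", "Losing_Tm_rank")]
both
```

The trade-off is one merge per team column, which stays readable for two columns but scales less well than the melt and reshape route if there are many.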
On the topic of r - merging data by 2 variables in R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39151389/