Suppose we have a data frame that looks like this:
set.seed(7302012)
county <- rep(letters[1:4], each=2)
state <- rep(LETTERS[1], times=8)
industry <- rep(c("construction", "manufacturing"), 4)
employment <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)
data <- data.frame(state, county, industry, employment, establishments)
  state county      industry employment establishments
1     A      a  construction        146             19
2     A      a manufacturing        110             20
3     A      b  construction        121             10
4     A      b manufacturing         90             27
5     A      c  construction        197             18
6     A      c manufacturing         73             29
7     A      d  construction         98             30
8     A      d manufacturing        102             19
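We would like to reshape it so that each row represents a (state and) county rather than a county-industry, with columns construction.employment, construction.establishments, and analogous versions for manufacturing. What is an efficient way to do this? One way is to subset: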
construction <- data[data$industry == "construction", ]
names(construction)[4:5] <- c("construction.employment", "construction.establishments")
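and do the same for manufacturing, then merge the two. With only two industries this isn't so bad, but imagine there are 14; the process would become tedious (though less so with a for loop over the levels of industry). Any other ideas?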
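For reference, the subset-and-merge approach described above could be generalized to any number of industries roughly as follows. This is only a sketch: the names `pieces` and `wide` are illustrative, `lapply` and `Reduce` stand in for the for loop mentioned in the question, and the values are typed in from the printed data frame rather than regenerated with `set.seed()`.

```r
# Example data, copied from the printed data frame in the question.
data <- data.frame(
  state          = rep("A", 8),
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

# Build one subset per industry and prefix its value columns
# with the industry name (e.g. construction.employment).
pieces <- lapply(unique(as.character(data$industry)), function(ind) {
  piece <- data[data$industry == ind,
                c("state", "county", "employment", "establishments")]
  names(piece)[3:4] <- paste(ind, names(piece)[3:4], sep = ".")
  piece
})

# Merge the per-industry subsets pairwise on the id columns.
wide <- Reduce(function(x, y) merge(x, y, by = c("state", "county")), pieces)
```

This works for 2 or 14 industries alike, but it performs one merge per industry, which is part of why a dedicated reshaping function is attractive.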
Best answer
If I understand your question correctly, this can be done with reshape in base R:
reshape(data, direction="wide", idvar=c("state", "county"), timevar="industry")
#   state county employment.construction establishments.construction
# 1     A      a                     146                          19
# 3     A      b                     121                          10
# 5     A      c                     197                          18
# 7     A      d                      98                          30
#   employment.manufacturing establishments.manufacturing
# 1                      110                           20
# 3                       90                           27
# 5                       73                           29
# 7                      102                           19
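Note that reshape names the new columns variable.industry (employment.construction), whereas the question asked for industry.variable (construction.employment), and that the row names 1, 3, 5, 7 are carried over from the long data. If that matters, one possible follow-up is sketched below. The regular expression assumes that neither the variable names nor the industry names contain a dot, the names `wide` and `value_cols` are illustrative, and the data values are typed in from the printed data frame.

```r
# Example data, copied from the printed data frame in the question.
data <- data.frame(
  state          = rep("A", 8),
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

wide <- reshape(data, direction = "wide",
                idvar = c("state", "county"), timevar = "industry")

# Swap "variable.industry" to "industry.variable" in the value columns.
# Assumption: no dots inside the variable or industry names themselves.
value_cols <- setdiff(names(wide), c("state", "county"))
names(wide)[match(value_cols, names(wide))] <-
  sub("^([^.]+)\\.(.+)$", "\\2.\\1", value_cols)

# Replace the leftover row names (1, 3, 5, 7) with 1..n.
rownames(wide) <- NULL
```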
On reshaping a data frame (changing rows to columns), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/11725964/