I have a matrix with 3 columns and 47,772 rows. The rows contain 64 distinct parameters.
Currently the data frame looks like this:
SAMPLE_DATE PARAMETER RESULT
8/2/1954 Alkalinity, total as CaCO3(mg/L) 112.5
8/2/1954 Depth, Secchi disk depth(m) 2.44
8/2/1954 Nutrient-nitrogen as N(mg/L) 0.87
8/2/1954 Phosphorus as P(mg/L) 0.001
8/2/1954 Sulfate as SO4(mg/L) 11
3/7/1962 Alkalinity, total as CaCO3(mg/L) 140
3/7/1962 Alkalinity, total as CaCO3(mg/L) 320
3/7/1962 Alkalinity, total as CaCO3(mg/L) 130
3/7/1962 Ammonia-nitrogen as N(mg/L) 0.02
3/7/1962 Ammonia-nitrogen as N(mg/L) 0.26
3/7/1962 Ammonia-nitrogen as N(mg/L) 0.02
3/7/1962 Apparent color(PCU) 10
3/7/1962 Apparent color(PCU) 10
....
I would like to turn it into something like this:
Date Alkalinity, total as CaCO3(mg/L) Depth, Secchi disk depth(m).....etc
8/2/1954 112.5 2.44 ..... etc
Note: not every date has every parameter.
Any ideas?
Best answer
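Here is one approach. I added a "time" variable (the .id column) because there are duplicated "SAMPLE_DATE" + "PARAMETER" combinations, and each duplicate needs its own row in the wide result.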
library(reshape2) # for dcast
library(splitstackshape) # for getanID
x2 <- getanID(x, id.vars = c("SAMPLE_DATE", "PARAMETER"))
dcast(x2, .id + SAMPLE_DATE ~ PARAMETER, value.var = "RESULT")
# .id SAMPLE_DATE Alkalinity, total as CaCO3(mg/L) Ammonia-nitrogen as N(mg/L)
# 1 1 3/7/1962 140.0 0.02
# 2 1 8/2/1954 112.5 NA
# 3 2 3/7/1962 320.0 0.26
# 4 3 3/7/1962 130.0 0.02
# Apparent color(PCU) Depth, Secchi disk depth(m) Nutrient-nitrogen as N(mg/L)
# 1 10 NA NA
# 2 NA 2.44 0.87
# 3 10 NA NA
# 4 NA NA NA
# Phosphorus as P(mg/L) Sulfate as SO4(mg/L)
# 1 NA NA
# 2 0.001 11
# 3 NA NA
# 4 NA NA
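If splitstackshape is not available, the same kind of within-group counter can be sketched in base R with ave(). This is only a rough counterpart of the .id column shown above, not a description of how getanID works internally; the small data frame x below is retyped from a few rows of the question's sample so the snippet runs on its own.

```r
# A few rows retyped from the question's sample data (illustrative subset).
x <- data.frame(
  SAMPLE_DATE = c("8/2/1954", "8/2/1954",
                  "3/7/1962", "3/7/1962", "3/7/1962",
                  "3/7/1962", "3/7/1962"),
  PARAMETER = c("Alkalinity, total as CaCO3(mg/L)",
                "Depth, Secchi disk depth(m)",
                "Alkalinity, total as CaCO3(mg/L)",
                "Alkalinity, total as CaCO3(mg/L)",
                "Alkalinity, total as CaCO3(mg/L)",
                "Apparent color(PCU)",
                "Apparent color(PCU)"),
  RESULT = c(112.5, 2.44, 140, 320, 130, 10, 10),
  stringsAsFactors = FALSE
)

# Number the rows within each SAMPLE_DATE + PARAMETER group: 1, 2, 3, ...
# ave() applies seq_along() to each group and returns the values in row order.
x$.id <- ave(seq_len(nrow(x)), x$SAMPLE_DATE, x$PARAMETER, FUN = seq_along)

print(x)
```

With the counter in place, the combination of .id, SAMPLE_DATE and PARAMETER identifies each row uniquely, which is what lets dcast place every RESULT in its own cell instead of falling back to counting duplicates.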
The same as above, but using the "data.table" package:
library(data.table)
packageVersion("data.table")
# [1] ‘1.8.11’
DT <- data.table(x)
DT[, .id := sequence(.N), by = list(SAMPLE_DATE, PARAMETER)]
dcast.data.table(DT, .id + SAMPLE_DATE ~ PARAMETER, value.var="RESULT")
If you do not want the duplicated combinations in separate rows, you will have to aggregate the data in some way first.
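As one possible way to do that aggregation, dcast from reshape2 accepts a fun.aggregate argument, so duplicates can be collapsed during the reshape itself. The sketch below assumes that averaging repeated measurements is acceptable, which is an assumption and not something the question states; median, max or another summary may suit the data better. The data frame x is again a small subset retyped from the question.

```r
library(reshape2)  # for dcast

# A few rows retyped from the question's sample data (illustrative subset).
x <- data.frame(
  SAMPLE_DATE = c("8/2/1954", "8/2/1954",
                  "3/7/1962", "3/7/1962", "3/7/1962",
                  "3/7/1962", "3/7/1962"),
  PARAMETER = c("Alkalinity, total as CaCO3(mg/L)",
                "Depth, Secchi disk depth(m)",
                "Alkalinity, total as CaCO3(mg/L)",
                "Alkalinity, total as CaCO3(mg/L)",
                "Alkalinity, total as CaCO3(mg/L)",
                "Apparent color(PCU)",
                "Apparent color(PCU)"),
  RESULT = c(112.5, 2.44, 140, 320, 130, 10, 10),
  stringsAsFactors = FALSE
)

# One row per date, one column per parameter.
# Duplicated date + parameter combinations are averaged (assumed choice);
# combinations that never occur are filled with NA.
wide <- dcast(x, SAMPLE_DATE ~ PARAMETER,
              value.var = "RESULT",
              fun.aggregate = mean,
              fill = NA_real_)

print(wide)
```

Without the explicit fill, cells for missing combinations would hold the result of mean() on an empty vector, so passing fill = NA_real_ keeps them as plain NA.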
Source: the Stack Overflow question "r - Parse a matrix with three columns (date, parameter, result) into a matrix with one column per parameter for each date": https://stackoverflow.com/questions/20526152/