I have a sample data frame in which one column stores 3 letters per row. The data frame also has 2 additional columns, Date and Colour:
Alphabet Date Colour
ABC 2018-09-10 green
DEF 2017-06-11 red
GHI 2016-05-12 blue
JKL NA yellow
MNO NA orange
PQR Unknown brown
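Some of the dates in this data frame are missing or unknown. I have another data frame that also has Alphabet and Date columns. The second data frame contains the dates that are missing from the first data frame: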
Alphabet Date
JKL 2017-06-07
MNO 2018-08-03
PQR 2019-10-07
STU 2019-11-08
VWX 2019-12-08
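I want to fill in the missing dates in the first data frame by matching the Alphabet values across the two data frames and then inserting the dates from the second data frame into the first.
Desired output: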
Alphabet Date Colour
ABC 2018-09-10 green
DEF 2017-06-11 red
GHI 2016-05-12 blue
JKL 2017-06-07 yellow
MNO 2018-08-03 orange
PQR 2019-10-07 brown
Any help is appreciated.
Best answer
One option is a join with data.table:
library(data.table)
setDT(df1)[df2, Date := i.Date, on = .(Alphabet)]
df1
# Alphabet Date Colour
#1: ABC 2018-09-10 green
#2: DEF 2017-06-11 red
#3: GHI 2016-05-12 blue
#4: JKL 2017-06-07 yellow
#5: MNO 2018-08-03 orange
#6: PQR 2019-10-07 brown
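Update
Using the new "df2a" dataset (defined under Data), which also contains letters that do not occur in df1, restrict the update to the rows of df1 whose date is NA or "Unknown":
i1 <- is.na(df1$Date) | df1$Date %in% "Unknown"
setDT(df1)[df2a[df2a$Alphabet %in% df1$Alphabet[i1], ],
           Date := i.Date, on = .(Alphabet)]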
df1
# Alphabet Date Colour
#1: ABC 2018-09-10 green
#2: DEF 2017-06-11 red
#3: GHI 2016-05-12 blue
#4: JKL 2017-06-07 yellow
#5: MNO 2018-08-03 orange
#6: PQR 2019-10-07 brown
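The guard on missing dates can also be written inside the join itself rather than by pre-filtering the lookup table. The following is only a sketch under stated assumptions: the names dt1 and dt2 are illustrative, and the "ABC" and "STU" rows in dt2 are included just to show that an existing date is kept and that an unmatched letter is ignored.

```r
library(data.table)

# Toy copy of the question's first table (dt1 is an illustrative name)
dt1 <- data.table(
  Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO", "PQR"),
  Date     = c("2018-09-10", "2017-06-11", "2016-05-12", NA, NA, "Unknown"),
  Colour   = c("green", "red", "blue", "yellow", "orange", "brown")
)

# Lookup table; the "ABC" row is an assumption added for this sketch,
# to check that a date already present in dt1 is not overwritten
dt2 <- data.table(
  Alphabet = c("ABC", "JKL", "MNO", "PQR", "STU"),
  Date     = c("1999-01-01", "2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08")
)

# Update join: take i.Date only where the current Date is NA or "Unknown",
# otherwise keep the value already stored in dt1
dt1[dt2,
    Date := fifelse(is.na(Date) | Date %in% "Unknown", i.Date, Date),
    on = .(Alphabet)]

print(dt1)
```

Rows of dt2 with no partner in dt1 (here "STU") do not add rows, because an update join only modifies rows that already exist in dt1.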
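Or use match from base R (this assumes every Alphabet in df2 also occurs in df1, as it does for the df2 defined under Data; an unmatched value would put NA into the index and make the assignment fail):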
i1 <- match(df2$Alphabet, df1$Alphabet)
df1$Date[i1] <- df2$Date
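When the lookup table has extra letters, as df2a does, the match can be reversed so that each row of df1 looks up its own date. This is a minimal base R sketch, not the answer's original code; the helper names needs_fill, lookup and fill are hypothetical, and the data frames are rebuilt here so the block runs on its own.

```r
# Rebuild the question's data so the example is self-contained
df1 <- data.frame(
  Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO", "PQR"),
  Date     = c("2018-09-10", "2017-06-11", "2016-05-12", NA, NA, "Unknown"),
  Colour   = c("green", "red", "blue", "yellow", "orange", "brown"),
  stringsAsFactors = FALSE
)
df2a <- data.frame(
  Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"),
  Date     = c("2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08", "2019-12-08"),
  stringsAsFactors = FALSE
)

# Rows of df1 whose date is missing or marked "Unknown"
needs_fill <- is.na(df1$Date) | df1$Date %in% "Unknown"

# For every row of df1, the date found in df2a (NA when the letter is absent)
lookup <- df2a$Date[match(df1$Alphabet, df2a$Alphabet)]

# Replace only where a fill is needed AND a replacement date exists
fill <- needs_fill & !is.na(lookup)
df1$Date[fill] <- lookup[fill]

print(df1)
```

Because the index is built from df1's side, letters that exist only in df2a (STU, VWX) never enter the assignment, and dates already present in df1 are left untouched.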
Data
df1 <- structure(list(Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO",
"PQR"), Date = c("2018-09-10", "2017-06-11", "2016-05-12", NA,
NA, "Unknown"), Colour = c("green", "red", "blue", "yellow",
"orange", "brown")), class = "data.frame", row.names = c(NA,
-6L))
df2 <- structure(list(Alphabet = c("JKL", "MNO", "PQR"), Date = c("2017-06-07",
"2018-08-03", "2019-10-07")), class = "data.frame", row.names = c(NA,
-3L))
df2a <- structure(list(Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"
), Date = c("2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08",
"2019-12-08")), class = "data.frame", row.names = c(NA, -5L))
Source: Stack Overflow, "r - Fill missing values in a column using a different data frame": https://stackoverflow.com/questions/59096149/