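I want to merge two datasets by ID. Each date in dataset1 should be matched only to the nearest date in dataset2, and every date in dataset1 should be included in the merged result.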
dataset1 <- read.table(text="ID Date
A 2021-03-18
A 2021-04-27
A 2021-04-05
A 2021-05-02
A 2021-02-08
A 2021-06-02
A 2021-05-29 ", header=TRUE)
dataset2 <- read.table(text="ID Date
A 2021-01-01
A 2021-01-01
A 2021-05-02
A 2021-05-09
A 2021-05-09
A 2021-05-09
A 2021-05-09
A 2021-06-16
A 2021-06-27 ", header=TRUE)
A data.table option using roll = "nearest":
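library(data.table)
# Convert Date to class Date and keep a copy under a second name, since the joined result has a single Date column
setDT(dataset1)[, c("Date", "Date1") := as.Date(Date)]
setDT(dataset2)[, c("Date", "nearest") := as.Date(Date)]
# For each row of dataset1, pick the dataset2 row with the nearest Date within the same ID
dataset2[dataset1, on = .(ID, Date), roll = "nearest"][, Date := NULL][]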
   ID    nearest      Date1
1:  A 2021-05-02 2021-03-18
2:  A 2021-05-02 2021-04-27
3:  A 2021-05-02 2021-04-05
4:  A 2021-05-02 2021-05-02
5:  A 2021-01-01 2021-02-08
6:  A 2021-06-16 2021-06-02
7:  A 2021-06-16 2021-05-29
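To make it clearer what the nearest-date match is doing, here is a rough base R sketch that finds, for each dataset1 date, the dataset2 date with the smallest absolute gap within the same ID. It is only an illustration, not how data.table implements rolling joins. The names d1, d2 and idx are made up for this example, and when two candidates are equally close it simply takes the first one, which may not be what roll = "nearest" does.

```r
# Same dates as in the question, rebuilt as plain data frames
# (d1 and d2 are hypothetical names so the data.tables above stay untouched)
d1 <- data.frame(ID = "A", Date = as.Date(c(
  "2021-03-18", "2021-04-27", "2021-04-05", "2021-05-02",
  "2021-02-08", "2021-06-02", "2021-05-29")))
d2 <- data.frame(ID = "A", Date = as.Date(c(
  "2021-01-01", "2021-01-01", "2021-05-02", "2021-05-09", "2021-05-09",
  "2021-05-09", "2021-05-09", "2021-06-16", "2021-06-27")))

# For each row of d1, find the row of d2 (same ID) with the smallest gap in days
idx <- vapply(seq_len(nrow(d1)), function(i) {
  same_id <- which(d2$ID == d1$ID[i])
  if (length(same_id) == 0L) return(NA_integer_)  # no candidate for this ID
  gap <- abs(as.numeric(d2$Date[same_id] - d1$Date[i]))
  same_id[which.min(gap)]  # first minimum wins on ties (an assumption)
}, integer(1))

d1$nearest <- d2$Date[idx]
d1
```

On this data the nearest column should agree with the data.table result above. The loop compares every pair of rows, so it is only meant for checking small examples.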
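Another option, to match the row count (the join runs the other way here, so the result has one row per row of dataset2):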
dataset1[dataset2, on = .(ID, Date), roll = "nearest"][, Date := NULL][]
   ID      Date1    nearest
1:  A 2021-02-08 2021-01-01
2:  A 2021-02-08 2021-01-01
3:  A 2021-05-02 2021-05-02
4:  A 2021-05-02 2021-05-09
5:  A 2021-05-02 2021-05-09
6:  A 2021-05-02 2021-05-09
7:  A 2021-05-02 2021-05-09
8:  A 2021-06-02 2021-06-16
9:  A 2021-06-02 2021-06-27