r - How to combine two data.tables based on multiple conditions in R?

Tags: r date

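I have two data.tables that I want to combine based on whether a date in one table falls within a given time range in the other. dt1 holds an exit date per ID, and for each ID I want to look up in dt2 which values were valid on that exit date.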

library(data.table)

dt1 <- data.table(ID = 1:10,
                  exit = c("31/12/2010", "01/01/2021", "30/09/2010", "31/12/2015", "30/09/2010","31/10/2018", "01/02/2016", "01/05/2015", "01/09/2013", "01/01/2016"))

dt2 <- data.table(ID = c(1,2,2,2,3,5,6,6,7,8,8,9,10),
                  valid_from = c("01/01/2010", "01/01/2012", "01/01/2013", "01/12/2017", "01/05/2010", "01/04/2010", "01/05/2014", "01/11/2016", "01/01/2016", "15/04/2013", "01/01/2015", "15/02/2010", "01/04/2012"),
                  valid_until = c("01/01/2021", "31/12/2012", "30/11/2017", "01/01/2021", "01/01/2021", "01/01/2021", "31/10/2016", "01/01/2021", "01/01/2021", "31/12/2014", "01/05/2015", "01/01/2013", "01/01/2021"),
                  text1 = c("a", "a", "b", "c", "b", "b", "c", "a", "a", "b", "a", "c", "a"),
                  text2 = c("I", "I", "II", "I", "III", "I", "II", "III", "I", "II", "II", "I", "III"))

    ID       exit
 1:  1 31/12/2010
 2:  2 01/01/2021
 3:  3 30/09/2010
 4:  4 31/12/2015
 5:  5 30/09/2010
 6:  6 31/10/2018
 7:  7 01/02/2016
 8:  8 01/05/2015
 9:  9 01/09/2013
10: 10 01/01/2016

    ID valid_from valid_until text1 text2
 1:  1 01/01/2010  01/01/2021     a     I
 2:  2 01/01/2012  31/12/2012     a     I
 3:  2 01/01/2013  30/11/2017     b    II
 4:  2 01/12/2017  01/01/2021     c     I
 5:  3 01/05/2010  01/01/2021     b   III
 6:  5 01/04/2010  01/01/2021     b     I
 7:  6 01/05/2014  31/10/2016     c    II
 8:  6 01/11/2016  01/01/2021     a   III
 9:  7 01/01/2016  01/01/2021     a     I
10:  8 15/04/2013  31/12/2014     b    II
11:  8 01/01/2015  01/05/2015     a    II
12:  9 15/02/2010  01/01/2013     c     I
13: 10 01/04/2012  01/01/2021     a   III

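So, for each row of dt1, I want to return the values that were valid on the exit date. If the ID is not found in dt2 (as is the case for ID 4 in the sample data), NA should be returned.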

     ID       exit text1 text2
 1:  1 31/12/2010     a     I
 2:  2 01/01/2021     c     I
 3:  3 30/09/2010     b   III
 4:  4 31/12/2015  <NA>  <NA>
 5:  5 30/09/2010     b     I
 6:  6 31/10/2018     a   III
 7:  7 01/02/2016     a     I
 8:  8 01/05/2015     a    II
 9:  9 01/09/2013     c     I
10: 10 01/01/2016     a   III

Can anyone help me with this?

Best answer

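Since the inputs are data.tables, consider the fast data.table approach: convert the date columns to a date class, then use a non-equi join.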

library(data.table)
# convert the date columns to `IDate` class (input format is day/month/year)
dt1[, exit := as.IDate(exit, '%d/%m/%Y')]
dt2[, c('valid_from', 'valid_until') := .(as.IDate(valid_from, '%d/%m/%Y'),
                                          as.IDate(valid_until, '%d/%m/%Y'))]
# non-equi join: update dt1 by reference with the dt2 row whose
# [valid_from, valid_until] interval contains the exit date
dt1[dt2, c('text1', 'text2') := .(i.text1, i.text2),
    on = .(ID, exit >= valid_from, exit <= valid_until)]

-Output

> dt1
    ID       exit text1 text2
 1:  1 2010-12-31     a     I
 2:  2 2021-01-01     c     I
 3:  3 2010-09-30     b   III
 4:  4 2015-12-31  <NA>  <NA>
 5:  5 2010-09-30     b     I
 6:  6 2018-10-31     a   III
 7:  7 2016-02-01     a     I
 8:  8 2015-05-01     a    II
 9:  9 2013-09-01  <NA>  <NA>
10: 10 2016-01-01     a   III
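
The join above modifies dt1 in place. As a variation that is not part of the original answer, the same non-equi condition can be written with the roles of the two tables swapped, so that the result comes back as a new table and the inputs stay untouched. The sketch below uses only the rows for IDs 1, 2, 4 and 9 from the question's data; the names `exits`, `validity` and `res` are illustrative assumptions.

```r
library(data.table)

# Subset of the question's data (IDs 1, 2, 4 and 9).
# `exits`, `validity` and `res` are illustrative names, not from the post.
exits <- data.table(
  ID   = c(1L, 2L, 4L, 9L),
  exit = as.IDate(c("31/12/2010", "01/01/2021", "31/12/2015", "01/09/2013"),
                  format = "%d/%m/%Y")
)
validity <- data.table(
  ID          = c(1L, 2L, 2L, 2L, 9L),
  valid_from  = as.IDate(c("01/01/2010", "01/01/2012", "01/01/2013",
                           "01/12/2017", "15/02/2010"), format = "%d/%m/%Y"),
  valid_until = as.IDate(c("01/01/2021", "31/12/2012", "30/11/2017",
                           "01/01/2021", "01/01/2013"), format = "%d/%m/%Y"),
  text1       = c("a", "a", "b", "c", "c"),
  text2       = c("I", "I", "II", "I", "I")
)

# validity[exits, ...] keeps every row of `exits`; rows with no matching
# interval get NA in text1/text2. The i. prefix selects columns of `exits`.
res <- validity[exits,
                .(ID = i.ID, exit = i.exit, text1, text2),
                on = .(ID, valid_from <= exit, valid_until >= exit)]
print(res)
```

With this data, ID 4 gets NA because it has no rows in the validity table, and ID 9 gets NA because its exit date (01/09/2013) falls after its only valid_until (01/01/2013), which is also why row 9 is NA in the output above. One difference to keep in mind: if validity intervals for an ID overlap, this form returns one row per match, whereas the update by reference keeps only the last matching row.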

Regarding "r - How to combine two data.tables based on multiple conditions in R?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/69514627/
