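I have 10 date variables, and the assumption is that each subsequent variable falls on or after the previous one (I need to check whether this assumption holds). I want to compare TloCriminal1CrimeDetails1Di_0001 to TloCriminal1CrimeDetails2Di_0001, TloCriminal1CrimeDetails2Di_0001 to TloCriminal1CrimeDetails3Di_0001, ..., and TloCriminal1CrimeDetails9Di_0001 to TloCriminal1CrimeDetails10D_0001. Ideally, for each pair I would like to output variables named compare1to2, compare2to3, ..., compare9to10 that equal 1 if the second date in the pair falls on or after the first, and 0 otherwise. If that is not possible, an "overall" variable that equals 1 if any pair is "bad" (i.e., the second date is before the first) and 0 otherwise would suffice.
I tried to do this in SAS but concluded it was not possible, so I switched to R. I don't have a good starting point. Here is a snippet of my dataset. Thanks for your help!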
structure(list(TloCriminal1CrimeDetails1Di_0001 = structure(c(10197,
12205, 15979, 12586, NA, 13787, 12913, 14616), label = "TloCriminal1CrimeDetails1DispositionDate", format.sas = "DATE", class = "Date"),
TloCriminal1CrimeDetails2Di_0001 = structure(c(10148, NA,
15979, 12586, NA, 14516, 12913, 14665), label = "TloCriminal1CrimeDetails2DispositionDate", format.sas = "MMDDYY", class = "Date"),
TloCriminal1CrimeDetails3Di_0001 = structure(c(10148, NA,
NA, 12586, NA, 13787, 12913, 14665), label = "TloCriminal1CrimeDetails3DispositionDate", format.sas = "MMDDYY", class = "Date"),
TloCriminal1CrimeDetails4Di_0001 = structure(c(NA, NA, NA,
NA, NA, NA, 12913, 14670), label = "TloCriminal1CrimeDetails4DispositionDate", format.sas = "MMDDYY", class = "Date"),
TloCriminal1CrimeDetails5Di_0001 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), label = "TloCriminal1CrimeDetails5DispositionDate", format.sas = "MMDDYY", class = "Date"),
TloCriminal1CrimeDetails6Di_0001 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), label = "TloCriminal1CrimeDetails6DispositionDate", format.sas = "MMDDYY", class = "Date"),
TloCriminal1CrimeDetails7Di_0001 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), label = "TloCriminal1CrimeDetails7DispositionDate", format.sas = "MMDDYY", class = "Date"),
TloCriminal1CrimeDetails8Di_0001 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), label = "TloCriminal1CrimeDetails8DispositionDate", format.sas = "MMDDYY", class = "Date"),
TloCriminal1CrimeDetails9Di_0001 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), label = "TloCriminal1CrimeDetails9DispositionDate", format.sas = "MMDDYY", class = "Date"),
TloCriminal1CrimeDetails10D_0001 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), label = "TloCriminal1CrimeDetails10DispositionDate", format.sas = "MMDDYY", class = "Date")), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"), label = "CRIME_CHK")
Best answer
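Assuming the data above is stored in df1, we can build two copies of it, one without the first column and one without the last, and compare them element-wise in a vectorized way: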
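# Compare columns 2..n against columns 1..(n-1); the unary + turns TRUE/FALSE into 1/0
out <- +(df1[-1] >= df1[-ncol(df1)])
# Comparisons involving a missing date are NA; count them as 0
out[is.na(out)] <- FALSE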
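The question asks for indicators named compare1to2, compare2to3, and so on, attached to the data. A minimal sketch of one way to do that follows. It uses a small toy data frame with hypothetical column names d1 to d3 rather than the asker's data, and it keeps the choice made above of counting comparisons with a missing date as 0.

```r
# Toy data with three date columns (hypothetical names, not the asker's).
df1 <- data.frame(
  d1 = as.Date(c("2020-01-01", "2020-03-01", NA)),
  d2 = as.Date(c("2020-01-05", "2020-02-01", "2020-06-01")),
  d3 = as.Date(c("2020-01-05", NA, "2020-05-01"))
)

k <- ncol(df1)

# Element-wise comparison of columns 2..k against columns 1..(k-1).
cmp <- +(df1[-1] >= df1[-k])

# Comparisons involving a missing date are counted as 0 here.
cmp[is.na(cmp)] <- 0L

# Name each indicator after the pair it compares: compare1to2, compare2to3, ...
colnames(cmp) <- paste0("compare", seq_len(k - 1), "to", seq_len(k - 1) + 1)

# Attach the indicators to the original data.
result <- cbind(df1, cmp)
print(result)
```

With the 10 columns from the question, the same code would produce compare1to2 through compare9to10, provided the date columns are in order and are the only columns in the data frame.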
To summarize by column, the following returns TRUE for each pair in which no row has a 1:
colSums(out, na.rm = TRUE) == 0
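The question also mentions a fallback: a single "overall" variable that is 1 when any pair is out of order. The sketch below is one possible way to compute it per row. Note that it departs from the answer above in one respect, which is an assumption on my part: a pair with a missing date is ignored rather than counted as bad. The toy data and the names d1 to d3, out_of_order, and any_bad are hypothetical.

```r
# Toy data with three date columns (hypothetical names, not the asker's).
df1 <- data.frame(
  d1 = as.Date(c("2020-01-01", "2020-03-01", NA)),
  d2 = as.Date(c("2020-01-05", "2020-02-01", "2020-06-01")),
  d3 = as.Date(c("2020-01-05", NA, "2020-05-01"))
)

k <- ncol(df1)

# TRUE where the later column holds an earlier date; NA where either date is missing.
out_of_order <- df1[-1] < df1[-k]

# One flag per row: 1 if at least one pair is out of order, 0 otherwise.
# na.rm = TRUE means pairs with a missing date do not count as bad.
any_bad <- +(rowSums(out_of_order, na.rm = TRUE) > 0)

df1$any_bad <- any_bad
print(df1)
```

If a missing date should itself count as a violation, drop na.rm = TRUE and handle the resulting NA values explicitly instead.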
Source: "r - How to check whether subsequent date variables fall on or after the previous date variable", Stack Overflow: https://stackoverflow.com/questions/65512699/