How do I merge data.frames that have overlapping intervals?
Data frame 1
df1 <- read.table(textConnection(
" from to Lith Form
1 0 1.2 GRN BCM
2 1.2 5.0 GDI BDI
"), header=TRUE)
Data frame 2
df2 <- read.table(textConnection(
" from to Weath Str
1 0 1.1 HW ES
2 1.1 2.9 SW VS
3 2.9 5.0 HW ST
"), header=TRUE)
Resulting data frame
  from  to Weath Str Lith Form
1  0.0 1.1    HW  ES  GRN  BCM
2  1.1 1.2    SW  VS  GRN  BCM
3  1.2 2.9    SW  VS  GDI  BDI
4  2.9 5.0    HW  ST  GDI  BDI
Here is one way to do it. It is similar to eddi's answer (R cutting two data.frames based on intervals and merging), but it lets you have as many columns as you need in the data.frames.
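library(data.table)

# convert the data.frames to data.tables keyed on 'from'
dt1 <- data.table(df1, key='from')
dt2 <- data.table(df2, key='from')
# skeleton for the joined data.table: all interval boundaries from both inputs
dt <- data.table(from=sort(unique(c(dt1[,from], dt2[,from]))),
                 to=sort(unique(c(dt1[,to], dt2[,to]))),
                 key='from')
# function to join the skeleton with a data.table via a rolling join
j1 <- function(dt, dt1){
  dt3 <- dt1[dt, roll=TRUE]
  # keep the skeleton's 'to'; recent data.table versions name it i.to
  # (the original answer used to.1, the name given by older versions)
  dt3[, ':='(to=i.to, i.to=NULL)]
  setkey(dt3, from, to)
  return(dt3)
}
# merge the two data.tables on (from, to)
j1(dt, dt2)[j1(dt, dt1)]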
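For readers who want to see the skeleton-and-roll idea without data.table, here is a minimal base R sketch of the same logic. It is not the code from the answer above: it uses `findInterval` in place of the rolling join, builds the skeleton from all breakpoints at once, and assumes that both inputs are sorted by `from` and cover the same contiguous range. The names `breaks`, `skel`, `i1`, `i2` and `res` are made up for illustration.

```r
# Input tables from the question (assumed to be named df1 and df2).
df1 <- data.frame(from = c(0, 1.2), to = c(1.2, 5.0),
                  Lith = c("GRN", "GDI"), Form = c("BCM", "BDI"))
df2 <- data.frame(from = c(0, 1.1, 2.9), to = c(1.1, 2.9, 5.0),
                  Weath = c("HW", "SW", "HW"), Str = c("ES", "VS", "ST"))

# Skeleton: every distinct breakpoint from either table, in order.
breaks <- sort(unique(c(df1$from, df1$to, df2$from, df2$to)))
skel <- data.frame(from = head(breaks, -1), to = tail(breaks, -1))

# For each skeleton interval, pick the source row whose 'from' is the
# last one at or before the skeleton 'from' (the role roll=TRUE plays).
i1 <- findInterval(skel$from, df1$from)
i2 <- findInterval(skel$from, df2$from)

# Attach the attribute columns of both tables to the skeleton.
res <- cbind(skel,
             df2[i2, c("Weath", "Str")],
             df1[i1, c("Lith", "Form")])
rownames(res) <- NULL
res
```

Because the lookup only uses `from`, any number of attribute columns can be carried along by changing the column selections in the `cbind` call.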
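Overlap joins (also called interval joins) were recently implemented in data.table v1.9.3. With that, I think your task can be done as follows (assuming your data.frames are df1 and df2):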
require(data.table) ## 1.9.3+
setDT(df1) ## convert to data.table without copy
setDT(df2)
setkey(df2, from, to)
ans = foverlaps(df1, df2, type="any")
ans = ans[, `:=`(from = pmax(from, i.from), to = pmin(to, i.to))]
ans = ans[, `:=`(i.from=NULL, i.to=NULL)][from <= to]
#    from  to Weath Str Lith Form
# 1:  0.0 1.1    HW  ES  GRN  BCM
# 2:  1.1 1.2    SW  VS  GRN  BCM
# 3:  1.2 2.9    SW  VS  GDI  BDI
# 4:  2.9 5.0    HW  ST  GDI  BDI
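One thing to watch with the `from <= to` filter: `foverlaps` with `type="any"` treats intervals as closed, so two intervals that merely touch at a shared boundary count as overlapping and yield a zero-length row after the `pmax`/`pmin` step. The example data has no shared interior boundary, so this does not show up there. The sketch below uses hypothetical tables `a` and `b` (not from the question) that share a boundary at 2, and compares the two filters.

```r
library(data.table)

# Hypothetical inputs that share an interior boundary at 2.
a <- data.table(from = c(0, 2), to = c(2, 5), Lith = c("GRN", "GDI"))
b <- data.table(from = c(0, 2), to = c(2, 5), Weath = c("HW", "SW"))
setkey(b, from, to)

# Same steps as in the answer: overlap join, then clip to the intersection.
ov <- foverlaps(a, b, type = "any")
ov[, `:=`(from = pmax(from, i.from), to = pmin(to, i.to))]
ov[, `:=`(i.from = NULL, i.to = NULL)]

with_touching <- ov[from <= to]  # also keeps rows where from == to
strict        <- ov[from < to]   # keeps only intervals of positive length
```

If zero-length rows are not wanted in the result, filtering with `from < to` is one way to drop them.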