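I want to join two tables, Table 1 and Table 2 (on ColB on the left side and ColD on the right side), by the maximum matching string.
Table 1
Table 2
Output table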
Best answer
With fuzzyjoin, you can choose to join based on string distance:
library(fuzzyjoin)
library(dplyr)
stringdist_inner_join(df1, df2, by = c(ColB = "ColD"),
                      max_dist = 0.5, method = "jaccard") %>%
  select(-ColD)
  ColA                            ColB      ColC
1  123 C/O room Hanbur court vaux road Lightroom
2  456     House Malveri business park    Office
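If a looser threshold lets one row of df1 match several rows of df2, the join above returns all of them. A minimal sketch of keeping only the closest match per row follows. It assumes that ColA uniquely identifies each row of df1, and the column name dist and the variable name best are chosen here purely for illustration; distance_col is the fuzzyjoin argument that stores the computed distance in the result.

```r
library(fuzzyjoin)
library(dplyr)

# Same sample data as in the answer, rebuilt so the sketch is self-contained
df1 <- data.frame(ColA = c(123L, 456L),
                  ColB = c("C/O room Hanbur court vaux road",
                           "House Malveri business park"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ColD = c("Hanbur Court", "Malveri park"),
                  ColC = c("Lightroom", "Office"),
                  stringsAsFactors = FALSE)

best <- stringdist_inner_join(df1, df2, by = c(ColB = "ColD"),
                              max_dist = 0.5, method = "jaccard",
                              distance_col = "dist") %>%  # keep the distance
  group_by(ColA) %>%                                # assumption: ColA is a unique key
  slice_min(dist, n = 1, with_ties = FALSE) %>%     # smallest distance = best match
  ungroup() %>%
  select(-ColD, -dist)

print(best)
```

With the sample data each row has only one candidate below the threshold, so the result contains the same rows as the plain join; the extra step only matters once max_dist is raised.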
Data
df1 <- structure(list(ColA = c(123L, 456L),
ColB = c("C/O room Hanbur court vaux road",
"House Malveri business park")), class = "data.frame", row.names = c(NA,
-2L))
df2 <- structure(list(ColD = c("Hanbur Court", "Malveri park"),
ColC = c("Lightroom",
"Office")), class = "data.frame", row.names = c(NA, -2L))
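To see how the 0.5 threshold separates the pairs, the distances can be inspected directly. This is a sketch using the stringdist package (which fuzzyjoin relies on for its string distances), assuming the default q = 1, so that the Jaccard distance compares the sets of distinct characters in each string. The variable name d is illustrative.

```r
library(stringdist)

ColB <- c("C/O room Hanbur court vaux road",
          "House Malveri business park")
ColD <- c("Hanbur Court", "Malveri park")

# Rows correspond to ColB values, columns to ColD values.
# Jaccard distance = 1 - |shared q-grams| / |all q-grams|, here with q = 1.
d <- stringdistmatrix(ColB, ColD, method = "jaccard", q = 1)
print(round(d, 3))
```

The diagonal entries (the intended pairs) should fall below 0.5 and the off-diagonal entries above it, which is consistent with the join returning exactly two rows.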
Regarding "r - How to join two tables with the maximum matching string in R?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/71831994/