R: match on multiple data frame columns to fill in columns

I have a data frame, "df1", that looks like this:

structure(list(MAPS_code = c("SARI", "SABO", "SABO", "SABO", 
"ISLA", "TROP"), Location_code = c("LCP-", "LCP-", "LCP-", "LCP-", "LCP-",
"LCP-"), Contact = c("Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall"), Lat = c(NA, NA, NA, 
NA, NA, "51.23"), Long = c(NA, NA, NA, NA, NA, "-109.26")), row.names = c(NA, 6L), class = "data.frame")

A second data frame, "df2", looks like this:

structure(list(MAPS_code = c("SAFR", "SAGA", "ELPU", "ISLA", 
"SABO", "SATE", "QUST", "SARI", "PANA", "COPA", "LOAN", "GAPA", 
"MELI", "CAGO", "PINO", "GABO", "RIJA", "FILA", "AMIS"), Lat = c(8.765833, 
8.751389, 8.768611, 8.835833, 8.801111, 8.808333, 8.815, 8.827778, 
8.781667, 8.778333, 8.783333, 8.800833, 8.790278, 8.754444, 8.844444, 
8.801389, 8.786667, 8.785278, 8.952222), Long = c(-82.94277, 
-82.951111, -82.95, -82.963056, -82.917222, -82.924444, -82.923889, 
-82.924167, -82.896944, -82.955833, -82.938611, -82.972222, -82.967222, 
-82.925833, -82.97, -82.972222, -82.964722, -82.976111, -82.833333
), Contact = c("Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall"
), Location = c("LCP-", "LCP-", "LCP-", "LCP-", "LCP-", "LCP-", 
"LCP-", "LCP-", "LCP-", "LCP-", "LCP-", "LCP-", "LCP-", "LCP-", 
"LCP-", "LCP-", "LCP-", "LCP-", "LCP-")), class = "data.frame", row.names = c(NA, 
-19L))

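How can I fill each row of "Lat" and "Long" in df1 with "Lat" and "Long" from df2 when "Contact", "Location" (named "Location_code" in df1) and "MAPS_code" match between df1 and df2 for the corresponding rows? The result for df1 should look like this: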

structure(list(MAPS_code = c("SARI", "SABO", "SABO", "SABO", 
"ISLA", "TROP"), Location_code = c("LCP-", "LCP-", "LCP-", "LCP-", "LCP-", 
"LCP-"), Contact = c("Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall"), Lat = c("8.827778", "8.801111", "8.801111", 
"8.801111", "8.835833", "51.23"), Long = c("-82.92417", "-82.91722", "-82.91722", "-82.91722", "-82.96306", "-109.26")), row.names = c(NA, 6L), class = "data.frame")

Note that if Lat and Long already contain data, I don't want those values removed or overwritten with NA.

Best answer

Updated answer: we can use dplyr::coalesce to take the non-NA value from each of the two pairs of Lat and Long columns produced by the join:

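library(dplyr)

df1 %>%
  # df2 calls this column "Location", so align the name before joining
  rename(Location = Location_code) %>%
  # Keep every row of df1; the clashing Lat/Long columns get .x (df1) and .y (df2) suffixes
  left_join(df2, by = c('MAPS_code', 'Contact', 'Location')) %>%
  # df1's Lat/Long are character; convert them so they can be combined with df2's doubles
  mutate(across(ends_with('.x'), as.double)) %>%
  # Take the first non-NA value of each .x/.y pair, so existing df1 values are kept
  mutate(Lat = coalesce(!!!select(., starts_with('Lat'))),
         Long = coalesce(!!!select(., starts_with('Long')))) %>%
  # Drop the suffixed helper columns
  select(!contains('.'))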


  MAPS_code Location          Contact       Lat       Long
1      SARI     LCP- Chase Mendenhall  8.827778  -82.92417
2      SABO     LCP- Chase Mendenhall  8.801111  -82.91722
3      SABO     LCP- Chase Mendenhall  8.801111  -82.91722
4      SABO     LCP- Chase Mendenhall  8.801111  -82.91722
5      ISLA     LCP- Chase Mendenhall  8.835833  -82.96306
6      TROP     LCP- Chase Mendenhall 51.230000 -109.26000
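
For comparison, the same fill can be sketched in base R without a join. This is only an illustration, not part of the original answer: it rebuilds df1 and a three-row subset of df2 from the question, and it assumes that none of the key columns contain the "|" character used here to paste the keys together. The names key1, key2 and idx are made up for the example.

```r
# df1 as in the question; Lat/Long start out as character with NAs
df1 <- data.frame(
  MAPS_code = c("SARI", "SABO", "SABO", "SABO", "ISLA", "TROP"),
  Location_code = "LCP-",
  Contact = "Chase Mendenhall",
  Lat = c(NA, NA, NA, NA, NA, "51.23"),
  Long = c(NA, NA, NA, NA, NA, "-109.26"),
  stringsAsFactors = FALSE
)

# Only the df2 rows that matter for this example
df2 <- data.frame(
  MAPS_code = c("ISLA", "SABO", "SARI"),
  Lat = c(8.835833, 8.801111, 8.827778),
  Long = c(-82.963056, -82.917222, -82.924167),
  Contact = "Chase Mendenhall",
  Location = "LCP-",
  stringsAsFactors = FALSE
)

# Build one composite key per row (assumes "|" never occurs in the key columns)
key1 <- paste(df1$MAPS_code, df1$Contact, df1$Location_code, sep = "|")
key2 <- paste(df2$MAPS_code, df2$Contact, df2$Location, sep = "|")

# For each df1 row, the position of the first matching df2 row (NA if none)
idx <- match(key1, key2)

# Convert to numeric, then fill only the entries that are still NA
df1$Lat  <- as.numeric(df1$Lat)
df1$Long <- as.numeric(df1$Long)
df1$Lat  <- ifelse(is.na(df1$Lat),  df2$Lat[idx],  df1$Lat)
df1$Long <- ifelse(is.na(df1$Long), df2$Long[idx], df1$Long)

print(df1)
```

Because match() returns only the first hit, this version never duplicates rows of df1 even if df2 were to contain repeated keys, whereas a left join would. Rows without a match (TROP here) keep whatever value they already had.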

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/73911322/
