R: If the value in column 1 of table A matches the value in column 1 of table B, copy the value from column 2 of table B into table A

Tags: r

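Question: I have two data frames. Data frame A has multiple columns. Its column 1 holds an email address, and several rows can share the same address. Data frame B has the unique email addresses in column 1 and, in column 2, the number of times each address appears in data frame A. I essentially want to do a VLOOKUP: wherever the email address matches between the two tables, pull the count into a new column of data frame A. Can anyone help?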

Data

Table A
Column 1    Column 2    Column 3
a@a.com     home        123
a@a.com     house       456
b@b.com     tree        221

Table B
Column 1    Column 2 (Count)
a@a.com     2
b@b.com     1

The expected result is Table A with an additional column:
Column 1    Column 2    Column 3    Column 4
a@a.com     home        123         2
a@a.com     house       456         2
b@b.com     tree        221         1

Best answer

You don't need df2 to get the counts. You can compute them from df1 alone:

# Solution using the data.table package
library(data.table)
# Convert df1 to a data.table by reference and add the per-email row count
setDT(df1)[, count := .N, by = Column1]
df1
#    Column1 Column2 Column3 count
# 1: a@a.com    home     123     2
# 2: a@a.com   house     456     2
# 3: b@b.com    tree     221     1

# Solution using the dplyr package
library(dplyr)
# Group by email and record the size of each group on every row
df1 %>%
  group_by(Column1) %>%
  mutate(count = n())
# Source: local data frame [3 x 4]
# Groups: Column1
#
#   Column1 Column2 Column3 count
# 1 a@a.com    home     123     2
# 2 a@a.com   house     456     2
# 3 b@b.com    tree     221     1
# Data
df1 <- structure(list(Column1 = structure(c(1L, 1L, 2L), .Label = c("a@a.com",
"b@b.com"), class = "factor"), Column2 = structure(1:3, .Label = c("home",
"house", "tree"), class = "factor"), Column3 = c(123L, 456L,
221L)), .Names = c("Column1", "Column2", "Column3"), class = "data.frame", row.names = c(NA,
-3L))
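
The answer above recomputes the counts, but the question literally asks for a VLOOKUP from the second table. If the counts really have to come from data frame B, a minimal base R sketch could look like the following. The name df2, its column name Count, the result column Column4, and the placeholder addresses are assumptions taken from the tables in the question, not from the original answer.

```r
# Sketch only: df2 and its column name Count are assumed from Table B in the question.
df1 <- data.frame(
  Column1 = c("a@a.com", "a@a.com", "b@b.com"),
  Column2 = c("home", "house", "tree"),
  Column3 = c(123L, 456L, 221L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Column1 = c("a@a.com", "b@b.com"),
  Count   = c(2L, 1L),
  stringsAsFactors = FALSE
)

# VLOOKUP-style lookup: match() gives, for each email in df1,
# the row index of that email in df2 (NA when there is no match).
lookup <- df1
lookup$Column4 <- df2$Count[match(lookup$Column1, df2$Column1)]

# The same idea as a left join: all.x = TRUE keeps df1 rows that have
# no partner in df2. merge() sorts the result by the key column.
joined <- merge(df1, df2, by = "Column1", all.x = TRUE)
```

match() keeps the row order of df1 and leaves NA for addresses missing from df2, which is closest to how VLOOKUP behaves. merge() is more convenient when several columns need to be pulled across at once.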

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/30329536/
