python - How to update a DataFrame in Pandas Python

I have the following two DataFrames in pandas:

DF1:
AuthorID1  AuthorID2  Co-Authored
A1         A2         0
A1         A3         0
A1         A4         0
A2         A3         0

DF2:
AuthorID1  AuthorID2  Co-Authored
A1         A2         5
A2         A3         6
A6         A7         9

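I would like to find, without looping and comparing, the AuthorID1 and AuthorID2 pairs in DF2 that also exist in DF1, and update the column value accordingly. So the result for the two tables above would be: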

Resulting Updated DF1:
AuthorID1  AuthorID2  Co-Authored
A1         A2         5
A1         A3         0
A1         A4         0
A2         A3         6

Is there a fast way to do this? I have 7 million rows in DF1, so looping and comparing would take a very long time.

Update: please note that the last pair in DF2 (A6, A7) should not be part of the update to DF1, because it does not exist in DF1.

Best answer

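You can use update. Note that update aligns on index and column labels rather than on the AuthorID1/AuthorID2 values, so with a default integer index each row of df2 overwrites the row of df1 that has the same index label:

df1.update(df2)
print(df1)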
  AuthorID1 AuthorID2  Co-Authored
0        A1        A2          5.0
1        A2        A3          6.0
2        A1        A4          0.0
3        A2        A3          0.0

Example:

import pandas as pd

# Keys are listed in the order shown in the printed output below.
df1 = pd.DataFrame({'AuthorID1': {0: 'A1', 1: 'A1', 2: 'A1', 3: 'A2'},
                    'AuthorID2': {0: 'A2', 1: 'A3', 2: 'A4', 3: 'A3'},
                    'Co-Authored': {0: 0, 1: 0, 2: 0, 3: 0},
                    'new': {0: 7, 1: 8, 2: 1, 3: 3}})

df2 = pd.DataFrame({'AuthorID1': {0: 'A1', 1: 'A2'},
                    'AuthorID2': {0: 'A2', 1: 'A3'},
                    'Co-Authored': {0: 5, 1: 6}})

print(df1)
  AuthorID1 AuthorID2  Co-Authored  new
0        A1        A2            0    7
1        A1        A3            0    8
2        A1        A4            0    1
3        A2        A3            0    3

print(df2)
  AuthorID1 AuthorID2  Co-Authored
0        A1        A2            5
1        A2        A3            6

df1.update(df2)
print(df1)
  AuthorID1 AuthorID2  Co-Authored  new
0        A1        A2          5.0    7
1        A2        A3          6.0    8
2        A1        A4          0.0    1
3        A2        A3          0.0    3
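
Because update aligns rows on index labels, one way to match on the author pair instead of on row position is to move AuthorID1 and AuthorID2 into the index of both frames before updating. The following is a minimal sketch of that idea using the data from the question. It assumes each pair occurs at most once in df2, and the names keys, indexed, and result are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({
    "AuthorID1": ["A1", "A1", "A1", "A2"],
    "AuthorID2": ["A2", "A3", "A4", "A3"],
    "Co-Authored": [0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "AuthorID1": ["A1", "A2", "A6"],
    "AuthorID2": ["A2", "A3", "A7"],
    "Co-Authored": [5, 6, 9],
})

# Illustrative name: the two columns that together identify a row.
keys = ["AuthorID1", "AuthorID2"]

# Index both frames by the author pair so that update() aligns on it.
indexed = df1.set_index(keys)
indexed.update(df2.set_index(keys))  # (A6, A7) is absent from df1, so it is ignored

# Turn the key columns back into ordinary columns.
result = indexed.reset_index()
print(result)
```

As in the output above, Co-Authored may come back as a float column depending on the pandas version. If integers are needed, result["Co-Authored"].astype(int) should restore them.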

Edit based on comments:

I think you first need to filter df2 by df1 using isin:

df2 = df2[df2[['AuthorID1','AuthorID2']].isin(df1[['AuthorID1','AuthorID2']]).any(axis=1)]
print(df2)
  AuthorID1 AuthorID2  Co-Authored
0        A1        A2            5
1        A2        A3            6

df1.update(df2)
print(df1)
  AuthorID1 AuthorID2  Co-Authored
0        A1        A2          5.0
1        A2        A3          6.0
2        A1        A4          0.0
3        A2        A3          0.0
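
When isin is given a DataFrame, it compares values at matching index and column labels, so the filter above checks the rows position by position. If the goal is to keep the rows of df2 whose (AuthorID1, AuthorID2) pair appears anywhere in df1, one possible sketch is to compare the pairs as a MultiIndex. The names keys, pairs1, pairs2, mask, and matched are illustrative and do not come from the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    "AuthorID1": ["A1", "A1", "A1", "A2"],
    "AuthorID2": ["A2", "A3", "A4", "A3"],
    "Co-Authored": [0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "AuthorID1": ["A1", "A2", "A6"],
    "AuthorID2": ["A2", "A3", "A7"],
    "Co-Authored": [5, 6, 9],
})

keys = ["AuthorID1", "AuthorID2"]

# Represent each row's author pair as one MultiIndex entry.
pairs1 = pd.MultiIndex.from_frame(df1[keys])
pairs2 = pd.MultiIndex.from_frame(df2[keys])

# True where a df2 pair also occurs somewhere in df1, regardless of row position.
mask = pairs2.isin(pairs1)
matched = df2[mask]
print(matched)
```

Here the (A6, A7) row is dropped because that pair never occurs in df1, while (A2, A3) is kept even though it sits at a different row position in the two frames.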

Regarding python - How to update a DataFrame in Pandas Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38827835/
