python - Concatenate and fill missing columns in Pandas

I have two DataFrames as follows:

df1:

   id city district      date  value
0   1   bj       ft  2019/1/1      1
1   2   bj       ft  2019/1/1      5
2   3   sh       hp  2019/1/1      9
3   4   sh       hp  2019/1/1     13
4   5   sh       hp  2019/1/1     17

df2:

   id      date  value
0   3  2019/2/1      1
1   4  2019/2/1      5
2   5  2019/2/1      9
3   6  2019/2/1     13
4   7  2019/2/1     17

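I need to concatenate the two DataFrames and, matching on id, fill in city and district for the df2 rows using the values from df1. The expected result looks like this: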

   id city district      date  value
0   1   bj       ft  2019/1/1      1
1   2   bj       ft  2019/1/1      5
2   3   sh       hp  2019/1/1      9
3   4   sh       hp  2019/1/1     13
4   5   sh       hp  2019/1/1     17
5   3   sh       hp  2019/2/1      1
6   4   sh       hp  2019/2/1      5
7   5   sh       hp  2019/2/1      9
8   6  NaN      NaN  2019/2/1     13
9   7  NaN      NaN  2019/2/1     17

So far I have tried pd.concat([df1, df2], axis=0), which produces this:

  city      date district  id  value
0   bj  2019/1/1       ft   1      1
1   bj  2019/1/1       ft   2      5
2   sh  2019/1/1       hp   3      9
3   sh  2019/1/1       hp   4     13
4   sh  2019/1/1       hp   5     17
0  NaN  2019/2/1      NaN   3      1
1  NaN  2019/2/1      NaN   4      5
2  NaN  2019/2/1      NaN   5      9
3  NaN  2019/2/1      NaN   6     13
4  NaN  2019/2/1      NaN   7     17

Thanks!

Best answer

Use DataFrame.merge to left-join the city and district columns from df1 onto df2 by id, then concatenate:

# Left-join city/district from df1 onto df2 by id, then stack the result under df1
df = pd.concat([df1, df2.merge(df1[['id', 'city', 'district']], how='left', on='id')], sort=False)
print(df)
   id city district      date  value
0   1   bj       ft  2019/1/1      1
1   2   bj       ft  2019/1/1      5
2   3   sh       hp  2019/1/1      9
3   4   sh       hp  2019/1/1     13
4   5   sh       hp  2019/1/1     17
0   3   sh       hp  2019/2/1      1
1   4   sh       hp  2019/2/1      5
2   5   sh       hp  2019/2/1      9
3   6  NaN      NaN  2019/2/1     13
4   7  NaN      NaN  2019/2/1     17
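
Note that the output above keeps each frame's original index (0 to 4 twice), whereas the expected result in the question is numbered 0 to 9. The following self-contained sketch rebuilds the sample frames and passes ignore_index=True to pd.concat to renumber the rows. The names lookup, df2_filled and result are illustrative and not part of the original answer, and the drop_duplicates call is a precaution that assumes each id should map to a single city/district pair; it changes nothing for this sample data.

```python
import pandas as pd

# Sample frames rebuilt from the question
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "city": ["bj", "bj", "sh", "sh", "sh"],
    "district": ["ft", "ft", "hp", "hp", "hp"],
    "date": ["2019/1/1"] * 5,
    "value": [1, 5, 9, 13, 17],
})
df2 = pd.DataFrame({
    "id": [3, 4, 5, 6, 7],
    "date": ["2019/2/1"] * 5,
    "value": [1, 5, 9, 13, 17],
})

# One row per id, so the left join cannot multiply rows of df2
# (assumption: an id always belongs to a single city/district)
lookup = df1[["id", "city", "district"]].drop_duplicates("id")

# Left join keeps every df2 row; ids missing from df1 get NaN
df2_filled = df2.merge(lookup, how="left", on="id")

# Stack the frames and renumber the index from 0
result = pd.concat([df1, df2_filled], sort=False, ignore_index=True)
print(result)
```

With a left join, ids 6 and 7 stay in the result with NaN for city and district, because they have no match in df1.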

This question, python - Concatenate and fill missing columns in Pandas, comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/58705905/
