I have three Pandas DataFrames, df1, df2, and df3, as shown below:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'id' : ['one', 'two', 'three'], 'score': [56, 45, 78]})
df2 = pd.DataFrame({'id' : ['one', 'five', 'four'], 'score': [35, 81, 90]})
df3 = pd.DataFrame({'id' : ['five', 'two', 'six'], 'score': [23, 66, 42]})
How can I join these DataFrames on id and then place their score columns side by side? The desired output is:
#join_and_concatenate by id:
id     score(df1)  score(df2)  score(df3)
one    56          35          NaN
two    45          NaN         66
three  78          NaN         NaN
four   NaN         90          NaN
five   NaN         81          23
six    NaN         NaN         42
I found a related page that discusses merge(), concatenate(), and join(), but I'm not sure whether any of them does what I want.
Best answer
There may be a better way, but this should work:
In [48]: pd.merge(df1, df2, how='outer', on='id').merge(df3, how='outer', on='id')
Out[48]:
      id  score_x  score_y  score
0    one       56       35    NaN
1    two       45      NaN     66
2  three       78      NaN    NaN
3   five      NaN       81     23
4   four      NaN       90    NaN
5    six      NaN      NaN     42
[6 rows x 4 columns]
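The score_x and score_y names come from merge()'s default suffixes, which are only applied to column names that overlap between the two sides. One way to avoid them is to rename each score column before merging. The following is a minimal sketch of that idea, not part of the original answer; the `frames` dict, the `labelled` list, and the use of `functools.reduce` are illustrative choices.

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({'id': ['one', 'two', 'three'], 'score': [56, 45, 78]})
df2 = pd.DataFrame({'id': ['one', 'five', 'four'], 'score': [35, 81, 90]})
df3 = pd.DataFrame({'id': ['five', 'two', 'six'], 'score': [23, 66, 42]})

# Hypothetical helper mapping: label -> frame, used only to build column names.
frames = {'df1': df1, 'df2': df2, 'df3': df3}

# Give each score column a unique name so merge() has nothing to suffix.
labelled = [
    df.rename(columns={'score': f'score({name})'})
    for name, df in frames.items()
]

# Fold the list with successive outer merges on 'id'.
merged_all = reduce(
    lambda left, right: pd.merge(left, right, how='outer', on='id'),
    labelled,
)
print(merged_all)
```

Because the columns are already uniquely named, this should extend to any number of frames without a later rename step. Row order may differ between pandas versions.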
To get exactly the output you asked for:
In [54]: merged = pd.merge(df1, df2, how='outer', on='id').merge(df3, how='outer', on='id')
In [55]: merged.set_index('id').rename(columns={'score_x': 'score(df1)', 'score_y': 'score(df2)', 'score': 'score(df3)'})
Out[55]:
       score(df1)  score(df2)  score(df3)
id
one            56          35         NaN
two            45         NaN          66
three          78         NaN         NaN
five          NaN          81          23
four          NaN          90         NaN
six           NaN         NaN          42
[6 rows x 3 columns]
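Since the question also mentions concatenation, here is a hedged alternative sketch using pd.concat along the column axis. It is not from the original answer. It assumes each id appears at most once per frame (true for the sample data), because the ids are used as the index; the names `frames`, `columns`, and `result` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['one', 'two', 'three'], 'score': [56, 45, 78]})
df2 = pd.DataFrame({'id': ['one', 'five', 'four'], 'score': [35, 81, 90]})
df3 = pd.DataFrame({'id': ['five', 'two', 'six'], 'score': [23, 66, 42]})

# Hypothetical helper mapping: label -> frame, used only to build column names.
frames = {'df1': df1, 'df2': df2, 'df3': df3}

# Turn each frame into a Series indexed by id and named after its source.
columns = [
    df.set_index('id')['score'].rename(f'score({name})')
    for name, df in frames.items()
]

# axis=1 aligns the Series on their index; the default outer join keeps
# every id and fills the gaps with NaN.
result = pd.concat(columns, axis=1)
print(result)
```

This yields one column per source frame with id as the index, the same shape as the renamed merge result above.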
Source: the Stack Overflow question "python - Pandas: Join on unique column values and concatenate": https://stackoverflow.com/questions/20975526/