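I've seen several posts about this, but I can't work out how merge, join, and concat would handle it. How can I merge two DataFrames on their matching index values?
In: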
import pandas as pd
import numpy as np
row_x1 = ['a1','b1','c1']
row_x2 = ['a2','b2','c2']
row_x3 = ['a3','b3','c3']
row_x4 = ['a4','b4','c4']
index_arrays = [np.array(['first', 'first', 'second', 'second']), np.array(['one','two','one','two'])]
df1 = pd.DataFrame([row_x1,row_x2,row_x3,row_x4], columns=list('ABC'), index=index_arrays)
print(df1)
Out:
             A   B   C
first  one  a1  b1  c1
       two  a2  b2  c2
second one  a3  b3  c3
       two  a4  b4  c4
In:
row_y1 = ['d1','e1','f1']
row_y2 = ['d2','e2','f2']
df2 = pd.DataFrame([row_y1,row_y2], columns=list('DEF'), index=['first','second'])
print(df2)
Out:
         D   E   F
first   d1  e1  f1
second  d2  e2  f2
In other words, how do I merge them to get df3 (below)?
In:
row_x1 = ['a1','b1','c1']
row_x2 = ['a2','b2','c2']
row_x3 = ['a3','b3','c3']
row_x4 = ['a4','b4','c4']
row_y1 = ['d1','e1','f1']
row_y2 = ['d2','e2','f2']
row_z1 = row_x1 + row_y1
row_z2 = row_x2 + row_y1
row_z3 = row_x3 + row_y2
row_z4 = row_x4 + row_y2
df3 = pd.DataFrame([row_z1,row_z2,row_z3,row_z4], columns=list('ABCDEF'), index=index_arrays)
print(df3)
Out:
             A   B   C   D   E   F
first  one  a1  b1  c1  d1  e1  f1
       two  a2  b2  c2  d1  e1  f1
second one  a3  b3  c3  d2  e2  f2
       two  a4  b4  c4  d2  e2  f2
Best answer
Option 1
Use pd.DataFrame.reindex + pd.DataFrame.join.
reindex has a convenient level parameter that lets you expand across the index levels that are not present.
df1.join(df2.reindex(df1.index, level=0))
             A   B   C   D   E   F
first  one  a1  b1  c1  d1  e1  f1
       two  a2  b2  c2  d1  e1  f1
second one  a3  b3  c3  d2  e2  f2
       two  a4  b4  c4  d2  e2  f2
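To make the two steps of Option 1 easier to see, here is a minimal self-contained sketch that splits the one-liner in two. The variable names expanded and result are only illustrative and not part of the original answer; the data mirrors the question.

```python
import pandas as pd

# Rebuild the question's frames: df1 has a two-level index, df2 a single level.
idx = pd.MultiIndex.from_arrays([['first', 'first', 'second', 'second'],
                                 ['one', 'two', 'one', 'two']])
df1 = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2', 'c2'],
                    ['a3', 'b3', 'c3'], ['a4', 'b4', 'c4']],
                   columns=list('ABC'), index=idx)
df2 = pd.DataFrame([['d1', 'e1', 'f1'], ['d2', 'e2', 'f2']],
                   columns=list('DEF'), index=['first', 'second'])

# Step 1: broadcast df2 over df1's index, matching on the outer level (level=0).
# Each df2 row is repeated for every df1 row that shares its outer label.
expanded = df2.reindex(df1.index, level=0)

# Step 2: both frames now share the same MultiIndex, so a plain join aligns them.
result = df1.join(expanded)

print(expanded)
print(result)
```

Printing expanded on its own is a quick way to check that the broadcast did what you expected before joining.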
Option 2
You can rename your axes, and join will then work:
df1.rename_axis(['a', 'b']).join(df2.rename_axis('a'))
             A   B   C   D   E   F
a      b
first  one  a1  b1  c1  d1  e1  f1
       two  a2  b2  c2  d1  e1  f1
second one  a3  b3  c3  d2  e2  f2
       two  a4  b4  c4  d2  e2  f2
You can follow up with another rename_axis to get the desired result:
df1.rename_axis(['a', 'b']).join(df2.rename_axis('a')).rename_axis([None, None])
             A   B   C   D   E   F
first  one  a1  b1  c1  d1  e1  f1
       two  a2  b2  c2  d1  e1  f1
second one  a3  b3  c3  d2  e2  f2
       two  a4  b4  c4  d2  e2  f2
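Option 2 can also be written step by step. The sketch below assumes the same data as the question; the level names 'a' and 'b' are arbitrary labels, and the intermediate variables named1, named2, and joined are only illustrative. What seems to matter is that df2's index name matches the name of the df1 level it should align with.

```python
import pandas as pd

# Rebuild the question's frames with unnamed index levels.
idx = pd.MultiIndex.from_arrays([['first', 'first', 'second', 'second'],
                                 ['one', 'two', 'one', 'two']])
df1 = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2', 'c2'],
                    ['a3', 'b3', 'c3'], ['a4', 'b4', 'c4']],
                   columns=list('ABC'), index=idx)
df2 = pd.DataFrame([['d1', 'e1', 'f1'], ['d2', 'e2', 'f2']],
                   columns=list('DEF'), index=['first', 'second'])

# Give the shared level the same name ('a') in both frames.
named1 = df1.rename_axis(['a', 'b'])
named2 = df2.rename_axis('a')

# join matches df2's index against the df1 level that has the same name.
joined = named1.join(named2)

# Drop the temporary names again so the result looks like df3.
result = joined.rename_axis([None, None])

print(list(joined.index.names))
print(result)
```

If your index levels already carry meaningful names, you can skip the final rename_axis and keep them.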
For more on python - merging two DataFrames with a MultiIndex, see the similar question on Stack Overflow: https://stackoverflow.com/questions/46551951/