python - How to merge 2 complex data frames in python pandas?

I have 2 pandas DataFrames.

import pandas as pd

dictionary1 = {'match_up': ['1985_1116_1234', '1985_1116_1475', '1985_1234_1172', '1985_1475_2132', '1985_1242_1325'],
               'result': [1, 1, 0, 0, 1], 'year': [1985, 1985, 1985, 1985, 1985]}

dictionary2 = {'team': [1234, 1475, 2132, 1172, 1242, 1116, 1325], 'win_A_B': [0.667, 0.636, 0.621, 0.629, 0.615, 0.943, 0.763],
               'year': [1985, 1985, 1985, 1985, 1985, 1985, 1985]}

df1 = pd.DataFrame(dictionary1)

df2 = pd.DataFrame(dictionary2)

df1:
           match_up     result  year
    0   1985_1116_1234    1     1985
    1   1985_1116_1475    1     1985
    2   1985_1234_1172    0     1985
    3   1985_1475_2132    0     1985
    4   1985_1242_1325    1     1985

df2:
    team      win_A_B    year
    1234      0.667      1985
    1475      0.636      1985 
    2132      0.621      1985
    1172      0.629      1985
    1242      0.615      1985
    1116      0.943      1985
    1325      0.763      1985

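The team IDs embedded in the match_up column of df1 (the two numbers after the year) match the values in the team column of df2. The values in the team column of df2 are all unique.

I need to combine the 2 DataFrames above as follows: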

           match_up     result  year   team_A   team_B   win_A    win_B
    0   1985_1116_1234    1     1985    1116     1234    0.943    0.667
    1   1985_1116_1475    1     1985    1116     1475    0.943    0.636
    2   1985_1234_1172    0     1985    1234     1172    0.667    0.629
    3   1985_1475_2132    0     1985    1475     2132    0.636    0.621
    4   1985_1242_1325    1     1985    1242     1325    0.615    0.763

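I know I have already asked a similar question about pandas. I am new to pandas, so please bear with me if I ask questions like this.

Best answer

The following will work: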

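# Split each match_up on '_', drop the leading year, and convert the team IDs to int.
d_teams = pd.DataFrame([[int(y) for y in x.split('_')[1:]]
                        for x in df1.match_up], columns=['team_A', 'team_B'])
merged = pd.concat((df1, d_teams), axis=1)
# Index df2 by team so win_A_B can be looked up by team ID.
df2i = df2.set_index('team')
# .loc selects rows by label; reset_index realigns the result with merged's 0..n-1 index.
merged['win_A'] = df2i.loc[merged.team_A].reset_index().win_A_B
merged['win_B'] = df2i.loc[merged.team_B].reset_index().win_A_B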

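First, we create d_teams, a DataFrame built from the match_up column by splitting each value on '_' and converting the pieces to int. We throw away the year, since it is already included in df1, and keep only team_A and team_B. Then we create the merged DataFrame by concatenating d_teams with df1 column-wise.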
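The splitting step described above can also be written with pandas string methods instead of a list comprehension. This is only a sketch of an equivalent approach, not the answer's own code; the intermediate name parts is introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'match_up': ['1985_1116_1234', '1985_1116_1475', '1985_1234_1172',
                 '1985_1475_2132', '1985_1242_1325'],
    'result': [1, 1, 0, 0, 1],
    'year': [1985, 1985, 1985, 1985, 1985],
})

# Split "year_teamA_teamB" into three string columns labelled 0, 1, 2.
# `parts` is a hypothetical helper name, not part of the original answer.
parts = df1['match_up'].str.split('_', expand=True)

# Drop column 0 (the year, already in df1) and convert the team IDs to int.
d_teams = parts[[1, 2]].astype(int)
d_teams.columns = ['team_A', 'team_B']

# Attach the two team columns to df1 side by side.
merged = pd.concat([df1, d_teams], axis=1)
print(merged)
```

Both versions rely on every match_up value having exactly three '_'-separated pieces, as in the sample data.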

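Next, we create df2i, which is df2 indexed by team. We can then index it with merged.team_A or merged.team_B to get the win values. However, we do not want the result to be indexed by those teams, because column assignment aligns on the index, so we reset the index first to get back a 0..n-1 index that matches merged.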
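The lookup described above can alternatively be expressed with Series.map, which avoids the reset_index step because the result keeps the index of merged. The following is a minimal sketch under the assumption that team is unique in df2 (as the question states); the name win_by_team is made up for this example, and merged is rebuilt here with only the columns the lookup needs.

```python
import pandas as pd

df2 = pd.DataFrame({
    'team': [1234, 1475, 2132, 1172, 1242, 1116, 1325],
    'win_A_B': [0.667, 0.636, 0.621, 0.629, 0.615, 0.943, 0.763],
    'year': [1985] * 7,
})

# Stand-in for the merged frame after the team columns have been added.
merged = pd.DataFrame({
    'match_up': ['1985_1116_1234', '1985_1116_1475', '1985_1234_1172',
                 '1985_1475_2132', '1985_1242_1325'],
    'team_A': [1116, 1116, 1234, 1475, 1242],
    'team_B': [1234, 1475, 1172, 2132, 1325],
})

# Series of win values keyed by team ID (hypothetical helper name).
win_by_team = df2.set_index('team')['win_A_B']

# map() looks up each team ID in win_by_team and keeps merged's own index.
merged['win_A'] = merged['team_A'].map(win_by_team)
merged['win_B'] = merged['team_B'].map(win_by_team)
print(merged)
```

One difference in behavior: a team ID that is missing from df2 yields NaN with map, whereas a .loc lookup raises a KeyError.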

Regarding python - How to merge 2 complex data frames in python pandas?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29003137/