This is my data -
FROM TO DIRECTION AMOUNT
B A IN 100
A B OUT 200
A B IN 300
B A OUT 40
As output, I want to show a summary of who paid whom in total -
FROM TO AMOUNT
A B 300
B A 340
To clarify, A --> B is made up of rows 2 and 1 (IN means a transfer from TO to FROM, OUT means a transfer from FROM to TO).
I'm having trouble getting there with .groupby(). What I tried -
df.groupby(['FROM', 'TO', 'DIRECTION'])
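But of course this doesn't solve the problem, since it groups on the columns as written and ignores what DIRECTION means. Any help is appreciated.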
Best answer
The idea is to swap the values of FROM and TO wherever DIRECTION is IN, so that every row reads payer to payee:
mask = df['DIRECTION'] == 'IN'
df.loc[mask, ['TO', 'FROM']] = df.loc[mask, ['FROM', 'TO']].values
print (df)
FROM TO DIRECTION AMOUNT
0 A B IN 100
1 A B OUT 200
2 B A IN 300
3 B A OUT 40
Then aggregate with sum:
df = df.groupby(['FROM', 'TO'], as_index=False)['AMOUNT'].sum()
print (df)
FROM TO AMOUNT
0 A B 300
1 B A 340
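The two steps above can be put together as a self-contained sketch. The DataFrame is rebuilt from the question's sample data, and the helper name summarize_payments is hypothetical, introduced here only for illustration. It works on a copy, so the caller's frame is left untouched.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'FROM': ['B', 'A', 'A', 'B'],
    'TO': ['A', 'B', 'B', 'A'],
    'DIRECTION': ['IN', 'OUT', 'IN', 'OUT'],
    'AMOUNT': [100, 200, 300, 40],
})

def summarize_payments(frame):
    """Hypothetical helper: total paid per (payer, payee) pair."""
    out = frame.copy()  # avoid mutating the caller's DataFrame
    mask = out['DIRECTION'] == 'IN'
    # For IN rows the money flows TO -> FROM, so swap the two columns.
    # .values drops the labels, so the assignment is positional.
    out.loc[mask, ['TO', 'FROM']] = out.loc[mask, ['FROM', 'TO']].values
    return out.groupby(['FROM', 'TO'], as_index=False)['AMOUNT'].sum()

summary = summarize_payments(df)
print(summary)
```

Taking .values on the right-hand side matters: without it, pandas would align the assigned data on column labels and the swap would have no effect.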
If you don't want to modify the original DataFrame, the solution is very similar:
mask = df['DIRECTION'] == 'IN'
df1 = df[['TO','FROM']].mask(mask, df[['FROM','TO']].values)
# output is the same as above, only the order of the columns has changed
print (df1)
TO FROM
0 B A
1 B A
2 A B
3 A B
df2 = df['AMOUNT'].groupby([df1['FROM'], df1['TO']]).sum().reset_index()
print (df2)
FROM TO AMOUNT
0 A B 300
1 B A 340
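As a cross-check on the expected totals, the same IN/OUT rule can be written in plain Python without pandas. This is only a sketch for verifying the numbers, not part of the original answer; the names rows, payer, payee and totals are made up for the example.

```python
from collections import defaultdict

# (FROM, TO, DIRECTION, AMOUNT) rows copied from the question.
rows = [
    ('B', 'A', 'IN', 100),
    ('A', 'B', 'OUT', 200),
    ('A', 'B', 'IN', 300),
    ('B', 'A', 'OUT', 40),
]

totals = defaultdict(int)
for frm, to, direction, amount in rows:
    # IN: money moves from TO to FROM; OUT: from FROM to TO.
    payer, payee = (to, frm) if direction == 'IN' else (frm, to)
    totals[(payer, payee)] += amount

for (payer, payee), amount in sorted(totals.items()):
    print(payer, payee, amount)
```

This prints A B 300 and B A 340, matching the grouped result above.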
Regarding python - Pandas group by condition, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51593268/