python - pandas groupby and taking the difference between order types

Suppose I have a df like this:

client   order_type    amount
John     Buy           100
John     Sell          100
Jeff     Buy           100
Jeff     Buy           100
Aaron    Buy           100
Aaron    Sell          100
Aaron    Buy           100

If I do this:

df.groupby(['client','order_type'])['amount'].sum()

I get something like this (Jeff has no Sell rows, so only his Buy total appears):

client  order_type
Aaron   Buy           200
        Sell          100
Jeff    Buy           200
John    Buy           100
        Sell          100

How can I get a "Buy - Sell" column in a new data frame, like this:

Name      NetBuy
John      0
Jeff      200
Aaron     100

Best answer

Just map your order_type to a sign (+1 for Buy, -1 for Sell). There are many ways to do this, but I think this is the easiest to read:

df['sign'] = [1 if x == 'Buy' else -1 for x in df.order_type]
df['amount_adj'] = df.sign*df.amount
df.groupby(['client'])['amount_adj'].sum()

Output:

client
Aaron    100
Jeff     200
John       0

The same result as a one-liner, using the faster, vectorized np.where:

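import numpy as np

# Negate Sell amounts, then sum per client. The result is stored under a new
# name so that df is not overwritten by the resulting Series.
net_buy = df.assign(amount=np.where(df.order_type.eq('Sell'),
          df.amount*-1, df.amount)).groupby(['client'])['amount'].sum()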

Output:

client
Aaron    100
Jeff     200
John       0

This question on python - pandas groupby and taking the difference between order types comes from Stack Overflow: https://stackoverflow.com/questions/56117205/
