python - Groupby for selecting multiple columns in pandas

Tags: python pandas group-by

I have a pandas DataFrame df with 3 columns, say:

[IN]:df
[OUT]:

Tree Name   Planted by Govt   Planted by College
A               Yes                No 
B               Yes                No
C               Yes                No 
C               Yes                No
A               No                 No 
B               No                 Yes
B               Yes                Yes
B               Yes                No
B               Yes                No  

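Query:

How many trees of each tree name were planted by the government and not by the college? That is, Planted by Govt: Yes, Planted by College (private): No.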

Desired output:

1 Tree(s) 'A' were planted by govt and not by college
3 Tree(s) 'B' were planted by govt and not by college
2 Tree(s) 'C' were planted by govt and not by college

Can anyone help?

Best answer

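First create a boolean mask by comparing both columns and chaining the two comparisons with & (bitwise AND). Then convert the mask to integers and aggregate with sum, so that each True counts as 1: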

s = df['Planted by Govt'].eq('Yes') & df['Planted by College'].eq('No')
# Series.view is deprecated in recent pandas releases; astype('int8') gives the same 0/1 values
out = s.astype('int8').groupby(df['Tree Name']).sum()
#alternative
#out = s.astype(int).groupby(df['Tree Name']).sum()
print(out)
Tree Name
A    1
B    3
C    2
dtype: int8
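
To try this without the original data, here is a minimal self-contained sketch. The DataFrame is retyped by hand from the table in the question, and the names `mask` and `counts` are only illustrative, not from the original answer.

```python
import pandas as pd

# Rebuild the question's table by hand (values copied from the post).
df = pd.DataFrame({
    "Tree Name": ["A", "B", "C", "C", "A", "B", "B", "B", "B"],
    "Planted by Govt": ["Yes", "Yes", "Yes", "Yes", "No", "No", "Yes", "Yes", "Yes"],
    "Planted by College": ["No", "No", "No", "No", "No", "Yes", "Yes", "No", "No"],
})

# True only where the row is "Yes" for govt and "No" for college.
mask = df["Planted by Govt"].eq("Yes") & df["Planted by College"].eq("No")

# Summing 0/1 values per group counts the rows where the mask is True.
counts = mask.astype(int).groupby(df["Tree Name"]).sum()
print(counts)
```

Passing `df['Tree Name']` to `groupby` works here because the mask and the column share the same index, so the rows are aligned by label.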

Finally, use f-strings for the custom output:

for k, v in out.items():
    # quote the tree name so the text matches the desired output
    print(f"{v} Tree(s) '{k}' were planted by govt and not by college")

    1 Tree(s) 'A' were planted by govt and not by college
    3 Tree(s) 'B' were planted by govt and not by college
    2 Tree(s) 'C' were planted by govt and not by college

Another idea is to add the mask as a new column to the original DataFrame:

# astype('int8') replaces the deprecated Series.view('i1')
df['new'] = (df['Planted by Govt'].eq('Yes') & df['Planted by College'].eq('No')).astype('int8')
print(df)
  Tree Name Planted by Govt Planted by College  new
0         A             Yes                 No    1
1         B             Yes                 No    1
2         C             Yes                 No    1
3         C             Yes                 No    1
4         A              No                 No    0
5         B              No                Yes    0
6         B             Yes                Yes    0
7         B             Yes                 No    1
8         B             Yes                 No    1

out = df.groupby('Tree Name')['new'].sum()
print (out)
Tree Name
A    1
B    3
C    2
Name: new, dtype: int8
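
A further variation, which is not part of the original answer, is to filter the rows with the mask first and then count what is left with `size()`. The sketch below assumes the same hand-typed table as in the question. The names `mask`, `all_names`, and `filtered_counts` are illustrative.

```python
import pandas as pd

# Same table as in the question, retyped by hand.
df = pd.DataFrame({
    "Tree Name": ["A", "B", "C", "C", "A", "B", "B", "B", "B"],
    "Planted by Govt": ["Yes", "Yes", "Yes", "Yes", "No", "No", "Yes", "Yes", "Yes"],
    "Planted by College": ["No", "No", "No", "No", "No", "Yes", "Yes", "No", "No"],
})

mask = df["Planted by Govt"].eq("Yes") & df["Planted by College"].eq("No")

# Keep only the matching rows, then count rows per tree name.
filtered_counts = df[mask].groupby("Tree Name").size()

# A tree name with no matching rows would be missing after filtering,
# so reindex against every name in the table and fill the gaps with 0.
all_names = df["Tree Name"].unique()
filtered_counts = filtered_counts.reindex(all_names, fill_value=0)
print(filtered_counts)
```

The design difference is that summing a 0/1 column keeps every group automatically, while filtering first drops groups with zero matches unless they are restored, as the `reindex` step does here.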

Regarding "python - Groupby for selecting multiple columns in pandas", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58075544/
