The output above comes from:
df.groupby('croho subonderdeel').sum()
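I computed the total number of graduates per category, but I would also like to do this per column, for example to get the output for only the first column, "2011 MAN".
I tried the following: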
df.groupby('croho subonderdeel','2011 MAN').sum()
Then I got the following error:
ValueError: No axis named 2011 MAN for object type <class 'pandas.core.frame.DataFrame'>
Then I thought that perhaps I need to sum over "2011 MAN" rather than group twice. So I tried:
df.groupby('croho subonderdeel').sum('2011 MAN')
Then I got this error:
TypeError: f() takes 1 positional argument but 2 were given
Could someone explain to me why neither of the two approaches I tried works? Maybe then I can solve this myself.
Best answer
You need to specify the column in [] after the groupby, for example:
df.groupby('croho subonderdeel')['2011 MAN'].sum()
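To see why this form works where the two attempts in the question did not, it can help to look at the intermediate objects. What follows is a minimal sketch, not part of the original answer; the small `df` and the variable names are made up for illustration. The first argument of `groupby` holds the grouping key(s), so a second positional string is not treated as another column (judging by the error message in the question, that pandas version read it as `axis`), and `sum` does not accept a column name. The column is instead selected on the GroupBy object with `[]` before aggregating.

```python
import pandas as pd

# Hypothetical miniature of the asker's data (values are illustrative only).
df = pd.DataFrame({
    "croho subonderdeel": list("aaabbb"),
    "2011 MAN": [4, 5, 4, 5, 5, 4],
    "2012 MAN": [1, 3, 5, 7, 1, 0],
})

grouped = df.groupby("croho subonderdeel")  # GroupBy object: nothing is summed yet
selected = grouped["2011 MAN"]              # restrict the GroupBy to a single column
result = selected.sum()                     # Series indexed by the group key

print(type(grouped).__name__)
print(type(selected).__name__)
print(result)
```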
You can also specify multiple columns, passed as a list:
df.groupby('croho subonderdeel')[['2011 MAN', '2012 MAN']].sum()
Additionally, if you need the output as 2 columns, add the parameter as_index=False:
df.groupby('croho subonderdeel', as_index=False)['2011 MAN'].sum()
Or:
df.groupby('croho subonderdeel')['2011 MAN'].sum().reset_index()
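Both forms should produce the same two-column DataFrame, with the group key as an ordinary column. A small check, written as a sketch with a made-up `df` (the variable names are illustrative, not from the original answer):

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "croho subonderdeel": list("aaabbb"),
    "2011 MAN": [4, 5, 4, 5, 5, 4],
})

# The group key is kept as a regular column from the start.
via_as_index = df.groupby("croho subonderdeel", as_index=False)["2011 MAN"].sum()

# The group key first becomes the index, then is moved back into a column.
via_reset = df.groupby("croho subonderdeel")["2011 MAN"].sum().reset_index()

print(via_as_index)
print(via_reset)
```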
But if you want to group by 2 categories (2 columns), pass a list of column names to groupby:
df.groupby(['croho subonderdeel', 'another col'])['2011 MAN'].sum()
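With two grouping columns, the result is indexed by a MultiIndex with one level per key. The following is a minimal sketch (the `df` is a made-up stand-in, and `'another col'` is the placeholder column name used in the line above) of how an individual group's total could be looked up afterwards:

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "croho subonderdeel": list("aaabbb"),
    "another col": list("efefef"),
    "2011 MAN": [4, 5, 4, 5, 5, 4],
})

totals = df.groupby(["croho subonderdeel", "another col"])["2011 MAN"].sum()

print(totals.index.names)      # one index level per grouping column
print(totals.loc[("a", "e")])  # total for the combination a / e
print(totals.loc["b"])         # all sub-groups within b
```

If a flat table is easier to work with, `as_index=False` or `.reset_index()` turns both keys back into regular columns.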
Example:
import pandas as pd

df = pd.DataFrame({'another col': list('efefef'),
                   '2011 MAN': [4, 5, 4, 5, 5, 4],
                   '2011 WROUW': [7, 8, 9, 4, 2, 3],
                   '2012 MAN': [1, 3, 5, 7, 1, 0],
                   '2012 WROUW': [5, 3, 6, 9, 2, 4],
                   'croho subonderdeel': list('aaabbb')})
print(df)
   2011 MAN  2011 WROUW  2012 MAN  2012 WROUW another col croho subonderdeel
0         4           7         1           5           e                  a
1         5           8         3           3           f                  a
2         4           9         5           6           e                  a
3         5           4         7           9           f                  b
4         5           2         1           2           e                  b
5         4           3         0           4           f                  b
print(df.groupby('croho subonderdeel')['2011 MAN'].sum())
croho subonderdeel
a    13
b    14
Name: 2011 MAN, dtype: int64
print(df.groupby('croho subonderdeel', as_index=False)['2011 MAN'].sum())
  croho subonderdeel  2011 MAN
0                  a        13
1                  b        14
print(df.groupby('croho subonderdeel')['2011 MAN'].sum().reset_index())
  croho subonderdeel  2011 MAN
0                  a        13
1                  b        14
print(df.groupby('croho subonderdeel')[['2011 MAN', '2012 WROUW']].sum())
                    2011 MAN  2012 WROUW
croho subonderdeel
a                         13          14
b                         14          15
print(df.groupby('croho subonderdeel', as_index=False)[['2011 MAN', '2012 WROUW']].sum())
  croho subonderdeel  2011 MAN  2012 WROUW
0                  a        13          14
1                  b        14          15
print(df.groupby(['croho subonderdeel', 'another col'])['2011 MAN'].sum())
croho subonderdeel  another col
a                   e              8
                    f              5
b                   e              5
                    f              9
Name: 2011 MAN, dtype: int64
print(df.groupby(['croho subonderdeel', 'another col'], as_index=False)['2011 MAN'].sum())
  croho subonderdeel another col  2011 MAN
0                  a           e         8
1                  a           f         5
2                  b           e         5
3                  b           f         9
print(df.groupby(['croho subonderdeel', 'another col']).sum())
                                2011 MAN  2011 WROUW  2012 MAN  2012 WROUW
croho subonderdeel another col
a                  e                   8          16         6          11
                   f                   5           8         3           3
b                  e                   5           2         1           2
                   f                   9           7         7          13
print(df.groupby(['croho subonderdeel', 'another col'], as_index=False).sum())
  croho subonderdeel another col  2011 MAN  2011 WROUW  2012 MAN  2012 WROUW
0                  a           e         8          16         6          11
1                  a           f         5           8         3           3
2                  b           e         5           2         1           2
3                  b           f         9           7         7          13
Regarding "python - group by 2 categories and then sum", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47363270/