I have a DataFrame:
import pandas as pd
df = pd.DataFrame({'id':['one','one','two','two','three','three','three'],
'type':['current','saving','current','current','current','saving','credit']})
I want to count the ids whose only type is 'current'. The result should look like this:
only_current_id_list = ['two']
Best answer
I think you need:
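# Keep only the groups whose rows are all 'current', then list each id once.
# A stricter `.sum() == 1` check would reject 'two' here, because it has two 'current' rows.
L = df.groupby('id') \
.filter(lambda x: (x['type'] == 'current').all())['id'].unique().tolist()
print(L)
['two']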
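Since the question asks for a count as well as the list, here is a minimal sketch that gets both without a lambda. It assumes the sample frame from the question; the names `only_current` and `only_current_count` are mine, not from the original post.

```python
import pandas as pd

df = pd.DataFrame({'id': ['one', 'one', 'two', 'two', 'three', 'three', 'three'],
                   'type': ['current', 'saving', 'current', 'current',
                            'current', 'saving', 'credit']})

# One boolean per id: True when every row of that id is 'current'.
# sort=False keeps the ids in order of first appearance.
only_current = df['type'].eq('current').groupby(df['id'], sort=False).all()

# Ids that passed the check, and how many there are.
only_current_id_list = only_current[only_current].index.tolist()
only_current_count = int(only_current.sum())

print(only_current_id_list)
print(only_current_count)
```

Comparing first and grouping the resulting boolean Series avoids calling a Python function once per group, which may matter on larger frames.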
Edit:
df = pd.DataFrame({'id':['one','one','two','three','three','three'],'type':['current','current','current','current','saving','credit']})
print(df)
      id     type
0    one  current
1    one  current
2    two  current
3  three  current
4  three   saving
5  three   credit
# Ids that have exactly one row, and that row is 'current':
L = df.groupby('id') \
.filter(lambda x: (x['type'] == 'current').all() and
(x['type'] == 'current').sum() == 1)['id'].tolist()
print(L)
['two']
# Ids whose rows are all 'current', no matter how many rows they have:
L = df.groupby('id') \
.filter(lambda x: (x['type'] == 'current').all())['id'].unique().tolist()
print(L)
['one', 'two']
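The two variants above differ only in whether the group size matters. As a rough sketch, assuming the frame from this edit, both lists can be derived from the same grouped boolean Series; the variable names here are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'id': ['one', 'one', 'two', 'three', 'three', 'three'],
                   'type': ['current', 'current', 'current',
                            'current', 'saving', 'credit']})

# Group the 'is current' flags by id, keeping first-appearance order.
grouped = df['type'].eq('current').groupby(df['id'], sort=False)
all_current = grouped.all()   # True when every row of the id is 'current'
n_rows = grouped.size()       # number of rows per id

# Any number of rows, all of them 'current'.
any_count_ids = all_current[all_current].index.tolist()

# Exactly one row, and that row is 'current'.
exactly_one_ids = all_current[all_current & (n_rows == 1)].index.tolist()

print(any_count_ids)
print(exactly_one_ids)
```

Which list you want depends on whether an id with several 'current' rows should still count as "only current".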
For more on counting frequencies in Python groupby results, see this similar question on Stack Overflow: https://stackoverflow.com/questions/45962154/