I have this DataFrame:
import pandas as pd
df = pd.DataFrame({'user': [1,1,2,2,2,3,3,3,3,3,4,4],
                   'date': ['1995-09-01','1995-09-02','1995-10-03','1995-10-04','1995-10-05','1995-11-07','1995-11-08','1995-11-09','1995-11-10','1995-11-15','1995-12-18','1995-12-20'],
                   'type': ['a','a','b','a','c','a','b','a','b','b','a','b']})
This gives me:
user date type
1 1995-09-01 a
1 1995-09-02 a
2 1995-10-03 b
2 1995-10-04 a
2 1995-10-05 c
3 1995-11-07 a
3 1995-11-08 b
3 1995-11-09 a
3 1995-11-10 b
3 1995-11-15 b
4 1995-12-18 a
4 1995-12-20 b
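I want to create a new column that holds, for every row, the number of 'a' values in the 'type' column within that row's 'user' group.
This is the expected result: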
user date type ct_a
1 1995-09-01 a 2
1 1995-09-02 a 2
2 1995-10-03 b 1
2 1995-10-04 a 1
2 1995-10-05 c 1
3 1995-11-07 a 2
3 1995-11-08 b 2
3 1995-11-09 a 2
3 1995-11-10 b 2
3 1995-11-15 b 2
4 1995-12-18 a 1
4 1995-12-20 b 1
I tried the following, but it did not work:
df['ct_a'] = df.groupby('user')[df['type']== 'a'].transform('count')
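Best answer
Mask the non-a values in the type column, then groupby on user and transform with count. Because count ignores missing values, only the 'a' rows are counted: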
df['ct_a'] = df['type'].mask(lambda x: x.ne('a'))\
.groupby(df['user']).transform('count')
user date type ct_a
0 1 1995-09-01 a 2
1 1 1995-09-02 a 2
2 2 1995-10-03 b 1
3 2 1995-10-04 a 1
4 2 1995-10-05 c 1
5 3 1995-11-07 a 2
6 3 1995-11-08 b 2
7 3 1995-11-09 a 2
8 3 1995-11-10 b 2
9 3 1995-11-15 b 2
10 4 1995-12-18 a 1
11 4 1995-12-20 b 1
Regarding "python - Groupby + count specific items (not all), put the result in a new column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64139634/