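I need to extract the "mode" of the brands each client purchased. My problem is that when there is a tie between brands, the result is neither a single value nor a list.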
import pandas as pd
import numpy as np
data = pd.DataFrame({
'status' : ['pending', 'pending','pending', 'canceled','canceled','canceled', 'confirmed', 'confirmed','confirmed'],
'clientId' : ['A', 'B', 'C', 'A', 'D', 'C', 'A', 'B','C'],
'partner' : ['A', np.nan,'C', 'A',np.nan,'C', 'A', np.nan,'C'],
'product' : ['afiliates', 'pre-paid', 'giftcard','afiliates', 'pre-paid', 'giftcard','afiliates', 'pre-paid', 'giftcard'],
'brand' : ['brand_1', 'brand_2', 'brand_3','brand_1', 'brand_2', 'brand_3','brand_1', 'brand_3', 'brand_3'],
'gmv' : [100,100,100,100,100,100,100,100,100]})
data = data.astype({'partner':'category','status':'category','product':'category', 'brand':'category'})
sumary = data.groupby(['clientId', 'product'], observed=True).aggregate({'brand':pd.Series.mode})
sumary = sumary.unstack()
sumary
What is a good way to handle the "pre-paid" case for client "B"?
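To illustrate where the tie comes from, here is a minimal sketch that isolates client "B"'s two "pre-paid" rows from the data above. The names tied and modes are introduced only for this illustration. Series.mode always returns a Series, and on a tie that Series holds every tied value, so there is no single scalar for the aggregation to return.

```python
import pandas as pd

# Client B's two 'pre-paid' purchases from the example data, one per brand.
tied = pd.Series(['brand_2', 'brand_3'])

# mode() returns a Series; with a tie it contains all tied values, sorted.
modes = tied.mode()

print(type(modes).__name__)  # Series
print(modes.tolist())        # ['brand_2', 'brand_3']
```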
Best answer
If you want a list as output, convert explicitly with tolist:
sumary = (data.groupby(['clientId', 'product'], observed=True)
.aggregate({'brand': lambda x: x.mode().tolist()})
)
sumary = sumary.unstack()
Alternatively, use multimode from the standard-library statistics module (Python 3.8+):
from statistics import multimode
sumary = (data.groupby(['clientId', 'product'], observed=True)
.aggregate({'brand':multimode})
)
sumary = sumary.unstack()
Output:
              brand
product   afiliates   giftcard            pre-paid
clientId
A         [brand_1]        NaN                 NaN
B               NaN        NaN  [brand_2, brand_3]
C               NaN  [brand_3]                 NaN
D               NaN        NaN           [brand_2]
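If a single value per cell is preferred over a list, one possible variation (not part of the original answer) is to join the tied brands into one string. The sketch below uses a reduced copy of the example data; joined_modes and summary_joined are hypothetical names introduced for illustration.

```python
import pandas as pd

# Reduced copy of the example data: client B has a tie for 'pre-paid'.
data = pd.DataFrame({
    'clientId': ['A', 'B', 'B', 'C', 'D'],
    'product': ['afiliates', 'pre-paid', 'pre-paid', 'giftcard', 'pre-paid'],
    'brand': ['brand_1', 'brand_2', 'brand_3', 'brand_3', 'brand_2'],
})

def joined_modes(brands):
    # Hypothetical helper: collapse all tied modes into one string.
    return ', '.join(brands.mode().astype(str))

summary_joined = (data.groupby(['clientId', 'product'])['brand']
                      .agg(joined_modes)
                      .unstack())

print(summary_joined.loc['B', 'pre-paid'])   # brand_2, brand_3
print(summary_joined.loc['A', 'afiliates'])  # brand_1
```

Taking only the first mode with brands.mode().iloc[0] would also yield a scalar, but it silently discards the tie, so joining is the safer default when ties matter.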
For more on pandas groupby mode and how to handle tied values, see the similar question on Stack Overflow: https://stackoverflow.com/questions/77661382/