How to convert row values into columns of purchase counts without using the customer index
Data:
customer fruits veggies grocery
A apple carrot brush
A apple carrot brush
A apple onion soap
A banana onion soap
B mango onion soap
B mango carrot brush
B banana tomato powder
B banana tomato powder
C apple carrot powder
C mango carrot soap
C mango tomato soap
C banana tomato brush
D banana carrot brush
D banana onion soap
D apple tomato powder
D apple tomato powder
Expected output:
customer apple mango banana carrot onion tomato brush soap powder
A 3 0 1 2 2 0 2 2 0
B 0 2 2 1 1 2 1 1 2
C 1 2 1 2 0 2 1 2 1
D 2 0 2 1 1 2 1 1 2
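Best answer
Option 1
Use set_index + stack + get_dummies, then sum the indicator columns per customer with groupby(level=0) (the older sum(level=0) shorthand no longer exists in pandas 2.0 and later):
df.set_index('customer').stack().str.get_dummies().groupby(level=0).sum()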
apple banana brush carrot mango onion powder soap tomato
customer
A 3 1 2 2 0 2 0 2 0
B 0 2 1 1 2 1 2 1 2
C 1 1 1 2 2 0 1 2 2
D 2 2 1 1 0 1 2 1 2
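For readers who want to run Option 1 end to end, here is a minimal sketch that rebuilds the sample data from the question and splits the chain into steps. The variable names (rows, stacked, counts) are illustrative and not part of the original answer, and the per-customer sum is written as groupby(level=0).sum(), which should behave the same on older and newer pandas versions.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    ("A", "apple", "carrot", "brush"),
    ("A", "apple", "carrot", "brush"),
    ("A", "apple", "onion", "soap"),
    ("A", "banana", "onion", "soap"),
    ("B", "mango", "onion", "soap"),
    ("B", "mango", "carrot", "brush"),
    ("B", "banana", "tomato", "powder"),
    ("B", "banana", "tomato", "powder"),
    ("C", "apple", "carrot", "powder"),
    ("C", "mango", "carrot", "soap"),
    ("C", "mango", "tomato", "soap"),
    ("C", "banana", "tomato", "brush"),
    ("D", "banana", "carrot", "brush"),
    ("D", "banana", "onion", "soap"),
    ("D", "apple", "tomato", "powder"),
    ("D", "apple", "tomato", "powder"),
]
df = pd.DataFrame(rows, columns=["customer", "fruits", "veggies", "grocery"])

# Step 1: stack the three item columns into one long Series
# indexed by (customer, original column name).
stacked = df.set_index("customer").stack()

# Step 2: build one 0/1 indicator column per distinct item.
dummies = stacked.str.get_dummies()

# Step 3: add the indicators up per customer (index level 0).
counts = dummies.groupby(level=0).sum()
print(counts)
```

The columns come out in alphabetical order because get_dummies sorts the distinct values, which is why the layout differs from the expected output in the question.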
Option 2
Another, slightly cleaner option uses pd.crosstab:
v = df.set_index('customer').stack()
pd.crosstab(v.index.get_level_values(0), v.values)
col_0 apple banana brush carrot mango onion powder soap tomato
row_0
A 3 1 2 2 0 2 0 2 0
B 0 2 1 1 2 1 2 1 2
C 1 1 1 2 2 0 1 2 2
D 2 2 1 1 0 1 2 1 2
crosstab is a specialized version of pivot_table and is well suited to this kind of tabulation.
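Both options return the items in alphabetical order with customer as the index, whereas the expected output lists the columns in a specific order with customer as an ordinary column. The sketch below is one possible way to close that gap. It is a variant rather than the answer's exact code: it reshapes with melt instead of set_index + stack, and the names long, order and result are assumptions introduced for illustration.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    ("A", "apple", "carrot", "brush"),
    ("A", "apple", "carrot", "brush"),
    ("A", "apple", "onion", "soap"),
    ("A", "banana", "onion", "soap"),
    ("B", "mango", "onion", "soap"),
    ("B", "mango", "carrot", "brush"),
    ("B", "banana", "tomato", "powder"),
    ("B", "banana", "tomato", "powder"),
    ("C", "apple", "carrot", "powder"),
    ("C", "mango", "carrot", "soap"),
    ("C", "mango", "tomato", "soap"),
    ("C", "banana", "tomato", "brush"),
    ("D", "banana", "carrot", "brush"),
    ("D", "banana", "onion", "soap"),
    ("D", "apple", "tomato", "powder"),
    ("D", "apple", "tomato", "powder"),
]
df = pd.DataFrame(rows, columns=["customer", "fruits", "veggies", "grocery"])

# Long format: one row per (customer, purchased item).
long = df.melt(id_vars="customer", value_name="item")

# Count how often each item occurs for each customer.
table = pd.crosstab(long["customer"], long["item"])

# Column order taken from the expected output in the question.
order = ["apple", "mango", "banana", "carrot", "onion",
         "tomato", "brush", "soap", "powder"]

result = (
    table.reindex(columns=order, fill_value=0)  # reorder the item columns
         .rename_axis(columns=None)             # drop the "item" axis label
         .reset_index()                         # make customer a regular column
)
print(result)
```

Passing fill_value=0 to reindex means an item listed in order that nobody bought would show up as a column of zeros instead of NaN.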
This answer comes from a similar Stack Overflow question about Python, how to convert rows into columns based on the frequency of their values: https://stackoverflow.com/questions/48883138/