I want to combine text from several columns across groups of rows. I know that join concatenates values into a single cell, but it works row by row and does not group by product. I want a single cell to hold, for each product and plan, all the brands and continents in that group. For example: within the laptop product, what are the brands and continents in plan 1, in plan 2, in plan 3, and so on?
df['all'] = df[['product', 'brand', ...]].agg('-'.join, axis=1)
Here is the dataframe:
product brand continent plan
laptop lg n_am P1
laptop samsung n_am P1
laptop apple eu P3
tv lg eu P3
tv samsung eu P2
tv apple n_am P2
tv samsung n_am P1
cellphone lg eu P3
cellphone apple n_am P2
cellphone apple eu P1
Expected dataframe:
product brand continent plan all
laptop lg n_am P1 product: laptop; plan: P1; brand: lg-samsung; continent: n_am
laptop samsung n_am P1 product: laptop; plan: P1; brand: lg-samsung; continent: n_am
laptop apple eu P3 product: laptop; plan: P3; brand: apple; continent: eu
tv lg eu P3 product: tv; plan: P3; brand: lg; continent: eu
tv samsung eu P2 product: tv; plan: P2; brand: samsung-apple; continent: n_am-eu
tv apple n_am P2 product: tv; plan: P2; brand: samsung-apple; continent: n_am-eu
tv samsung n_am P1 product: tv; plan: P1; brand: samsung; continent: n_am
cellphone lg eu P3 product: cellphone; plan: P3; brand: lg; continent: eu
cellphone apple n_am P2 product: cellphone; plan: P2; brand: apple; continent: n_am
cellphone apple eu P1 product: cellphone; plan: P1; brand: apple; continent: eu
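Best answer
Use groupby with sort=False together with Series.unique. Then join the result back to the key columns, use a list comprehension to build the string from each row's dict, and assign it to the all column.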
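# Join the unique values per (product, plan) group in first-seen order, then convert rows to dicts
l1 = (df[['product', 'plan']].join(df.groupby(['product', 'plan'], sort=False)
      .transform(lambda x: '-'.join(x.unique()))).to_dict('records'))
# Build "key:value" pairs for each row, separated by ';'
df['all'] = [';'.join(k + ':' + v for k, v in x.items()) for x in l1]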
Out[528]:
product brand continent plan all
0 laptop lg n_am P1 product:laptop;plan:P1;brand:lg-samsung;continent:n_am
1 laptop samsung n_am P1 product:laptop;plan:P1;brand:lg-samsung;continent:n_am
2 laptop apple eu P3 product:laptop;plan:P3;brand:apple;continent:eu
3 tv lg eu P3 product:tv;plan:P3;brand:lg;continent:eu
4 tv samsung eu P2 product:tv;plan:P2;brand:samsung-apple;continent:eu-n_am
5 tv apple n_am P2 product:tv;plan:P2;brand:samsung-apple;continent:eu-n_am
6 tv samsung n_am P1 product:tv;plan:P1;brand:samsung;continent:n_am
7 cellphone lg eu P3 product:cellphone;plan:P3;brand:lg;continent:eu
8 cellphone apple n_am P2 product:cellphone;plan:P2;brand:apple;continent:n_am
9 cellphone apple eu P1 product:cellphone;plan:P1;brand:apple;continent:eu
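For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample dataframe from the question and uses `": "` and `"; "` as separators so the result is closer to the spacing in the expected output. The helper name `join_unique` and the per-column `transform` calls are my own choices for illustration, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "product": ["laptop", "laptop", "laptop", "tv", "tv", "tv", "tv",
                "cellphone", "cellphone", "cellphone"],
    "brand": ["lg", "samsung", "apple", "lg", "samsung", "apple", "samsung",
              "lg", "apple", "apple"],
    "continent": ["n_am", "n_am", "eu", "eu", "eu", "n_am", "n_am",
                  "eu", "n_am", "eu"],
    "plan": ["P1", "P1", "P3", "P3", "P2", "P2", "P1", "P3", "P2", "P1"],
})

keys = ["product", "plan"]


def join_unique(s):
    """Join the distinct values of a group with '-', in first-seen order."""
    return "-".join(s.unique())


# transform broadcasts the joined string back to every row of its group,
# so the results line up with the original index.
groups = df.groupby(keys, sort=False)
brand = groups["brand"].transform(join_unique)
continent = groups["continent"].transform(join_unique)

# Assemble the label with plain string concatenation on the columns.
df["all"] = (
    "product: " + df["product"]
    + "; plan: " + df["plan"]
    + "; brand: " + brand
    + "; continent: " + continent
)

print(df["all"].to_list())
```

Note that the joined values follow the order in which they first appear within each group, so the tv / P2 rows come out as `continent: eu-n_am` rather than the `n_am-eu` shown in the question's expected output. If a specific order matters, sorting the unique values before joining would be one option.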
Regarding python - combining multiple columns of text from multiple rows in Pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68139013/