python - Combine multiple columns of text across multiple rows in Pandas

I want to combine text from several columns across groups of rows. I know that join can concatenate values into a single cell, but the attempt below works row by row and does not group by product. I want a single cell that holds, for each product and plan, all of the brands and continents in that group. For example: for the laptop product, which brands and continents fall under plan P1, under P2, under P3, and so on?

df['all'] = df[['product', 'brand', ...]].agg('-'.join, axis=1)

Here is the dataframe:

product     brand     continent    plan
laptop         lg       n_am         P1
laptop    samsung       n_am         P1
laptop      apple         eu         P3
tv             lg         eu         P3
tv        samsung         eu         P2
tv          apple       n_am         P2
tv        samsung       n_am         P1
cellphone      lg         eu         P3
cellphone   apple       n_am         P2
cellphone   apple         eu         P1

Expected dataframe:

product     brand     continent    plan   all
laptop         lg       n_am         P1   product: laptop; plan: P1; brand: lg-samsung; continent: n_am
laptop    samsung       n_am         P1   product: laptop; plan: P1; brand: lg-samsung; continent: n_am
laptop      apple         eu         P3   product: laptop; plan: P3; brand: apple; continent: eu
tv             lg         eu         P3   product: tv;   plan: P3; brand: lg; continent: eu
tv        samsung         eu         P2   product: tv;   plan: P2; brand: samsung-apple; continent: n_am-eu
tv          apple       n_am         P2   product: tv;   plan: P2; brand: samsung-apple; continent: n_am-eu
tv        samsung       n_am         P1   product: tv;   plan: P1; brand: samsung; continent: n_am
cellphone      lg         eu         P3   product: cellphone; plan: P3; brand: lg; continent: eu
cellphone   apple       n_am         P2   product: cellphone; plan: P2; brand: apple; continent: n_am
cellphone   apple         eu         P1   product: cellphone; plan: P1; brand: apple; continent: eu

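Best answer

Use groupby with sort=False together with Series.unique, so that groups and values keep their order of first appearance. Then join the result back onto the key columns, use a list comprehension to build a string from each row's dict, and assign the result to the all column.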

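# Per (product, plan) group, join the unique values of the remaining columns with '-'
l1 = (df[['product','plan']].join(df.groupby(['product', 'plan'], sort=False)
                                    .transform(lambda x: '-'.join(x.unique()))).to_dict('records'))
# Build one 'key:value;...' string per row from its dict
df['all'] = [';'.join(k+':'+v for k, v in x.items()) for x in l1]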

Out[528]:
     product    brand continent plan                                                       all
0     laptop       lg      n_am   P1    product:laptop;plan:P1;brand:lg-samsung;continent:n_am
1     laptop  samsung      n_am   P1    product:laptop;plan:P1;brand:lg-samsung;continent:n_am
2     laptop    apple        eu   P3           product:laptop;plan:P3;brand:apple;continent:eu
3         tv       lg        eu   P3                  product:tv;plan:P3;brand:lg;continent:eu
4         tv  samsung        eu   P2  product:tv;plan:P2;brand:samsung-apple;continent:eu-n_am
5         tv    apple      n_am   P2  product:tv;plan:P2;brand:samsung-apple;continent:eu-n_am
6         tv  samsung      n_am   P1           product:tv;plan:P1;brand:samsung;continent:n_am
7  cellphone       lg        eu   P3           product:cellphone;plan:P3;brand:lg;continent:eu
8  cellphone    apple      n_am   P2      product:cellphone;plan:P2;brand:apple;continent:n_am
9  cellphone    apple        eu   P1        product:cellphone;plan:P1;brand:apple;continent:eu
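
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby/transform idea. It rebuilds the sample dataframe from the question and, as an assumption on my part, uses ': ' and '; ' as separators to match the spacing in the expected output. The names keys, joined and parts are only illustrative and do not come from the answer above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "product": ["laptop", "laptop", "laptop", "tv", "tv", "tv", "tv",
                "cellphone", "cellphone", "cellphone"],
    "brand": ["lg", "samsung", "apple", "lg", "samsung", "apple", "samsung",
              "lg", "apple", "apple"],
    "continent": ["n_am", "n_am", "eu", "eu", "eu", "n_am", "n_am",
                  "eu", "n_am", "eu"],
    "plan": ["P1", "P1", "P3", "P3", "P2", "P2", "P1", "P3", "P2", "P1"],
})

keys = ["product", "plan"]  # grouping columns (illustrative name)

# For each (product, plan) group, join the distinct values of each
# remaining column with '-', in order of first appearance.
joined = (
    df.groupby(keys, sort=False)[["brand", "continent"]]
      .transform(lambda s: "-".join(s.unique()))
)

# Put the key columns first so they lead the final string.
parts = pd.concat([df[keys], joined], axis=1)

# Build "column: value" pairs per row, separated by "; ".
df["all"] = [
    "; ".join(f"{col}: {val}" for col, val in row.items())
    for row in parts.to_dict("records")
]

print(df.loc[0, "all"])
# product: laptop; plan: P1; brand: lg-samsung; continent: n_am
```

Because Series.unique keeps values in order of first appearance, the tv/P2 rows come out as continent: eu-n_am rather than the n_am-eu shown in the expected dataframe. If a fixed order matters, sorting the unique values before joining is one option.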

Regarding python - Combine multiple columns of text across multiple rows in Pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68139013/
