python - Combine multiple columns of text from multiple rows in Pandas

Tags: python pandas dataframe

I want to combine multiple columns of text across groups of rows. I know that join concatenates values into a single cell, but it works row by row and does not group by product. I want a single cell to hold, for each product and plan, the brands and continents that belong to it. For example: for the laptop product, which brands and continents are in plan 1, plan 2, plan 3, and so on?

df['all'] = df[['product', 'brand', ...]].agg('-'.join, axis=1) 

Here is the dataframe:

product     brand     continent    plan
laptop         lg       n_am         P1
laptop    samsung       n_am         P1
laptop      apple         eu         P3
tv             lg         eu         P3
tv        samsung         eu         P2
tv          apple       n_am         P2
tv        samsung       n_am         P1
cellphone      lg         eu         P3
cellphone   apple       n_am         P2
cellphone   apple         eu         P1

Expected dataframe:

product     brand     continent    plan   all
laptop         lg       n_am         P1   product: laptop; plan: P1; brand: lg-samsung; continent: n_am
laptop    samsung       n_am         P1   product: laptop; plan: P1; brand: lg-samsung; continent: n_am
laptop      apple         eu         P3   product: laptop; plan: P3; brand: apple; continent: eu
tv             lg         eu         P3   product: tv;   plan: P3; brand: lg; continent: eu
tv        samsung         eu         P2   product: tv;   plan: P2; brand: samsung-apple; continent: n_am-eu
tv          apple       n_am         P2   product: tv;   plan: P2; brand: samsung-apple; continent: n_am-eu
tv        samsung       n_am         P1   product: tv;   plan: P1; brand: samsung; continent: n_am
cellphone      lg         eu         P3   product: cellphone; plan: P3; brand: lg; continent: eu
cellphone   apple       n_am         P2   product: cellphone; plan: P2; brand: apple; continent: n_am
cellphone   apple         eu         P1   product: cellphone; plan: P1; brand: apple; continent: eu

Best answer

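Use groupby with sort=False together with transform and Series.unique. Then join the result back to the product and plan columns, use a list comprehension to build the string from each record dict, and assign it to all.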

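# Collapse brand and continent to their unique values per (product, plan) group,
# then join the result back onto the key columns and convert each row to a dict.
l1 = (df[['product','plan']].join(df.groupby(['product', 'plan'], sort=False)
                                    .transform(lambda x: '-'.join(x.unique()))).to_dict('records'))
# Build "key:value" pairs for each row and join them with ";".
df['all'] = [';'.join(k+':'+v for k, v in x.items()) for x in l1]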

Out[528]:
     product    brand continent plan                                                       all
0     laptop       lg      n_am   P1    product:laptop;plan:P1;brand:lg-samsung;continent:n_am
1     laptop  samsung      n_am   P1    product:laptop;plan:P1;brand:lg-samsung;continent:n_am
2     laptop    apple        eu   P3           product:laptop;plan:P3;brand:apple;continent:eu
3         tv       lg        eu   P3                  product:tv;plan:P3;brand:lg;continent:eu
4         tv  samsung        eu   P2  product:tv;plan:P2;brand:samsung-apple;continent:eu-n_am
5         tv    apple      n_am   P2  product:tv;plan:P2;brand:samsung-apple;continent:eu-n_am
6         tv  samsung      n_am   P1           product:tv;plan:P1;brand:samsung;continent:n_am
7  cellphone       lg        eu   P3           product:cellphone;plan:P3;brand:lg;continent:eu
8  cellphone    apple      n_am   P2      product:cellphone;plan:P2;brand:apple;continent:n_am
9  cellphone    apple        eu   P1        product:cellphone;plan:P1;brand:apple;continent:eu
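
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the table in the question; the names keys, collapsed and records are only illustrative, and the "; " and ": " separators are an assumption chosen to match the spacing in the expected output rather than the exact strings used above.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "product": ["laptop", "laptop", "laptop", "tv", "tv", "tv", "tv",
                "cellphone", "cellphone", "cellphone"],
    "brand": ["lg", "samsung", "apple", "lg", "samsung", "apple", "samsung",
              "lg", "apple", "apple"],
    "continent": ["n_am", "n_am", "eu", "eu", "eu", "n_am", "n_am",
                  "eu", "n_am", "eu"],
    "plan": ["P1", "P1", "P3", "P3", "P2", "P2", "P1", "P3", "P2", "P1"],
})

keys = ["product", "plan"]  # illustrative name for the grouping columns

# For each (product, plan) group, reduce brand and continent to their unique
# values joined by "-"; transform broadcasts the result back to every row.
collapsed = df.groupby(keys, sort=False)[["brand", "continent"]].transform(
    lambda s: "-".join(s.unique())
)

# Put the key columns first, then build "name: value" pairs for each row.
records = df[keys].join(collapsed).to_dict("records")
df["all"] = ["; ".join(f"{k}: {v}" for k, v in rec.items()) for rec in records]

print(df["all"].iloc[0])
# product: laptop; plan: P1; brand: lg-samsung; continent: n_am
```

Note that Series.unique keeps values in order of first appearance, so the tv / P2 rows give continent: eu-n_am rather than the n_am-eu shown in the expected output.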

Regarding "python - Combine multiple columns of text from multiple rows in Pandas", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68139013/
