python-3.x - Convert a list of lists to strings in a pandas DataFrame

Tags: python-3.x pandas dataframe nlp nested-lists


I have the following toy df, which contains lists in the Before and After columns, as shown below:

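import pandas as pd

# Each row holds a list of words (values taken from the output shown below)
before = [['in', 'the', 'bright', 'blue', 'box'],
          ['because', 'they', 'go', 'really', 'fast'],
          ['to', 'ride', 'and', 'have', 'fun']]
after = [['there', 'are', 'many', 'different'],
         ['i', 'like', 'a', 'lot', 'of', 'sports'],
         ['the', 'middle', 'east', 'has', 'many']]

df = pd.DataFrame({'Before': before,
                   'After': after,
                   'P_ID': [1, 2, 3],
                   'Word': ['crayons', 'cars', 'camels'],
                   'N_ID': ['A1', 'A2', 'A3']})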

   Before                             After                           N_ID  P_ID  Word
0  [in, the, bright, blue, box]       [there, are, many, different]   A1    1     crayons
1  [because, they, go, really, fast]  [i, like, a, lot, of, sports]   A2    2     cars
2  [to, ride, and, have, fun]         [the, middle, east, has, many]  A3    3     camels


The following line, taken from "Removing commas and unlisting a dataframe", produces the output below:

df.loc[:, ['After', 'Before']] = df[['After', 'Before']].apply(lambda x: x.str[0].str.replace(',', ''))

Close to what I want, but not quite. Output:
   Before   After  N_ID  P_ID  Word
0  in       there  A1    1     crayons
1  because  i      A2    2     cars
2  to       the    A3    3     camels

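This output is close, but it is not what I want: the After and Before columns each contain only a single word (e.g. there), whereas my desired output looks like this: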

   Before                       After                     N_ID  P_ID  Word
0  in the bright blue box       there are many different  A1    1     crayons
1  because they go really fast  i like a lot of sports    A2    2     cars
2  to ride and have fun         the middle east has many  A3    3     camels


How do I get my desired output?


agg + join. The commas are not in your lists; they are only part of the lists' __repr__.
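To see that the commas exist only in the printed form, here is a minimal sketch. The one-row Series `s` is an illustrative stand-in for a single cell of the question's data, not part of the original post. It also shows why the `.str[0]` attempt in the question returns a single word: it indexes into each list and takes the first element.

```python
import pandas as pd

# Illustrative one-element Series holding a list of words, like one cell in the question.
s = pd.Series([['in', 'the', 'bright', 'blue', 'box']])

words = s.iloc[0]
shown = repr(words)                           # display form of the list
has_comma = any(',' in w for w in words)      # do any elements contain a comma?

print(shown)       # commas appear here, as separators in the display form only
print(has_comma)   # no element of the list contains a comma

# .str[0] picks the first element of each list, so only one word survives.
first = s.str[0]
print(first.iloc[0])
```

Since there are no commas to strip, the task reduces to joining each list's elements with a space.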

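str_cols = ['Before', 'After']

# Map each list column to ' '.join, which turns a list of words into one string
d = {k: ' '.join for k in str_cols}

# Join the lists, then reattach the remaining columns
df.agg(d).join(df.drop(columns=str_cols))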
                        Before                     After  P_ID     Word N_ID
0       in the bright blue box  there are many different     1  crayons   A1
1  because they go really fast    i like a lot of sports     2     cars   A2
2         to ride and have fun  the middle east has many     3   camels   A3

# To overwrite the list columns in place instead:
df[str_cols] = df.agg(d)
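As an alternative sketch (not the approach used in the answer above), `Series.str.join` joins the elements of each list cell directly. It is an element-wise string method, so it does not rely on how `agg` dispatches a callable, which may vary between pandas versions. The two-row frame below is a shortened, illustrative copy of the question's data.

```python
import pandas as pd

# Shortened, illustrative copy of the question's frame (two rows, three columns).
df = pd.DataFrame({
    'Before': [['in', 'the', 'bright', 'blue', 'box'],
               ['because', 'they', 'go', 'really', 'fast']],
    'After': [['there', 'are', 'many', 'different'],
              ['i', 'like', 'a', 'lot', 'of', 'sports']],
    'P_ID': [1, 2],
})

str_cols = ['Before', 'After']

# Join each list into one space-separated string, one column at a time.
for col in str_cols:
    df[col] = df[col].str.join(' ')

print(df)
```

The loop overwrites the list columns in place and leaves the other columns untouched, so no separate join step is needed.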
