python - Mapping keywords with a dataframe column using pandas in Python

Tags: python pandas dataframe data-analysis

I have a dataframe,

DF,
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player
ganesh  1       good driver

and a list,

my_list=["one","driver"]

I tried,

names = df.loc[df["Description"].str.contains("|".join(my_list), na=False), 'Name']

which achieves everything except the keyvalue column.

 output_DF.
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
Ram     1       Ram is one of the good cricket player

My desired output is,
desired_DF,
Name    Stage   Description                                 keyvalue
Sri     1       Sri is one of the good singer in this two    one
        2       Thanks for reading                           
Ram     1       Ram is one of the good cricket player        one
ganesh  1       good driver                                  driver

Can anyone help me generate the keyvalue column?

Best answer

I think you can use the previous solution from here and then add str.extract:

pat = "|".join(my_list)

# extract the first keyword matched in each description; rows without a match get ''
df['keyvalue'] = df['Description'].str.extract("(" + pat + ")", expand=False).fillna('')
print(df)
     Name  Stage                                Description keyvalue
0     Sri      1  Sri is one of the good singer in this two      one
1     Sri      2                         Thanks for reading         
2     Ram      1      Ram is one of the good cricket player      one
3  ganesh      1                                good driver   driver
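
If the keywords might contain regex metacharacters, or should only match whole words, the pattern can be built a little more defensively. The following is a minimal sketch, not part of the original answer: it rebuilds the sample frame by hand (with Name already filled in, as in the output above), escapes each keyword with re.escape, and wraps the alternation in \b word boundaries. The word boundaries are an assumption about the desired matching; drop them to match substrings as in the pattern above.

```python
import re

import pandas as pd

# Sample data reconstructed from the question (Name already filled in for row 1)
df = pd.DataFrame({
    "Name": ["Sri", "Sri", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})
my_list = ["one", "driver"]

# Escape each keyword so characters such as '.' or '+' are matched literally,
# and require word boundaries so 'one' does not match inside 'money'.
pat = r"\b(" + "|".join(re.escape(k) for k in my_list) + r")\b"

# str.extract returns the first capture group; rows without a match give NaN,
# which fillna('') turns into an empty string
df["keyvalue"] = df["Description"].str.extract(pat, expand=False).fillna("")
print(df)
```

With this sample data the keyvalue column matches the desired output in the question.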

All together:

print(df)
#     Name  Stage                                Description
#0     Sri      1  Sri is one of the good singer in this two
#1              2                         Thanks for reading
#2     Ram      1      Ram is one of the good cricket player
#3  ganesh      1                            good Driver one

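my_list = ["ONE", "driver"]

# blank names belong to the previous row: turn them into NaN and forward-fill
df['Name'] = df['Name'].mask(df['Name'].str.strip() == '').ffill()

# lower-case both the pattern and the text so matching ignores case
pat = "|".join(my_list).lower()

# names with at least one matching description
names = df.loc[df["Description"].str.lower().str.contains(pat, na=False), 'Name']

# keep every row of those names; .copy() avoids a SettingWithCopyWarning below
df = df[df['Name'].isin(names)].copy()

df['keyvalue'] = (df['Description'].str.lower()
                                   .str.extract("(" + pat + ")", expand=False)
                                   .fillna(''))
print(df)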
#     Name  Stage                                Description keyvalue
#0     Sri      1  Sri is one of the good singer in this two      one
#1     Sri      2                         Thanks for reading         
#2     Ram      1      Ram is one of the good cricket player      one
#3  ganesh      1                            good Driver one   driver
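
Note that str.extract keeps only the first match per row, so the last row gets driver even though its description also contains one. If every matched keyword is wanted, a variation along the following lines may help. This is a sketch under assumptions rather than the answer's method: it swaps str.extract for str.findall, uses flags=re.IGNORECASE instead of the explicit str.lower() calls, joins multiple hits with a comma (an arbitrary choice), and assumes there are no missing descriptions.

```python
import re

import pandas as pd

# Sample data reconstructed from the printed frame above
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good Driver one",
    ],
})
my_list = ["ONE", "driver"]

# Escape the keywords and join them into one alternation
pat = "|".join(re.escape(k) for k in my_list)

# findall returns a list with every match in each row;
# re.IGNORECASE makes 'ONE' match 'one' and 'driver' match 'Driver'
matches = df["Description"].str.findall(pat, flags=re.IGNORECASE)

# Normalise case and join multiple hits; an empty list becomes ''
df["keyvalue"] = matches.apply(lambda hits: ", ".join(h.lower() for h in hits))
print(df)
```

Here the last row carries both keywords in the order they appear in the text, while rows without a match get an empty string.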

Regarding python - Mapping keywords with a dataframe column using pandas in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46686960/
