python - How to iterate over a Pandas DataFrame and replace a string when it matches the item in another column

My df has two columns, as shown below

thePerson  theText
"the abc" "this is about the abc"
"xyz" "this is about tyu"
"wxy" "this is about abc"
"wxy" "this is about WXY"

I want the resulting df to be

thePerson  theText
"the abc" "this is about <b>the abc</b>"
"xyz" "this is about tyu"
"wxy" "this is about abc"
"wxy" "this is about <b>WXY</b>"

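Note that when theText contains the thePerson value from the same row, that matched text becomes bold.

One of the solutions I tried, without success, is: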

df['theText']=df['theText'].replace(df.thePerson,'<b>'+df.thePerson+'</b>', regex=True)

I would like to know whether this can be done with lapply or map

My Python environment is set to version 2.7

Best answer

Use re.sub and zip

import re

tt = df.theText.values.tolist()
tp = df.thePerson.str.strip('"').values.tolist()
df.assign(
    theText=[re.sub(r'({})'.format(p), r'<b>\1</b>', t, flags=re.I)
             for t, p in zip(tt, tp)]
)

  thePerson                       theText
0   the abc  this is about <b>the abc</b>
1       xyz             this is about tyu
2       wxy             this is about abc
3       wxy      this is about <b>WXY</b>
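
As for whether map or apply can do this: a row-wise apply is one option. The following is a minimal sketch and not part of the original answer. bold_person is a hypothetical helper name, and the re.escape call is an added assumption that thePerson should be matched as literal text rather than interpreted as a regular expression.

```python
import re

import pandas as pd

# Sample data taken from the question (without the surrounding quotes).
df = pd.DataFrame({
    "thePerson": ["the abc", "xyz", "wxy", "wxy"],
    "theText": [
        "this is about the abc",
        "this is about tyu",
        "this is about abc",
        "this is about WXY",
    ],
})


def bold_person(row):
    # Hypothetical helper: wrap every case-insensitive occurrence of
    # row["thePerson"] found in row["theText"] in <b>...</b>.
    # re.escape is an assumption: the name is treated as literal text.
    pattern = "({})".format(re.escape(row["thePerson"]))
    return re.sub(pattern, r"<b>\1</b>", row["theText"], flags=re.I)


# axis=1 passes each row to bold_person; the result is a Series of strings
# aligned on the original index, so assign can use it directly.
result = df.assign(theText=df.apply(bold_person, axis=1))
print(result)
```

This does the same per-row work as the zip version above. The zip version is likely faster on large frames because it avoids building a Series object for every row.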


Copy/paste
You should be able to run this exact code and get the desired result

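import re
# On Python 2.7, use `from StringIO import StringIO` instead,
# because io.StringIO only accepts unicode text there.
from io import StringIO
import pandas as pd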

txt = '''thePerson  theText
"the abc"  "this is about the abc"
"xyz"  "this is about tyu"
"wxy"  "this is about abc"
"wxy"  "this is about WXY"'''

# Raw string so the regex separator is not read as a string escape.
df = pd.read_csv(StringIO(txt), sep=r'\s{2,}', engine='python')

tt = df.theText.values.tolist()
tp = df.thePerson.str.strip('"').values.tolist()
df.assign(
    theText=[re.sub(r'({})'.format(p), r'<b>\1</b>', t, flags=re.I)
             for t, p in zip(tt, tp)]
)


You should see this

   thePerson                         theText
0  "the abc"  "this is about <b>the abc</b>"
1      "xyz"             "this is about tyu"
2      "wxy"             "this is about abc"
3      "wxy"      "this is about <b>WXY</b>"
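
One caveat about the pattern: the code above formats thePerson directly into the regular expression, so a name containing regex metacharacters would be interpreted as a pattern. The sketch below illustrates the difference using only the standard library. bold_name and its escape flag are hypothetical names introduced for illustration, and "a.c" is a made-up value that does not appear in the question's data.

```python
import re


def bold_name(text, name, escape=True):
    # Hypothetical helper: wrap case-insensitive matches of name in <b>...</b>.
    # With escape=True the name is matched literally via re.escape;
    # with escape=False it is used as a raw regex, as in the answer above.
    pattern = "({})".format(re.escape(name) if escape else name)
    return re.sub(pattern, r"<b>\1</b>", text, flags=re.I)


# Unescaped, the dot in "a.c" matches any character, so "abc" is wrapped.
unescaped = bold_name("this is about abc", "a.c", escape=False)

# Escaped, only a literal "a.c" matches, so this text is left unchanged.
escaped = bold_name("this is about abc", "a.c", escape=True)

# A literal occurrence is still found, regardless of case.
literal = bold_name("this is about A.C", "a.c")

print(unescaped)
print(escaped)
print(literal)
```

If the names in thePerson are plain words, as in the sample data, both forms give the same result.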

Regarding python - How to iterate over a Pandas DataFrame and replace a string when it matches the item in another column, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43798177/
