My df has two columns, as shown below:
thePerson  theText
"the abc"  "this is about the abc"
"xyz"      "this is about tyu"
"wxy"      "this is about abc"
"wxy"      "this is about WXY"
I want a result df like this:
thePerson  theText
"the abc"  "this is about <b>the abc</b>"
"xyz"      "this is about tyu"
"wxy"      "this is about abc"
"wxy"      "this is about <b>WXY</b>"
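Note that when theText contains the thePerson value from the same row, that part of the text becomes bold. The match is case-insensitive: "wxy" matches "WXY".
One of the solutions I tried, which did not work, is: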
df['theText']=df['theText'].replace(df.thePerson,'<b>'+df.thePerson+'</b>', regex=True)
I wonder whether this can be done with lapply or map.
My Python environment is set to version 2.7.
Best answer
Use re.sub and zip:
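import re

# Pair each text with the name from the same row.
tt = df.theText.values.tolist()
tp = df.thePerson.str.strip('"').values.tolist()

# Wrap each case-insensitive match of the row's name in <b>...</b>.
df.assign(
    theText=[re.sub(r'({})'.format(p), r'<b>\1</b>', t, flags=re.I)
             for t, p in zip(tt, tp)]
)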
   thePerson                       theText
0    the abc  this is about <b>the abc</b>
1        xyz             this is about tyu
2        wxy             this is about abc
3        wxy      this is about <b>WXY</b>
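The question also asks about a map-style approach. The following is a minimal sketch of a row-wise alternative using `DataFrame.apply` with `axis=1`, assuming the quotes have already been stripped from the data. The helper name `bold_person` is hypothetical, and the `re.escape` call is an addition that is not part of the answer above; it only matters if a name contains regex metacharacters.

```python
import re

import pandas as pd

# Sample data from the question, without the surrounding quotes.
df = pd.DataFrame({
    "thePerson": ["the abc", "xyz", "wxy", "wxy"],
    "theText": [
        "this is about the abc",
        "this is about tyu",
        "this is about abc",
        "this is about WXY",
    ],
})


def bold_person(row):
    # Hypothetical helper: wrap case-insensitive matches of this row's
    # thePerson in <b>...</b>. re.escape keeps characters such as "." or "("
    # in a name from being interpreted as regex syntax.
    pattern = "({})".format(re.escape(row["thePerson"]))
    return re.sub(pattern, r"<b>\1</b>", row["theText"], flags=re.I)


# apply(axis=1) calls bold_person once per row and returns a Series.
result = df.assign(theText=df.apply(bold_person, axis=1))
print(result)
```

For large frames the list comprehension over `zip` is likely to be faster than `apply(axis=1)`, since `apply` builds a Series object for every row.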
Copy/paste
You should be able to run this exact code and get the desired result:
import re
from io import StringIO

import pandas as pd

# Columns are separated by two or more spaces, which is what sep matches.
txt = '''thePerson  theText
"the abc"  "this is about the abc"
"xyz"      "this is about tyu"
"wxy"      "this is about abc"
"wxy"      "this is about WXY"'''

df = pd.read_csv(StringIO(txt), sep=r'\s{2,}', engine='python')

tt = df.theText.values.tolist()
# The quotes are part of the parsed values, so strip them from the names.
tp = df.thePerson.str.strip('"').values.tolist()

df.assign(
    theText=[re.sub(r'({})'.format(p), r'<b>\1</b>', t, flags=re.I)
             for t, p in zip(tt, tp)]
)
You should see this:
   thePerson                         theText
0  "the abc"  "this is about <b>the abc</b>"
1      "xyz"             "this is about tyu"
2      "wxy"             "this is about abc"
3      "wxy"      "this is about <b>WXY</b>"
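Since the question mentions Python 2.7: on Python 2, `io.StringIO` accepts only unicode text, so passing a plain `str` literal raises a `TypeError` there. A minimal sketch of one way around this, assuming the same sample data, is to mark the literal with a `u` prefix, which is valid on Python 2.7 and on Python 3.3+. The variable name `out` is illustrative only.

```python
import re
from io import StringIO

import pandas as pd

# The u prefix makes this a unicode literal on Python 2.7;
# on Python 3.3+ it is accepted and has no effect.
txt = u'''thePerson  theText
"the abc"  "this is about the abc"
"xyz"      "this is about tyu"
"wxy"      "this is about abc"
"wxy"      "this is about WXY"'''

# Columns are separated by two or more whitespace characters.
df = pd.read_csv(StringIO(txt), sep=r'\s{2,}', engine='python')

tt = df.theText.values.tolist()
tp = df.thePerson.str.strip('"').values.tolist()

# Same substitution as in the answer, applied pairwise per row.
out = df.assign(
    theText=[re.sub(r'({})'.format(p), r'<b>\1</b>', t, flags=re.I)
             for t, p in zip(tt, tp)]
)
print(out)
```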
Regarding python - How to iterate over a Pandas DataFrame and replace a string when an item in another column matches, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43798177/