python - Remove words that appear in another column, Pandas

Tags: python string replace pandas dataframe

How can I remove, from the strings in one column, the words that appear in another column?

For example:

Sr       A              B                            C
1      jack        jack and jill                 and jill
2      run         you should run,               you should ,
3      fly         you shouldnt fly,there        you shouldnt ,there

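As you can see, I want column C to be the content of B minus the content of A. Note the third example, where fly is followed by a comma, so punctuation must be handled too (in case the code would otherwise rely on detecting surrounding whitespace).
Column A can also contain two words, and these need to be removed as well.
I need a Pandas expression along the lines of: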

df.apply(lambda x: x["C"].replace(r"\b"+x["A"]+r"\b", "").strip(), axis=1)

Best answer

How does this look?

In [24]: df
Out[24]: 
   Sr     A                       B
0   1  jack           jack and jill
1   2   run         you should run,
2   3   fly  you shouldnt fly,there

[3 rows x 3 columns]

In [25]: df.apply(lambda row: row.B.strip(row.A), axis=1)
Out[25]: 
0                 and jill
1          you should run,
2    ou shouldnt fly,there
dtype: object
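
One caveat about the session above: `str.strip(chars)` treats its argument as a set of characters to trim from both ends of the string, not as a word to delete, which is why rows 1 and 2 do not match the desired column C. A minimal sketch of a regex-based alternative follows. It assumes that every whitespace-separated word in A should be removed independently (the question does not say whether two words form a phrase), and the helper name `remove_words` is hypothetical.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Sr": [1, 2, 3],
    "A": ["jack", "run", "fly"],
    "B": ["jack and jill", "you should run,", "you shouldnt fly,there"],
})


def remove_words(row):
    """Delete every whole word listed in row["A"] from row["B"]."""
    # re.escape keeps regex metacharacters in the data from being interpreted.
    words = [re.escape(w) for w in row["A"].split()]
    if not words:
        return row["B"].strip()
    # \b anchors the match at word boundaries, so "fly," still matches
    # while a longer word that merely contains "fly" would not.
    pattern = r"\b(?:" + "|".join(words) + r")\b"
    return re.sub(pattern, "", row["B"]).strip()


# Assumption: A and B contain only strings (no NaN values).
df["C"] = df.apply(remove_words, axis=1)
print(df)
```

With the sample data this yields `and jill`, `you should ,` and `you shouldnt ,there`, matching column C in the question. Because `\b` only matches next to a word character, entries in A that begin or end with punctuation would need a different boundary condition.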

Regarding python - Remove words that appear in another column, Pandas, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22713441/
