Python pandas: remove the part of a string after a substring

I am trying to strip the erroneous text that follows the "Model" value in the "Name" column.

import pandas as pd

df = pd.DataFrame([['ABC-12(s)', 'Some text ABC-12(s) wrong text'], ['ABC-45', 'Other text ABC-45 garbage text'], ['XYZ-LL', 'Another text XYZ-LL unneeded text']], columns=['Model', 'Name'])

       Model                               Name
0  ABC-12(s)     Some text ABC-12(s) wrong text
1     ABC-45     Other text ABC-45 garbage text
2     XYZ-LL  Another text XYZ-LL unneeded text

Expected result:

       Model                 Name
0  ABC-12(s)  Some text ABC-12(s)
1     ABC-45    Other text ABC-45
2     XYZ-LL  Another text XYZ-LL

What I tried:

df["Name"] = df["Name"].str.partition(df["Model"].to_string(), expand=False)

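But this returns the original string unchanged and raises no error, as if it cannot find the separator in the "Name" cell. It seems I am missing something very simple.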
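The behaviour described above can be reproduced in isolation. The sketch below rebuilds the sample frame and inspects the separator that the attempt passes to `str.partition`; the variable names `sep`, `found`, and `parts` are illustrative only. It suggests that `to_string()` renders the whole "Model" column, index labels included, as a single multi-line string, which would explain why no "Name" value ever contains it.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["ABC-12(s)", "Some text ABC-12(s) wrong text"],
        ["ABC-45", "Other text ABC-45 garbage text"],
        ["XYZ-LL", "Another text XYZ-LL unneeded text"],
    ],
    columns=["Model", "Name"],
)

# to_string() renders the entire column, index labels included, as ONE string.
sep = df["Model"].to_string()
print(repr(sep))

# Check whether that string occurs inside each "Name" value.
found = [sep in name for name in df["Name"]]
print(found)

# partition() with a separator that is never found puts the whole value
# in the first slot and leaves the separator and tail slots empty.
parts = df["Name"].str.partition(sep, expand=False)
print(parts.iloc[0])
```

Because `str.partition` takes one separator for the whole column, a per-row separator has to be applied row by row instead.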

Best answer

Another solution, using re:

import re

df["Name"] = df.apply(
    lambda x: re.split(r"(?<=" + re.escape(x["Model"]) + r")\s*", x["Name"])[0],
    axis=1,
)
print(df)

Prints:

       Model                 Name
0  ABC-12(s)  Some text ABC-12(s)
1     ABC-45    Other text ABC-45
2     XYZ-LL  Another text XYZ-LL
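For comparison, here is a minimal sketch that reaches the same result without regular expressions, by calling Python's built-in `str.partition` once per row with that row's own "Model" value. The helper name `trim_after` is hypothetical and not part of the answer above; the sketch also assumes that rows where the model is missing from the name should be left unchanged.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["ABC-12(s)", "Some text ABC-12(s) wrong text"],
        ["ABC-45", "Other text ABC-45 garbage text"],
        ["XYZ-LL", "Another text XYZ-LL unneeded text"],
    ],
    columns=["Model", "Name"],
)


def trim_after(name: str, model: str) -> str:
    """Keep everything up to and including the first occurrence of model."""
    head, sep, _tail = name.partition(model)
    # If model is absent, sep is "" and head is the whole, unchanged name.
    return head + sep


# Pair each name with the model from the same row.
df["Name"] = [trim_after(n, m) for n, m in zip(df["Name"], df["Model"])]
print(df)
```

Since the model is treated as a literal string here, characters such as the parentheses in `ABC-12(s)` need no escaping, whereas the regex version relies on `re.escape` for that.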

The original question on this topic, removing the part of a string after a substring with Python pandas, can be found on Stack Overflow: https://stackoverflow.com/questions/68717925/
