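In a dataframe, I want to filter and keep a row if the string in the labels column is found within the string in the sentence column of the same row:
Input dataframe (some rows are empty):
Output dataframe:
Best answer
How about this?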
from io import StringIO
import pandas as pd
s = """sentence\tlabels
A met B\tA
C is dead\tX
D\t
D went to London and Berlin\tLondon and Berlin
E is sleeping\t"""
df = pd.read_csv(StringIO(s), sep="\t")
print(df)
                      sentence             labels
0                      A met B                  A
1                    C is dead                  X
2                            D                NaN
3  D went to London and Berlin  London and Berlin
4                E is sleeping                NaN
Assuming NaN values should be treated as empty strings...
out = df.loc[
    # fill missing values with empty strings to avoid a TypeError
    df.fillna("")
    # check row by row whether labels is a substring of sentence,
    # while also making sure labels is not empty
    .apply(
        lambda r: r["labels"] in r["sentence"] and bool(r["labels"]),
        axis="columns",
    )
]
print(out)
                      sentence             labels
0                      A met B                  A
3  D went to London and Berlin  London and Berlin
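If the row-wise apply feels heavy, the same check can probably be written as a plain list comprehension over the two columns. This is only a sketch, not part of the original answer: the helper name label_in_sentence is made up for illustration, and the sample frame is rebuilt by hand with None standing in for the empty cells. Like the answer above, it does plain substring matching, so a label "A" would also match a sentence such as "Anna met B".

```python
import pandas as pd

# Same sample data as in the answer, built directly instead of parsed from text.
# None marks the empty label cells (an assumption about how they are stored).
df = pd.DataFrame(
    {
        "sentence": [
            "A met B",
            "C is dead",
            "D",
            "D went to London and Berlin",
            "E is sleeping",
        ],
        "labels": ["A", "X", None, "London and Berlin", None],
    }
)


def label_in_sentence(sentence, label):
    """Hypothetical helper: True only if a non-empty label occurs in sentence."""
    # Missing values (None or NaN) are not strings, so they never match.
    if not isinstance(sentence, str) or not isinstance(label, str):
        return False
    return bool(label) and label in sentence


# Build a boolean mask pairwise over both columns, then select the matching rows.
mask = [label_in_sentence(s, l) for s, l in zip(df["sentence"], df["labels"])]
out = df.loc[mask]
print(out)
```

On the sample data this keeps rows 0 and 3, the same rows as the apply version. It avoids fillna because the helper skips non-string values itself.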
Regarding "Python: how to filter a dataframe where a substring in column A is found in a string in column B?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/76148123/