I have two dataframes,
df1,
Names
one two three
Sri is a good player
Ravi is a mentor
Kumar is a cricketer player
df2,
values
sri
NaN
sri, is
kumar,cricketer player
I am trying to get, for each entry in df2, the row of df1 that contains all of the items in that entry.
My expected output is,
values                    Names
sri                       Sri is a good player
NaN
sri, is                   Sri is a good player
kumar,cricketer player    Kumar is a cricketer player
I tried df1["Names"].str.contains("|".join(df2["values"].values.tolist()))
I also tried,
But I could not get my expected output because the values contain commas (","). Please help.
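Best answer
Use set logic together with NumPy broadcasting: turn every string into a set of lowercase words, then test each df1 set against each df2 set with the superset operator (>=).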
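To see the set logic in isolation, here is a small sketch on a single pair of strings. The helper name `to_word_set` is made up for illustration and is not part of the answer; it only mirrors the lowercase-and-split step.

```python
import re

def to_word_set(text):
    """Hypothetical helper: lowercase the text and split it on runs of non-letters."""
    return set(re.split(r'[^a-z]+', text.lower()))

name_words = to_word_set("Sri is a good player")
query_words = to_word_set("sri, is")

# `>=` on sets is the superset test: True when every query word is in the name.
print(name_words >= query_words)                              # True
print(name_words >= to_word_set("kumar,cricketer player"))    # False
```

Because the comparison is done on whole words, the comma in "sri, is" no longer matters once the string has been split.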
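import numpy as np
import pandas as pd
# Turn each string into a set of lowercase words (NaN becomes {''}, which matches no name here)
d1 = df1['Names'].fillna('').str.lower().str.split('[^a-z]+').apply(set).values
d2 = df2['values'].fillna('').str.lower().str.split('[^a-z]+').apply(set).values
# i indexes df2 rows and j indexes df1 rows wherever the df1 set is a superset of the df2 set
i, j = np.where(d1 >= d2[:, None])
df2.assign(Names=pd.Series(df1['Names'].values[j], df2['values'].index[i]))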
                   values                        Names
0                     sri         Sri is a good player
1                     NaN                          NaN
2                 sri, is         Sri is a good player
3  kumar,cricketer player  Kumar is a cricketer player
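For anyone who wants to run this end to end, the following is a minimal sketch that rebuilds the sample frames from the question and applies the same superset test with a plain loop instead of broadcasting. The helpers `to_word_set` and `first_match` are hypothetical names, not part of the answer. Keeping only the first matching name per value is an assumption about what should happen if several names match.

```python
import re

import numpy as np
import pandas as pd

# Sample frames rebuilt from the question.
df1 = pd.DataFrame({'Names': ['one two three',
                              'Sri is a good player',
                              'Ravi is a mentor',
                              'Kumar is a cricketer player']})
df2 = pd.DataFrame({'values': ['sri', np.nan, 'sri, is',
                               'kumar,cricketer player']})

def to_word_set(text):
    """Hypothetical helper: set of lowercase words, empty for NaN."""
    if pd.isna(text):
        return set()
    return set(re.findall(r'[a-z]+', text.lower()))

# Precompute the word set of every name once.
name_sets = [to_word_set(name) for name in df1['Names']]

def first_match(value):
    """Hypothetical helper: first df1 name that contains every word of `value`."""
    words = to_word_set(value)
    if not words:  # NaN or no words at all: nothing to look up
        return np.nan
    for name, name_words in zip(df1['Names'], name_sets):
        if name_words >= words:  # superset test
            return name
    return np.nan

result = df2.assign(Names=df2['values'].map(first_match))
print(result)
```

The loop is slower than broadcasting on large frames, but it makes the handling of NaN and of multiple matches explicit.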
Regarding python - how to map rows of two different dataframes based on a condition in pandas, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49022851/