I am trying to match a DataFrame column against a list of qualifications (if any are present). To do this I wrote a custom function called returnhits.
def returnhits(a_list, long_string):
    matches = []
    for match in a_list:
        if any(word in long_string.split() for match in a_list):
            matches.append(match)
    return ' , '.join(matches)
qualification_list = ('Professional Certificate', 'NiTEC ', "Bachelor's Degree", 'Diploma', 'Advanced/Higher/Graduate Diploma', 'Post Graduate Diploma' , 'Professional Degree', "Master's Degree" , 'Doctorate (PhD)')
However, applying it does not produce the result I want:
df['Qualifications'] = df['Other information'].apply(lambda x : returnhits(qualification_list, x))
Ideally, if the text contains matches, it would return something like NiTEC , Professional Degree
Best answer
You can try this to check for and return multiple matches:
import pandas as pd

df = pd.DataFrame({'Other information': ['something', ' Diploma blah NiTEC', 'other Diploma']})
qualification_list = ('Professional Certificate', 'NiTEC', "Bachelor's Degree", 'Diploma', 'Advanced/Higher/Graduate Diploma', 'Post Graduate Diploma' , 'Professional Degree', "Master's Degree" , 'Doctorate (PhD)')

def returnhits(a_list, x):
    # Keep every qualification that occurs as a substring of x, in list order.
    return ' , '.join(a for a in a_list if a in x)

df['matches'] = df['Other information'].apply(lambda x: returnhits(qualification_list, x))
print(df)
Output:
     Other information          matches
0            something
1   Diploma blah NiTEC  NiTEC , Diploma
2        other Diploma          Diploma
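
One caveat with plain substring matching: because 'Diploma' is itself part of 'Post Graduate Diploma' and 'Advanced/Higher/Graduate Diploma', a text containing one of the longer names will also report the shorter one. If that is not wanted, a minimal sketch along the following lines may help. The helper name return_specific_hits is hypothetical (it is not part of the question or the answer), and the shortened qualification_list is only for illustration.

```python
def return_specific_hits(a_list, text):
    # Hypothetical helper: collect every qualification found as a substring.
    hits = [a for a in a_list if a in text]
    # Drop a hit when it is contained in another, longer hit,
    # e.g. 'Diploma' when 'Post Graduate Diploma' was also found.
    specific = [h for h in hits
                if not any(h != other and h in other for other in hits)]
    return ' , '.join(specific)

# Shortened list, for illustration only.
qualification_list = ('NiTEC', 'Diploma', 'Post Graduate Diploma')

print(return_specific_hits(qualification_list, 'NiTEC and Post Graduate Diploma'))
# NiTEC , Post Graduate Diploma
```

This trades one limitation for another: a text that mentions both a standalone 'Diploma' and a 'Post Graduate Diploma' would report only the longer name. It can be used with apply in the same way as returnhits.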
Regarding "python - unable to generate a list showing any matches from a list", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69161967/