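I am trying to replace the data in the "Place" column with data from a dictionary I created. Each value in the "Place" column contains one of the dictionary keys as a substring (case-insensitive). I can't get either of my approaches to work, and I'd appreciate any guidance.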
import numpy as np
import pandas as pd
incoming_df = pd.DataFrame({'First_Name' : ['John', 'Chris', 'renzo', 'Laura', 'Stan', 'Russ', 'Lip', 'Hick', 'Donald'],
                            'Last_Name' : ['stanford', 'lee', 'Olivares', 'Johnson', 'Stanley', 'Russaford', 'Lipper', 'Hero', 'Lipsey'],
                            'location' : ['Grant Elementary', 'Code Academy', 'Queen Prep', 'Waves College', 'duke Prep', 'california Academy', 'SF College Prep', 'San Ramon Prep', 'San Jose High']})
df = pd.DataFrame({'FirstN': [],
'LastN':[],
'Place': []})
# re index based on data given
df = df.reindex(incoming_df.index)
# copy data over to new dataframe
df['LastN'] = incoming_df.loc[:, incoming_df.columns.str.contains('Last', case=False)]
df['FirstN'] = incoming_df.loc[:, incoming_df.columns.str.contains('First', case=False)]
df['Place'] = incoming_df.loc[:, incoming_df.columns.str.contains('School|Work|Site|Location', case=False)]
places = { 'Grant' : 'DEF Grant Elementary',
'Code' : 'DEF Code Academy',
'Queen' : 'DEF Queen Preparatory High School',
'Waves' : 'DEF Waves College Prep',
'Duke' : 'DEF Duke Preparatory Institute',
'California' : 'DEF California Academy',
'SF College' : 'DEF San Francisco College',
'San Ramon' : 'DEF San Ramon Prep',
'San Jose' : 'DEF San Jose High School' }
# replace values in Place with the matching dictionary values (results in NaN values in the 'Place' column)
pat = r'({})'.format('|'.join(places.keys()))
extracted = df.Place.str.extract(pat, expand=False).dropna()
df['Place'] = extracted.apply(lambda x: places[x])
# Also tried this method, but it did not work
df['Place'] = df['Place'].replace(places)
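
A minimal sketch of what the two attempts above appear to do, using a two-row stand-in for the 'Place' column. The shortened `places` dictionary and the `place` Series are illustrative only, not the original data:

```python
import pandas as pd

# Illustrative subset of the dictionary and the column (assumed for this sketch)
places = {'Grant': 'DEF Grant Elementary',
          'Duke': 'DEF Duke Preparatory Institute'}
place = pd.Series(['Grant Elementary', 'duke Prep'])

pat = r'({})'.format('|'.join(places.keys()))

# str.extract is case-sensitive by default, so 'duke Prep' does not match 'Duke'
extracted = place.str.extract(pat, expand=False)

# Series.replace with a dict compares whole cell values unless regex=True is passed,
# so a key such as 'Grant' never equals 'Grant Elementary'
unchanged = place.replace(places)

print(extracted.tolist())
print(unchanged.tolist())
```

In this sketch the lowercase row has no extracted key, which would explain the NaN values after `dropna()` and index-aligned assignment, and the `replace` call leaves both rows as they were.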
# original df
FirstN LastN Place
0 John stanford Grant Elementary
1 Chris lee Code Academy
2 renzo Olivares Queen Prep
3 Laura Johnson Waves College
4 Stan Stanley duke Prep
5 Russ Russaford california Academy
6 Lip Lipper SF College Prep
7 Hick Hero San Ramon Prep
8 Donald Lipsey San Jose High
# target df
FirstN LastN Place
0 John Stanford DEF Grant Elementary
1 Chris Lee DEF Code Academy
2 Renzo Olivares DEF Queen Preparatory High School
3 Laura Johnson DEF Waves College Prep
4 Stan Stanley DEF Duke Preparatory Institute
5 Russ Russaford DEF California Academy
6 Lip Lipper DEF San Francisco College
7 Hick Hero DEF San Ramon Prep
8 Donald Lipsey DEF San Jose High School
Best answer
Using this loop solved my problem:
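# requires: import numpy as np; `places` is the dictionary defined in the question
for k, v in places.items():
    # where 'Place' contains the key (ignoring case), use the full name; otherwise keep the current value
    df['Place'] = np.where(df['Place'].str.contains(k, case=False), v, df['Place'])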
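
For reference, here is a self-contained sketch of the same loop on a small made-up frame. The four sample rows, the shortened `places` dictionary, and the `regex=False` argument are assumptions added for illustration and are not part of the original answer:

```python
import numpy as np
import pandas as pd

# Illustrative data: three rows that should match a key, one that should not
df = pd.DataFrame({'Place': ['Grant Elementary', 'duke Prep',
                             'SF College Prep', 'Unknown School']})
places = {'Grant': 'DEF Grant Elementary',
          'Duke': 'DEF Duke Preparatory Institute',
          'SF College': 'DEF San Francisco College'}

for key, full_name in places.items():
    # regex=False treats the key as a literal substring (an added precaution)
    mask = df['Place'].str.contains(key, case=False, regex=False)
    # Rows that contain the key get the full name; all other rows are kept as-is
    df['Place'] = np.where(mask, full_name, df['Place'])

print(df['Place'].tolist())
```

One caveat with this approach: each pass checks the already-updated column, so if a replacement value happened to contain a different key, a later iteration could overwrite it. That does not occur with the dictionary in the question, but it is worth checking for other data.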
Regarding "python - Replace the entire string if it contains a substring based on a dictionary key in a pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51775540/