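python - Replace the whole string if it contains a substring based on a dictionary key in a pandas DataFrame

Tags: python python-3.x pandas

I am trying to replace the data in the 'Place' column with data from a dictionary I created. Each value in the 'Place' column contains one of the dictionary keys as a substring (case-insensitive). I can't get either of my approaches to work; any guidance is appreciated.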

import pandas as pd

incoming_df = pd.DataFrame({'First_Name' : ['John', 'Chris', 'renzo', 'Laura', 'Stan', 'Russ', 'Lip', 'Hick', 'Donald'],
                            'Last_Name' : ['stanford', 'lee', 'Olivares', 'Johnson', 'Stanley', 'Russaford', 'Lipper', 'Hero', 'Lipsey'],
                            'location' : ['Grant Elementary', 'Code Academy', 'Queen Prep', 'Waves College', 'duke Prep', 'california Academy', 'SF College Prep', 'San Ramon Prep', 'San Jose High']})

df = pd.DataFrame({'FirstN': [],
                        'LastN':[],
                        'Place': []})

# reindex to match the incoming data
df = df.reindex(incoming_df.index)

# copy data over to new dataframe
df['LastN'] = incoming_df.loc[:, incoming_df.columns.str.contains('Last', case=False)]
df['FirstN'] = incoming_df.loc[:, incoming_df.columns.str.contains('First', case=False)]
df['Place'] = incoming_df.loc[:, incoming_df.columns.str.contains('School|Work|Site|Location', case=False)]

places = { 'Grant' : 'DEF Grant Elementary',
                    'Code' : 'DEF Code Academy',
                    'Queen' : 'DEF Queen Preparatory High School',
                    'Waves' : 'DEF Waves College Prep',
                    'Duke' : 'DEF Duke Preparatory Institute',
                    'California' : 'DEF California Academy',
                    'SF College' : 'DEF San Francisco College',
                    'San Ramon' : 'DEF San Ramon Prep',
                    'San Jose' : 'DEF San Jose High School' }

# replace values in 'Place' with the matching dictionary values (this results in NaN values in the 'Place' column)
pat = r'({})'.format('|'.join(places.keys()))
extracted = df.Place.str.extract(pat, expand=False).dropna()
df['Place'] = extracted.apply(lambda x: places[x])

# I also tried this method, but it did not work
df['Place'] = df['Place'].replace(places)

# original df
    FirstN   LastN      Place
0   John    stanford    Grant Elementary
1   Chris   lee         Code Academy
2   renzo   Olivares    Queen Prep
3   Laura   Johnson     Waves College
4   Stan    Stanley     duke Prep
5   Russ    Russaford   california Academy
6   Lip     Lipper      SF College Prep
7   Hick    Hero        San Ramon Prep
8   Donald  Lipsey      San Jose High

# target df
    FirstN   LastN      Place
0   John    Stanford    DEF Grant Elementary
1   Chris   Lee         DEF Code Academy
2   Renzo   Olivares    DEF Queen Preparatory High School
3   Laura   Johnson     DEF Waves College Prep
4   Stan    Stanley     DEF Duke Preparatory Institute
5   Russ    Russaford   DEF California Academy
6   Lip     Lipper      DEF San Francisco College
7   Hick    Hero        DEF San Ramon Prep
8   Donald  Lipsey      DEF San Jose High School

Best answer

This loop solved my problem:

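import numpy as np

# For each key, overwrite the whole cell with the mapped value
# wherever the key occurs in 'Place' (case-insensitive match).
for k, v in places.items():
    df['Place'] = np.where(df['Place'].str.contains(k, case=False), v, df['Place'])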
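
A likely reason the two attempts in the question fell short: the extract pattern is case-sensitive, so 'duke Prep' and 'california Academy' never match 'Duke' or 'California' and end up as NaN, while Series.replace with a plain dictionary only replaces cells that equal a key exactly. As a hedged alternative to the loop, the sketch below does the lookup in one pass with a case-insensitive pattern. The shortened sample data, the 'Unknown School' row, and the names keys, lookup and matched are assumptions added for illustration; they are not from the original post.

```python
import re
import pandas as pd

# Hypothetical, shortened sample modelled on the question's data
df = pd.DataFrame({'Place': ['Grant Elementary', 'duke Prep',
                             'california Academy', 'SF College Prep',
                             'Unknown School']})
places = {'Grant': 'DEF Grant Elementary',
          'Duke': 'DEF Duke Preparatory Institute',
          'California': 'DEF California Academy',
          'SF College': 'DEF San Francisco College'}

# One alternation pattern; keys are escaped and sorted longest first
# so that a longer key wins over a shorter key that is its prefix.
keys = sorted(places, key=len, reverse=True)
pat = '(' + '|'.join(re.escape(k) for k in keys) + ')'

# Lower-cased lookup table so the matched text can be mapped
# regardless of how it was capitalised in the data.
lookup = {k.lower(): v for k, v in places.items()}

# Extract the matched key (NaN where nothing matches), map it to the
# replacement, and fall back to the original value for unmatched rows.
matched = df['Place'].str.extract(pat, flags=re.IGNORECASE, expand=False)
df['Place'] = matched.str.lower().map(lookup).fillna(df['Place'])

print(df)
```

Unlike the loop, this keeps rows with no matching key unchanged in a single step and treats the keys as literal text rather than as regular expressions.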

Regarding python - Replace the whole string if it contains a substring based on a dictionary key in a pandas DataFrame, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/51775540/
