Background
I have the following sample df
import pandas as pd
df = pd.DataFrame({'Birthdate':['This person was born Date of Birth: 5/6/1950 and other',
'no Date of Birth: nothing here',
'One Date of Birth: 01/01/2001 last here'],
'P_ID': [1,2,3],
'N_ID' : ['A1', 'A2', 'A3']}
)
df
Birthdate N_ID P_ID
0 This person was born Date of Birth: 5/6/1950 a... A1 1
1 no Date of Birth: nothing here A2 2
2 One Date of Birth: 01/01/2001 last here A3 3
Goal
Replace the leading digits of the birthdate (the month and day) with *BDAY*, e.g. 5/6/1950 becomes *BDAY*1950
Desired output
Birthdate N_ID P_ID
0 This person was born Date of Birth: *BDAY*1950 a... A1 1
1 no Date of Birth: nothing here A2 2
2 One Date of Birth: *BDAY*2001 last here A3 3
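What I tried
Based on "python - Replace first five characters in a column with asterisks", I tried the following code:
df.replace(r'Date of Birth: ^\d{3}-\d{2}', "*BDAY*", regex=True)
But it doesn't quite give me the output I want
Question
How do I get my desired output?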
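Best answer
Try this (regex=True is passed explicitly because newer pandas versions treat the pattern as a literal string by default):
df['Birthdate'] = df.Birthdate.str.replace(r'[0-9]?[0-9]/[0-9]?[0-9]/', '*BDAY*', regex=True)
df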
Out[273]:
Birthdate P_ID N_ID
0 This person was born Date of Birth: *BDAY*1950... 1 A1
1 no Date of Birth: nothing here 2 A2
2 One Date of Birth: *BDAY*2001 last here 3 A3
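The attempt in the question never matches because `^` sits in the middle of the pattern and the pattern looks for hyphens, while the dates use slashes. If you also want to avoid touching other slash-separated numbers in the text, here is a slightly stricter sketch. It assumes every birthdate directly follows the literal label "Date of Birth: " and is written as M/D/YYYY or MM/DD/YYYY; the variable name `pattern` is mine, not from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Birthdate': ['This person was born Date of Birth: 5/6/1950 and other',
                  'no Date of Birth: nothing here',
                  'One Date of Birth: 01/01/2001 last here'],
    'P_ID': [1, 2, 3],
    'N_ID': ['A1', 'A2', 'A3'],
})

# Capture the label so it can be written back, then match "M/D/" or "MM/DD/".
# Assumption: the date always comes right after "Date of Birth: ".
pattern = r'(Date of Birth: )\d{1,2}/\d{1,2}/'

# \1 restores the captured label; the month/day part is replaced by *BDAY*.
df['Birthdate'] = df['Birthdate'].str.replace(pattern, r'\1*BDAY*', regex=True)

print(df['Birthdate'].tolist())
```

Rows without a date after the label, like the second one, are left unchanged, and the year stays in place because the pattern stops at the second slash.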
Regarding "regex - Replace the first few digits of a birthdate in pandas", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57121057/