I have a DataFrame like this:
customerid mydate
123 2016-08-15 18:22:40
234 2017-08-15 42:34.04
234 39:35.01
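The mydate column is mixed: some values include a full date with the year, and some contain only a time. The column has object dtype, but I want to convert it to datetime, like this: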
df['mydate'] = pd.to_datetime(df['mydate'], format='%Y-%m-%d %H:%M:%S')
But I get the following error:
ValueError: time data 39:35.01 doesn't match format specified
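To see which rows trigger the error, one option is to parse with errors='coerce' and list the values that come back as NaT. This is only a diagnostic sketch: the DataFrame below is rebuilt from the three sample rows above, and the names parsed and bad_rows are introduced here for illustration.

```python
import pandas as pd

# Sample data rebuilt from the three rows shown in the question
df = pd.DataFrame({
    "customerid": [123, 234, 234],
    "mydate": ["2016-08-15 18:22:40", "2017-08-15 42:34.04", "39:35.01"],
})

# errors="coerce" turns values that do not match the format into NaT
# instead of raising ValueError
parsed = pd.to_datetime(
    df["mydate"], format="%Y-%m-%d %H:%M:%S", errors="coerce"
)

# Collect the original strings that could not be parsed
bad_rows = df.loc[parsed.isna(), "mydate"].tolist()
print(bad_rows)
```

Only the first row matches the format; the other two carry a minutes:seconds value with no hours, so a single format string cannot cover the whole column.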
Best answer
You can use:
import pandas as pd

# today's date at midnight, used for rows that have no date part
today = pd.Timestamp.today().floor('D')
# split each value on whitespace into [date, time] or just [time]
s = df['mydate'].str.split()
# parse the first token as a date; tokens that are not dates become NaT and are replaced by today
df['date'] = pd.to_datetime(s.str[0], errors='coerce').fillna(today)
# take the second token as the time; where it is missing, fall back to the first token
s1 = s.str[1].combine_first(s.str[0])
# if the third character from the end is '.', the value is MM:SS.ff, so prepend zero hours, then convert to timedeltas
df['td'] = pd.to_timedelta(s1.mask(s1.str[-3] == '.', '00:' + s1))
# add the timedelta to the date
df['datefinal'] = df['date'] + df['td']
print(df)
   customerid               mydate       date              td  \
0         123  2016-08-15 18:22:40 2016-08-15        18:22:40
1         234  2017-08-15 42:34.04 2017-08-15 00:42:34.040000
2         234             39:35.01 2018-11-21 00:39:35.010000

                datefinal
0 2016-08-15 18:22:40.000
1 2017-08-15 00:42:34.040
2 2018-11-21 00:39:35.010
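The same steps can be wrapped in a small function with the fallback date passed in explicitly, which makes the result reproducible instead of depending on the day the code runs. This is a sketch under assumptions, not part of the original answer: parse_mixed and default_date are hypothetical names, and the minutes:seconds check counts colons rather than testing the third character from the end, so it does not depend on the number of fractional digits.

```python
import pandas as pd


def parse_mixed(values: pd.Series, default_date: pd.Timestamp) -> pd.Series:
    """Combine an optional date token and a time token into one datetime.

    Hypothetical helper; assumes each value is 'DATE TIME' or just 'TIME'.
    """
    parts = values.str.split()
    # Date part: tokens that are not dates become NaT, then get default_date
    dates = pd.to_datetime(parts.str[0], errors="coerce").fillna(default_date)
    # Time part: second token if present, otherwise the only token
    times = parts.str[1].combine_first(parts.str[0])
    # A single colon means minutes:seconds, so prepend zero hours
    times = times.mask(times.str.count(":") == 1, "00:" + times)
    return dates + pd.to_timedelta(times)


# Sample data rebuilt from the question
df = pd.DataFrame({
    "customerid": [123, 234, 234],
    "mydate": ["2016-08-15 18:22:40", "2017-08-15 42:34.04", "39:35.01"],
})

# Fixed fallback date, matching the date shown in the output above
result = parse_mixed(df["mydate"], pd.Timestamp("2018-11-21"))
print(result)
```

Passing pd.Timestamp.today().floor('D') as default_date gives the same behaviour as the answer's code.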
This question, "python - converting datetime minutes and years in Python", comes from Stack Overflow: https://stackoverflow.com/questions/53406173/