I have a column: line1['daytime']
The column looks like this:
2018-02-07 17:40:29
2018-02-07 17:41:15
2018-02-07 17:41:55
2018-02-07 17:42:54
2018-02-07 17:43:44
2018-02-07 18:02:54
2018-02-07 18:03:44
Name: daytime, Length: 174859, dtype: datetime64[ns]
I want to get:
2018-02-07 17:00:00
2018-02-07 17:00:00
2018-02-07 17:00:00
2018-02-07 17:00:00
2018-02-07 17:00:00
2018-02-07 18:00:00
2018-02-07 18:00:00
I want to change the entire column.
Best answer
Use astype to cast the column to a NumPy datetime64 with hour resolution:
df.daytime.astype('datetime64[h]')
# dates
# 0 2018-02-07 17:00:00
# 1 2018-02-07 17:00:00
# 2 2018-02-07 17:00:00
# 3 2018-02-07 17:00:00
# 4 2018-02-07 17:00:00
# 5 2018-02-07 18:00:00
# 6 2018-02-07 18:00:00
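If the hour-unit cast is rejected (I believe pandas 2.0 and later only accept second, millisecond, microsecond, and nanosecond units for datetime64, so treat that as an assumption to check against your version), `dt.floor` gives the same truncation. A minimal sketch using three of the sample values from the question; assigning back to `df["daytime"]` is my reading of "change the entire column", and the lowercase `"h"` alias is the spelling recent pandas prefers over `'H'`.

```python
import pandas as pd

# Three of the timestamps from the question, parsed to datetime64[ns].
df = pd.DataFrame({
    "daytime": pd.to_datetime([
        "2018-02-07 17:40:29",
        "2018-02-07 17:41:15",
        "2018-02-07 18:02:54",
    ])
})

# Floor each timestamp to the start of its hour, so minutes and seconds become zero,
# and overwrite the original column with the result.
df["daytime"] = df["daytime"].dt.floor("h")

print(df["daytime"].tolist())
```

The column keeps its datetime dtype after flooring, so later `.dt` accessors and comparisons continue to work on it.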
Some speed comparisons between the proposed solutions:
datetime = pd.date_range(start='2020-01-01', freq='200S', periods=100)
df = pd.DataFrame(dict(daytime=datetime))
%%timeit
df.daytime.dt.to_period('H')
# 826 µs ± 355 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
x = df.daytime.dt.floor('H')
# 774 µs ± 247 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
df.daytime.astype('datetime64[h]')
# 190 µs ± 12.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
With 1k records:
datetime = pd.date_range(start='2020-01-01', freq='200S', periods=1000)
df = pd.DataFrame(dict(daytime=datetime))
%%timeit
df.daytime.dt.to_period('H')
# 991 µs ± 312 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
x = df.daytime.dt.floor('H')
# 825 µs ± 203 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
df.daytime.astype('datetime64[h]')
# 237 µs ± 8.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
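The `%%timeit` cells above only run inside IPython or Jupyter. As a rough sketch of how the same comparison could be reproduced in a plain script, the standard-library `timeit` module can be used instead. The helper names `make_frame` and `time_call` are hypothetical, the repeat counts are arbitrary, and the `astype` variant is left out here because it may raise on newer pandas; your numbers will differ from the ones quoted above.

```python
import timeit

import pandas as pd


def make_frame(periods):
    # Same kind of data as the benchmark above: one timestamp every 200 seconds.
    stamps = pd.date_range(start="2020-01-01", freq="200s", periods=periods)
    return pd.DataFrame({"daytime": stamps})


def time_call(func, repeat=5, number=200):
    # Best of `repeat` runs, reported as average microseconds per call.
    best = min(timeit.repeat(func, repeat=repeat, number=number))
    return best / number * 1e6


df = make_frame(1000)

results = {
    # Note: to_period returns a Period dtype, not datetime64.
    "to_period": time_call(lambda: df["daytime"].dt.to_period("h")),
    "floor": time_call(lambda: df["daytime"].dt.floor("h")),
}

for name, micros in results.items():
    print(f"{name}: {micros:.1f} µs per call")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; `%%timeit` reports mean and standard deviation instead, so the two are not directly comparable.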
Source: the Stack Overflow question "python - Pandas: force 'minute' and 'seconds' to zero": https://stackoverflow.com/questions/62543350/