The pandas DataFrame column duration contains timedelta64[ns] values, as shown below. How can I convert them to seconds?
0 00:20:32
1 00:23:10
2 00:24:55
3 00:13:17
4 00:18:52
Name: duration, dtype: timedelta64[ns]
I tried the following:
print df[:5]['duration'] / np.timedelta64(1, 's')
but it raises an error:
Traceback (most recent call last):
File "test.py", line 16, in <module>
print df[0:5]['duration'] / np.timedelta64(1, 's')
File "C:\Python27\lib\site-packages\pandas\core\series.py", line 130, in wrapper
"addition and subtraction, but the operator [%s] was passed" % name)
TypeError: can only operate on a timedeltas for addition and subtraction, but the operator [__div__] was passed
I also tried:
print df[:5]['duration'].astype('timedelta64[s]')
but got this error:
Traceback (most recent call last):
File "test.py", line 17, in <module>
print df[:5]['duration'].astype('timedelta64[s]')
File "C:\Python27\lib\site-packages\pandas\core\series.py", line 934, in astype
values = com._astype_nansafe(self.values, dtype)
File "C:\Python27\lib\site-packages\pandas\core\common.py", line 1653, in _astype_nansafe
raise TypeError("cannot astype a timedelta from [%s] to [%s]" % (arr.dtype,dtype))
TypeError: cannot astype a timedelta from [timedelta64[ns]] to [timedelta64[s]]
Best answer
This works as expected in the current version of pandas (version 0.14):
In [132]: df[:5]['duration'] / np.timedelta64(1, 's')
Out[132]:
0 1232
1 1390
2 1495
3 797
4 1132
Name: duration, dtype: float64
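As a minimal sketch of the same conversion on a recent pandas under Python 3: the DataFrame below is hypothetical sample data rebuilt from the five durations in the question, and the `.dt.total_seconds()` accessor is not part of the original answer (it only exists in pandas releases newer than 0.14), so treat it as an assumed alternative rather than the answer's method.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the durations shown in the question.
df = pd.DataFrame({
    "duration": pd.to_timedelta(
        ["00:20:32", "00:23:10", "00:24:55", "00:13:17", "00:18:52"]
    )
})

# Dividing by a one-second timedelta yields the durations as float seconds.
seconds_div = df["duration"] / np.timedelta64(1, "s")

# Assumed alternative on newer pandas: the .dt accessor gives the same values.
seconds_dt = df["duration"].dt.total_seconds()

print(seconds_div.tolist())
print(seconds_dt.tolist())
```

Both expressions return a float64 Series, so fractional seconds are preserved if the data has sub-second precision.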
Here is a workaround for older versions of pandas/NumPy:
In [131]: df[:5]['duration'].values.view('<i8')/10**9
Out[131]: array([1232, 1390, 1495, 797, 1132], dtype=int64)
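timedelta64 and datetime64 data are stored internally as 8-byte integers (dtype '<i8'). So the above views the timedelta64s as 8-byte integers and then performs integer division to convert nanoseconds to seconds.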
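The reinterpretation described above can be sketched in plain NumPy. The two-element array is made-up illustration data, and the code assumes Python 3, where `/` is true division, so `//` is used to keep the integer division that the Python 2 session relied on.

```python
import numpy as np

# Illustrative data only: two durations stored with nanosecond resolution.
durations = np.array([1232, 797], dtype="timedelta64[s]").astype("timedelta64[ns]")

# Reinterpret the same 8 bytes per element as int64 nanosecond counts (no copy).
raw_ns = durations.view("<i8")

# Integer division converts nanoseconds to whole seconds.
seconds = raw_ns // 10**9

print(raw_ns)
print(seconds)
```

One caveat with this approach: a missing value (NaT) is stored as the smallest int64, so it would come out as a large negative number instead of NaN, and any sub-second part is truncated.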
Note that you need NumPy version 1.7 or newer to work with datetime64/timedelta64s.
Source: "Convert timedelta64[ns] column to seconds in Python Pandas DataFrame" on Stack Overflow: https://stackoverflow.com/questions/26456825/