python - Why are timestamps imported through Python's numpy.loadtxt shifted by 6 to 7 hours?

I am trying to import some Twitter text dates from a file into a Python array. I wrote the following function:

import numpy as np

# Load data from a text file into an array and return the array contents
def load_text_file(file_path, file_name):
    names = ('UserId', 'CreatedAt', 'CollectedAt', 'NumerOfFollowings', 'NumberOfFollowers',
             'NumberOfTweets', 'LengthOfScreenName', 'LengthOfDescriptionInUserProfile')
    formats = ('i8', 'datetime64[us]', 'datetime64[us]', 'i8', 'i8', 'i8', 'i8', 'i8')
    try:
        text_data = np.loadtxt(file_path.strip() + file_name.strip(),
                               dtype={'names': names, 'formats': formats}, delimiter="\t")
        return text_data
    except IOError as e:
        print(e)

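When I look at the imported timestamp objects, they appear to be off by 6 hours in one case and by 7 hours in the other. Here are two sample rows of the data I am trying to import: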

5945472 2007-05-10 20:12:18 2009-11-17 20:09:52 156 223 2134 10 54
5947912 2007-05-10 22:08:58 2009-11-19 11:28:25  52  37  730  7 32

These are imported into the Python array as follows:

(5945472, datetime.datetime(2007, 5, 11, 2, 12, 18), datetime.datetime(2009, 11, 18, 3, 9, 52), 156, 223, 2134, 10, 54)
(5947912, datetime.datetime(2007, 5, 11, 4, 8, 58), datetime.datetime(2009, 11, 19, 18, 28, 25), 52, 37, 730, 7, 32)

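As you can see, the timestamps are off by 6 and 7 hours, and I'm not sure why. Because of the shift, the date rolls over to the next day. Does anyone know how I can import the timestamps exactly as they are? Thanks!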

Best answer

As far as I can tell, this is a result of the way numpy creates datetime objects from datetime64. Note:

>>> np.datetime64('2009-11-17 20:09:52-0500')
numpy.datetime64('2009-11-17T20:09:52-0500')
>>> np.datetime64('2009-11-17 20:09:52-0500').item()
datetime.datetime(2009, 11, 18, 1, 9, 52)
>>> np.datetime64('2009-11-17 20:09:52-0500').item().tzinfo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable

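In this example I explicitly specified the time zone as UTC-5:00. However, the datetime object is created without an offset (its tzinfo is None), so it shows the time in UTC.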
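The same shift can be reproduced with the standard library alone. The sketch below is only an illustration of the arithmetic, not numpy's internal logic. The helper name `to_naive_utc` is hypothetical, and the UTC-6 and UTC-7 offsets used for the question's row are an assumption: the question does not state the machine's time zone, but those offsets would match US Mountain time (daylight saving in May, standard time in November). As far as I know, older NumPy releases (before 1.11) treated strings without an offset as local time, which would explain why the two columns shift by different amounts.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(wall_clock, utc_offset_hours):
    """Interpret a naive wall-clock time at a fixed UTC offset and return naive UTC."""
    local = wall_clock.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc).replace(tzinfo=None)


# The answer's example: 20:09:52 at UTC-5 becomes 01:09:52 on the next day.
answer_example = to_naive_utc(datetime(2009, 11, 17, 20, 9, 52), -5)

# The question's first row, under the ASSUMED offsets UTC-6 (May) and UTC-7 (November).
created_at = to_naive_utc(datetime(2007, 5, 10, 20, 12, 18), -6)
collected_at = to_naive_utc(datetime(2009, 11, 17, 20, 9, 52), -7)

print(answer_example)
print(created_at)
print(collected_at)
```

Under those assumed offsets, the results match the values shown in the question's imported array, including the rollover to the next day.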

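So how do you fix this? You can work only in datetime64: the values already carry the correct time zone information, so calculations should work properly. Alternatively, if you want to use datetime, you can add time zone information to the values before doing any calculations (i.e. d.item().replace(tzinfo=pytz.timezone("America/New_York"))). More likely, just using datetime64 will be the simpler approach.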
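If the goal is simply to get the timestamps exactly as written in the file, one option that is not part of the answer above is to skip datetime64 parsing entirely: read every column as text and parse the date columns with `datetime.strptime`, which applies no time zone conversion. This is a minimal sketch under stated assumptions: the sample rows are held in an in-memory `io.StringIO` instead of a file, the layout `%Y-%m-%d %H:%M:%S` is inferred from the two sample rows, and `parse_timestamp` is a hypothetical helper name.

```python
import io
from datetime import datetime

import numpy as np

# Two tab-separated rows from the question, held in memory instead of a file.
sample = io.StringIO(
    "5945472\t2007-05-10 20:12:18\t2009-11-17 20:09:52\t156\t223\t2134\t10\t54\n"
    "5947912\t2007-05-10 22:08:58\t2009-11-19 11:28:25\t52\t37\t730\t7\t32\n"
)

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed layout of the date columns


def parse_timestamp(text):
    """Parse one timestamp string into a naive datetime, with no time zone shift."""
    return datetime.strptime(str(text).strip(), TIMESTAMP_FORMAT)


# Read every field as a string so numpy performs no date interpretation.
rows = np.loadtxt(sample, dtype=str, delimiter="\t")

created_at = [parse_timestamp(value) for value in rows[:, 1]]
collected_at = [parse_timestamp(value) for value in rows[:, 2]]
user_ids = rows[:, 0].astype("i8")  # numeric columns can be converted afterwards

print(created_at[0], collected_at[0])
```

The resulting datetime objects are naive, so they carry the wall-clock values from the file unchanged. Differences between them still work as long as all values are taken to be in the same zone.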

Source question on Stack Overflow: https://stackoverflow.com/questions/35966083/
