python - Data lost from an HDF file when using Python

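I am trying to read an HDF file, but no groups show up. I have tried several different approaches using tables and h5py, but neither displays any groups in the file. I checked, and the file is reported as "Hierarchical Data Format (version 5) data" (see the update). The file information is here for reference.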

Sample data can be found here.

import h5py
import tables as tb

hdffile = "TRMM_LIS_SC.04.1_2010.260.73132"

Using h5py:

f = h5py.File(hdffile,'w')
print(f)

Output:

<HDF5 file "TRMM_LIS_SC.04.1_2010.260.73132" (mode r+)>
[]

Using tables (PyTables):

fi = tb.open_file(hdffile, 'r')  # open_file is the PyTables 3.x name for the older openFile
print(fi)

Output:

TRMM_LIS_SC.04.1_2010.260.73132 (File) ''
Last modif.: 'Wed Aug 10 18:41:44 2016'
Object Tree:
/ (RootGroup) ''

Closing remaining open files:TRMM_LIS_SC.04.1_2010.260.73132...done

Update

h5py.File(hdffile,'w') overwrote the file and emptied it.

So my question now is: how do I read an HDF version 4 file into Python, given that neither h5py nor tables works?

Best answer

How big is the file? I think h5py.File(hdffile,'w') overwrites it, which is why it is empty. Use h5py.File(hdffile,'r') to read it.
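To make the difference between the two modes concrete, here is a minimal sketch that runs against a throwaway file rather than the real data. The file name `scratch.h5` and the group name `lightning` are made up for illustration; only `h5py.File`, `create_group`, and `keys` come from h5py itself.

```python
import os
import tempfile

import h5py

# Work on a scratch file so no real data is at risk.
# "scratch.h5" and "lightning" are hypothetical names for this demo.
path = os.path.join(tempfile.mkdtemp(), "scratch.h5")

# Create a file that contains a single group.
with h5py.File(path, "w") as f:
    f.create_group("lightning")

# Mode 'r' opens the file read-only and leaves its contents untouched.
with h5py.File(path, "r") as f:
    keys_after_read = list(f.keys())

# Mode 'w' creates the file and truncates it if it already exists.
with h5py.File(path, "w") as f:
    keys_after_write = list(f.keys())

print(keys_after_read)   # ['lightning']
print(keys_after_write)  # []
```

The empty list after the second `'w'` open matches what the question shows: the file is still a valid HDF5 container, but everything that was in it is gone.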

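I don't have enough karma to comment on @Luke H's answer, but reading it into pandas is probably not a good idea. Pandas' HDF5 support uses PyTables, which is an "opinionated" way of using HDF5: it stores extra metadata (indexes, for example). So I would only use PyTables to read a file if that file was created with PyTables.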
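Along the same lines of picking the reader to match the file, it can help to check what kind of HDF file is on disk before opening it with anything that can write. The sketch below is only a rough check based on the leading signature bytes. `detect_hdf_version` is a hypothetical helper, not part of h5py or PyTables, and it ignores the case where an HDF5 file has a user block in front of its signature.

```python
import os
import tempfile

# Signature bytes at the start of each format.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
HDF4_SIGNATURE = b"\x0e\x03\x13\x01"


def detect_hdf_version(path):
    """Hypothetical helper: guess the HDF flavour from the first bytes."""
    with open(path, "rb") as fh:  # binary read-only, cannot modify the file
        head = fh.read(8)
    if head.startswith(HDF5_SIGNATURE):
        return "HDF5"
    if head.startswith(HDF4_SIGNATURE):
        return "HDF4"
    return "unknown"


# Demo on fake files that contain only the signature plus padding.
tmpdir = tempfile.mkdtemp()
results = {}
for name, magic in [("fake4.hdf", HDF4_SIGNATURE), ("fake5.h5", HDF5_SIGNATURE)]:
    fake_path = os.path.join(tmpdir, name)
    with open(fake_path, "wb") as fh:
        fh.write(magic + b"\x00" * 16)
    results[name] = detect_hdf_version(fake_path)

print(results)
```

If a fresh, untouched copy of the data reports HDF4, then h5py and PyTables are the wrong tools for it, and a library built for HDF4 (pyhdf is one commonly used option) would be the thing to try instead.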

Regarding "python - Data lost from an HDF file when using Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38881326/
