python - How do I loop through all keys and values in an HDF5 file and determine which ones contain data?

Tags: python hdf5 h5py

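I have the results of a model simulation stored in an HDF5 file (.hdf).

I know how to open the file with the h5py module and read through the data.

The problem is that there are so many nested keys and datasets that it is a real pain to find them all and work out which ones actually contain data.

Here is what I am working with so far: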

import h5py
f = h5py.File('results.hdf', 'r')  # open the file read-only

k1 = f.keys()  # keys at the top level

k1
<KeysViewHDF5 ['Event Conditions', 'Geometry', 'Plan Data', 'Results']>

Now, to see all of the stored data, I can do the following:

for k1 in f:
    for k2 in f[k1].keys():
        for k3 in f[k1][k2].keys():
            print(f[k1][k2][k3])  

<HDF5 group "/Event Conditions/Unsteady/Boundary Conditions" (2 members)>
<HDF5 group "/Event Conditions/Unsteady/Initial Conditions" (0 members)>
<HDF5 dataset "Attributes": shape (350,), type "|V45">
<HDF5 dataset "Polyline Info": shape (350, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (350, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (3598, 2), type "<f8">
<HDF5 dataset "Attributes": shape (3,), type "|V37">
<HDF5 dataset "Polygon Info": shape (3, 4), type "<i4">
<HDF5 dataset "Polygon Parts": shape (3, 2), type "<i4">
<HDF5 dataset "Polygon Points": shape (344, 2), type "<f8">
<HDF5 dataset "Attributes": shape (1,), type "|V64">
<HDF5 dataset "Cell Info": shape (1, 2), type "<i4">
<HDF5 dataset "Cell Points": shape (586635, 2), type "<f8">
<HDF5 group "/Geometry/2D Flow Areas/Delta" (0 members)>
<HDF5 group "/Geometry/2D Flow Areas/Perimeter 1" (25 members)>
<HDF5 dataset "Polygon Info": shape (1, 4), type "<i4">
<HDF5 dataset "Polygon Parts": shape (1, 2), type "<i4">
<HDF5 dataset "Polygon Points": shape (610, 2), type "<f8">
<HDF5 dataset "Attributes": shape (1,), type "|V60">
<HDF5 dataset "External Faces": shape (177,), type "|V24">
<HDF5 dataset "Polyline Info": shape (1, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (1, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (5, 2), type "<f8">
<HDF5 dataset "TIN Info": shape (347, 4), type "<i4">
<HDF5 dataset "TIN Points": shape (13591, 4), type "<f8">
<HDF5 dataset "TIN Triangles": shape (20008, 3), type "<i4">
<HDF5 dataset "XSIDs": shape (347, 2), type "<i4">
<HDF5 dataset "Attributes": shape (348,), type "|V676">
<HDF5 group "/Geometry/Cross Sections/Flow Distribution" (5 members)>
<HDF5 dataset "Manning's n Info": shape (348, 2), type "<i4">
<HDF5 dataset "Manning's n Values": shape (1044, 2), type "<f4">
<HDF5 dataset "Polyline Info": shape (348, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (348, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (696, 2), type "<f8">
<HDF5 dataset "Station Elevation Info": shape (348, 2), type "<i4">
<HDF5 dataset "Station Elevation Values": shape (151973, 2), type "<f4">
<HDF5 dataset "Attributes": shape (41,), type "|V32">
<HDF5 dataset "Calibration Table": shape (2,), type "|V200">
<HDF5 dataset "Polygon Info": shape (41, 4), type "<i4">
<HDF5 dataset "Polygon Parts": shape (41, 2), type "<i4">
<HDF5 dataset "Polygon Points": shape (45442, 2), type "<f8">
<HDF5 dataset "Polyline Info": shape (2, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (2, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (1768, 2), type "<f8">
<HDF5 dataset "Attributes": shape (1,), type "|V96">
<HDF5 dataset "Polyline Info": shape (1, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (1, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (2042, 2), type "<f8">
<HDF5 dataset "Polyline Info": shape (2, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (2, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (1152, 2), type "<f8">
<HDF5 dataset "Attributes": shape (1,), type "|V253">
<HDF5 dataset "Centerline Info": shape (1, 4), type "<i4">
<HDF5 dataset "Centerline Parts": shape (1, 2), type "<i4">
<HDF5 dataset "Centerline Points": shape (48, 2), type "<f8">
<HDF5 dataset "Profiles": shape (500,), type "|V28">
<HDF5 dataset "Compute Messages (rtf)": shape (1,), type "|S293107">
<HDF5 dataset "Compute Messages (text)": shape (1,), type "|S215682">
<HDF5 dataset "Compute Processes": shape (6,), type "|V332">
<HDF5 group "/Results/Unsteady/Geometry Info" (3 members)>
<HDF5 group "/Results/Unsteady/Output" (1 members)>
<HDF5 group "/Results/Unsteady/Summary" (0 members)>

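But if I keep going like this, first, it starts to get ridiculous and there is clearly a cleaner way; second, it starts to break, because some keys only go down a certain number of levels.

I would like to know all possible keys/paths to data in the HDF file, and whether or not they contain data (some do not).

Maybe some kind of loop with a try/except to handle the end of a path?

If anyone knows how to do this, please help!

Thanks.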

Best answer

From the h5py documentation (http://docs.h5py.org/en/latest/high/group.html#Group.visit):

import h5py

def print_attrs(name, obj):
    # Called once for each group and dataset, with its path and the object
    print(name)
    for key, val in obj.attrs.items():
        print("    %s: %s" % (key, val))

with h5py.File('foo.hdf5', 'r') as f:
    f.visititems(print_attrs)

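It uses a delegate (callback) pattern. You pass in a callable, and h5py calls it with the name and the object for every item it visits. Inside your callable, you can inspect the object and decide what to do with it.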
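
To answer the "which ones contain data" part of the question, the same callback can test each object's type. The following is only a sketch: the helper name `summarize` is made up for illustration, and it assumes that "contains data" means a dataset with at least one element, and that an "empty" group is one with zero members (as in the `(0 members)` lines of the output above).

```python
import h5py

def summarize(h5file):
    """Walk an open h5py File (or Group) and report what holds data.

    Returns (datasets, empty_groups):
      datasets     -- {path: (shape, dtype, has_data)} for every dataset
      empty_groups -- list of paths of groups with no members
    """
    datasets = {}
    empty_groups = []

    def visitor(name, obj):
        # h5py calls this once per group and dataset below h5file.
        # `name` is the path relative to h5file, without a leading slash.
        if isinstance(obj, h5py.Dataset):
            # Assumption: a dataset "has data" if it holds at least one element.
            datasets[name] = (obj.shape, obj.dtype, bool(obj.size))
        elif isinstance(obj, h5py.Group) and len(obj) == 0:
            empty_groups.append(name)
        # Returning None tells h5py to keep walking.

    h5file.visititems(visitor)
    return datasets, empty_groups

# Example usage with the file name from the question:
# with h5py.File('results.hdf', 'r') as f:
#     datasets, empty_groups = summarize(f)
#     for path, (shape, dtype, has_data) in datasets.items():
#         print(path, shape, dtype, 'has data' if has_data else 'EMPTY')
#     print('Empty groups:', empty_groups)
```

Because `visititems` does the recursion itself, no try/except is needed to detect where a path ends, and the depth of nesting does not matter.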

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/57716312/
