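I am trying to load a dataset of shape (1000,) stored in .hdf5 format using the code below, but I get the error ValueError: Invalid dataset identifier (invalid dataset identifier). The error appears when I try to load the dataset into a PyTorch DataLoader.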
    with h5py.File(dataset_path, 'r') as f:
        data = f['default']
        print(data.shape)

Output:

    (1000,)
    import h5py
    from torch.utils.data import Dataset, DataLoader

    # Define the dataset
    class MyDataset(Dataset):
        def __init__(self, dataset_path):
            super().__init__()
            with h5py.File(dataset_path, 'r') as f:
                self.data = f['default']

        def __len__(self):
            return len(self.data)

        def __getitem__(self, idx):
            return self.data[idx]

    # Load the dataset
    dataset_path = 'dataset.hdf5'
    train_dataset = MyDataset(dataset_path)
    train_loader = DataLoader(train_dataset, shuffle=True)

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-80-c3c741b81eff> in <module>
22
23 train_dataset = MyDataset(dataset_path)
---> 24 train_loader = DataLoader(train_dataset, shuffle=True)
6 frames
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
/usr/local/lib/python3.9/dist-packages/h5py/_hl/dataset.py in shape(self)
472
473 with phil:
--> 474 shape = self.id.shape
475
476 # If the file is read-only, cache the shape to speed-up future uses.
h5py/h5d.pyx in h5py.h5d.DatasetID.shape.__get__()
h5py/h5d.pyx in h5py.h5d.DatasetID.shape.__get__()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/h5d.pyx in h5py.h5d.DatasetID.get_space()
ValueError: Invalid dataset identifier (invalid dataset identifier)
I get the same error when I use f.get("default") instead.

Best answer
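The problem is that opening the HDF5 file in a with block causes it to be closed as soon as the block exits, that is, before the constructor returns. self.data then refers to a dataset in a closed file, so any later access (here, len(self.data), which the DataLoader triggers when shuffle=True) raises the error. The h5py module is designed around keeping the file open so that data can be read (or written) lazily as needed rather than all up front.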
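
One way to act on this is sketched below; it is an illustration rather than the answerer's own code. The class name H5Dataset, the key parameter, the _dataset helper, and the close method are hypothetical names chosen for this example. The sketch reads the length once with a short-lived handle, then opens the file lazily on first access and keeps it open, so the dataset handle stays valid for as long as the object is in use.

```python
import h5py
from torch.utils.data import Dataset


class H5Dataset(Dataset):
    """Sketch: keep the HDF5 file open instead of closing it in __init__."""

    def __init__(self, dataset_path, key="default"):
        super().__init__()
        self.dataset_path = dataset_path
        self.key = key
        self._file = None  # opened on first item access, then kept open

        # A short-lived handle is fine here because only a plain integer
        # (not an h5py object) leaves the with block.
        with h5py.File(dataset_path, "r") as f:
            self._length = f[key].shape[0]

    def _dataset(self):
        # Open the file once and reuse the handle for later reads.
        if self._file is None:
            self._file = h5py.File(self.dataset_path, "r")
        return self._file[self.key]

    def __len__(self):
        return self._length

    def __getitem__(self, idx):
        # Reads only the requested element from disk.
        return self._dataset()[idx]

    def close(self):
        # Release the file handle explicitly when the dataset is no longer needed.
        if self._file is not None:
            self._file.close()
            self._file = None
```

With this class, DataLoader(H5Dataset(dataset_path), shuffle=True) can be constructed without touching a closed file. For a dataset as small as the one in the question, a simpler alternative is to copy the data into memory inside the with block, for example self.data = f['default'][:], which yields a NumPy array that remains usable after the file is closed.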
Source: the Stack Overflow question "ValueError: Invalid dataset identifier (invalid dataset identifier)": https://stackoverflow.com/questions/75787491/