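python - How to train a YOLO-NAS model in azure-ml using a Blob storage account

I want to train a yolo-nas model for object detection in azure-ml, where my training and test data are in a Blob storage account.

So, in the configuration details, we specify the data directories for train, test, and valid as paths in the Blob storage account.

Now I want to know how to feed the data to the yolo-nas model, or how yolo-nas will pull the data for data loading, preprocessing, and training.

Example configuration: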

DATA_DIR = 'azureml://subscriptions/xxxxxxxx/resourcegroups/Faucet-poc/workspaces/faucet-ob-poc/datastores/faucetpoc/paths/' #parent directory to where data lives
TRAIN_IMAGES_DIR = 'train/images' #child dir of DATA_DIR where train images are
TRAIN_LABELS_DIR = 'train/labels' #child dir of DATA_DIR where train labels are

VAL_IMAGES_DIR = 'valid/images' #child dir of DATA_DIR where validation images are
VAL_LABELS_DIR = 'valid/labels' #child dir of DATA_DIR where validation labels are

# if you have a test set
TEST_IMAGES_DIR = 'test/images/' #child dir of DATA_DIR where test images are
TEST_LABELS_DIR = 'test/labels' #child dir of DATA_DIR where test labels are

CLASSES = ['faucet','sink','toilet'] #what class names do you have

NUM_CLASSES = len(CLASSES)



dataset_params = {
    'data_dir': DATA_DIR,
    'train_images_dir':TRAIN_IMAGES_DIR,
    'train_labels_dir':TRAIN_LABELS_DIR,
    'val_images_dir':VAL_IMAGES_DIR,
    'val_labels_dir':VAL_LABELS_DIR,
    'test_images_dir':TEST_IMAGES_DIR,
    'test_labels_dir':TEST_LABELS_DIR,
    'classes': CLASSES
}

This is the code we use to load the data for the yolo-nas model:

from super_gradients.training.dataloaders.dataloaders import (
    coco_detection_yolo_format_train, coco_detection_yolo_format_val)

BATCH_SIZE = 16  # assumed value; BATCH_SIZE is not defined in the original snippet

train_data = coco_detection_yolo_format_train(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['train_images_dir'],
        'labels_dir': dataset_params['train_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 2
    }
)

val_data = coco_detection_yolo_format_val(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['val_images_dir'],
        'labels_dir': dataset_params['val_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 2
    }
)

test_data = coco_detection_yolo_format_val(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['test_images_dir'],
        'labels_dir': dataset_params['test_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 2
    }
)

When I run the code, I get a "not found" error.

Error:

RuntimeError: data_dir=azureml://subscriptions/xxxxxxxx/resourcegroups/Faucet-poc/workspaces/faucet-ob-poc/datastores/faucetpoc/paths/ not found.
Please make sure that data_dir points toward your dataset.

I am adding the code snippet for the mount point in azure-ml here.

Best answer

I reproduced the issue based on this scenario.
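
The error is consistent with the data loader checking `data_dir` against the local filesystem, where an `azureml://` datastore URI does not exist as a directory. The following is a minimal sketch of that distinction using only the standard library. The helper name `is_usable_data_dir` is hypothetical, and the check is an assumption about the loader's behaviour, not its actual code.

```python
import os
import tempfile
from urllib.parse import urlparse


def is_usable_data_dir(data_dir: str) -> bool:
    """Return True only if data_dir is an existing local directory."""
    # A scheme followed by "://" (azureml://, https://, ...) marks a remote URI,
    # not a local path. Requiring "://" avoids mistaking drive letters for schemes.
    if "://" in data_dir and urlparse(data_dir).scheme:
        return False
    return os.path.isdir(data_dir)


uri = ("azureml://subscriptions/xxxxxxxx/resourcegroups/Faucet-poc/"
       "workspaces/faucet-ob-poc/datastores/faucetpoc/paths/")
print(is_usable_data_dir(uri))  # the datastore URI from the question

with tempfile.TemporaryDirectory() as local_dir:
    print(is_usable_data_dir(local_dir))  # an existing local directory
```

A check like this, run before the loaders are built, separates "the path is a remote URI" from "the path is local but missing".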

The training data was uploaded to a datastore.

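To access the data in a notebook, you can mount the datastore. Follow the steps below to mount it:

[screenshots of the datastore mount steps, not available]

After creating the mount with the steps above, the mount path will be: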

/home/azureuser/cloudfiles/data/datastore/data/

You can then use it as your parent directory, DATA_DIR.

With the setup above, I was able to access the training data.
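
Before building the data loaders, it may help to confirm that the expected sub-directories are visible under the mount. The following is a small sketch that assumes the folder layout from the question. `missing_subdirs` is a hypothetical helper, not part of super_gradients or Azure ML.

```python
import os

# Mount path from the answer; adjust if your mount point differs.
DATA_DIR = "/home/azureuser/cloudfiles/data/datastore/data/"
EXPECTED_SUBDIRS = [
    "train/images", "train/labels",
    "valid/images", "valid/labels",
    "test/images", "test/labels",
]


def missing_subdirs(data_dir, subdirs):
    """Return the sub-directories that do not exist under data_dir."""
    return [s for s in subdirs if not os.path.isdir(os.path.join(data_dir, s))]


missing = missing_subdirs(DATA_DIR, EXPECTED_SUBDIRS)
if missing:
    print("Missing under DATA_DIR:", missing)
else:
    print("All dataset folders found under", DATA_DIR)
```

If any folder is reported missing, the data loader would be expected to fail in a similar way to the original error.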

Update: To mount the datastore using Python, you can use the following code.

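from azureml.core import Workspace, Dataset, Datastore

# Load the workspace from the local config.json
workspace = Workspace.from_config()
# Look up the registered datastore by name
datastore = Datastore.get(workspace, "workspaceblobstore")
# Build a FileDataset from the 'Data' folder in the datastore
dataset = Dataset.File.from_files(path=(datastore, 'Data'))
mounted_path = "/tmp/test"
# Create the mount context and start the mount
mount_cont = dataset.mount(mounted_path)
mount_cont.start()
# Call mount_cont.stop() when the data is no longer needed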

You can then use mounted_path as your parent directory.
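
As an alternative sketch, the mount context returned by `dataset.mount()` can be used as a context manager, so the mount is released automatically when the block exits. The helper `list_mounted_dir` and the `train/images` sub-folder are assumptions for illustration. The datastore name `workspaceblobstore` and the `Data` path are the ones used in the answer's own mount code.

```python
import os


def list_mounted_dir(dataset, subdir=""):
    """Mount a FileDataset, list one folder, and unmount on exit."""
    # With no argument, mount() picks a temporary mount point; the
    # context manager starts the mount on entry and stops it on exit.
    with dataset.mount() as mount_context:
        target = os.path.join(mount_context.mount_point, subdir)
        return sorted(os.listdir(target))


def main():
    # Imported here so the helper above can be used without the SDK installed.
    from azureml.core import Workspace, Dataset, Datastore

    workspace = Workspace.from_config()
    datastore = Datastore.get(workspace, "workspaceblobstore")
    dataset = Dataset.File.from_files(path=(datastore, "Data"))
    print(list_mounted_dir(dataset, "train/images"))
```

Call `main()` from a notebook or script on an Azure ML compute instance. Because the mount disappears when the `with` block ends, any training that reads from the mount would need to run inside that block.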

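Note: Check your permissions and assigned roles. Missing permissions may be why you cannot see the data actions, and they can cause similar problems in the scenario above.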

Source: Stack Overflow, "python - How to train a YOLO-NAS model in azure-ml using a Blob storage account": https://stackoverflow.com/questions/76652302/
