amazon-web-services - AWS SageMaker - How to load a trained sklearn model for inference?

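I am trying to deploy a model trained with sklearn to an endpoint and serve it as an API for predictions. All I want to use SageMaker for is to deploy and serve a model that I serialised with joblib, nothing more. Every blog I have read, and the SageMaker Python documentation, suggests that an sklearn model has to be trained on SageMaker in order to be deployed on SageMaker.

While going through the SageMaker documentation I learnt that SageMaker does let users load a serialised model stored in S3, as shown below: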

def model_fn(model_dir):
    clf = joblib.load(os.path.join(model_dir, "model.joblib"))
    return clf

And this is what the documentation says about the argument model_dir:

SageMaker will inject the directory where your model files and sub-directories, saved by save, have been mounted. Your model function should return a model object that can be used for model serving.

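This again implies that training has to be done on SageMaker.

So, is there a way I can just specify the S3 location of my serialised model and have SageMaker deserialise (or load) the model from S3 and use it for inference?

EDIT 1:

I applied the code from the answer to my application and got the error below when trying to deploy from a SageMaker Studio notebook. I believe SageMaker is complaining that training was not done on SageMaker.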

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-6662bbae6010> in <module>
      1 predictor = model.deploy(
      2     initial_instance_count=1,
----> 3     instance_type='ml.m4.xlarge'
      4 )

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, use_compiled_model, wait, model_name, kms_key, data_capture_config, tags, **kwargs)
    770         """
    771         removed_kwargs("update_endpoint", kwargs)
--> 772         self._ensure_latest_training_job()
    773         self._ensure_base_job_name()
    774         default_name = name_from_base(self.base_job_name)

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in _ensure_latest_training_job(self, error_message)
   1128         """
   1129         if self.latest_training_job is None:
-> 1130             raise ValueError(error_message)
   1131 
   1132     delete_endpoint = removed_function("delete_endpoint")

ValueError: Estimator is not associated with a training job

My code:

import sagemaker
from sagemaker import get_execution_role
# from sagemaker.pytorch import PyTorchModel
from sagemaker.sklearn import SKLearn
from sagemaker.predictor import RealTimePredictor, json_serializer, json_deserializer

sm_role = sagemaker.get_execution_role()  # IAM role to run SageMaker, access S3 and ECR

model_file = "s3://sagemaker-manual-bucket/sm_model_artifacts/model.tar.gz"   # Must be ".tar.gz" suffix

class AnalysisClass(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super().__init__(
            endpoint_name,
            sagemaker_session=sagemaker_session,
            serializer=json_serializer,
            deserializer=json_deserializer,   # To be able to use JSON serialization
            content_type='application/json'   # To be able to send JSON as HTTP body
        )

model = SKLearn(model_data=model_file,
                entry_point='inference.py',
                name='rf_try_1',
                role=sm_role,
                source_dir='code',
                framework_version='0.20.0',
                instance_count=1,
                instance_type='ml.m4.xlarge',
                predictor_cls=AnalysisClass)
predictor = model.deploy(initial_instance_count=1,
                         instance_type='ml.m4.xlarge')

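Best answer

Yes, you can. The AWS documentation focuses on the end-to-end path from training to deployment in SageMaker, which gives the impression that training has to be done on SageMaker. The AWS documentation and examples should clearly separate training in an Estimator, saving and loading a model, and deploying a model to a SageMaker endpoint.

SageMaker Model

You need to create an AWS::SageMaker::Model resource, which refers to the "model" you have trained, among other things. AWS::SageMaker::Model is described in the CloudFormation documentation, but that only explains which AWS resource you need.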

The CreateModel API creates a SageMaker model resource. Its parameters specify the docker image to use, the model location in S3, the IAM role to use, and so on. See How SageMaker Loads Your Model Artifacts.
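As a rough illustration of those parameters, the sketch below assembles a CreateModel request and passes it to boto3's SageMaker client. The helper names, the model name, the image URI placeholder, the bucket and the role ARN are all hypothetical; only the `create_model` call and its `ModelName`, `PrimaryContainer` and `ExecutionRoleArn` arguments come from the API itself.

```python
def build_create_model_request(model_name, image_uri, model_data_url, role_arn):
    """Assemble keyword arguments for the SageMaker CreateModel API."""
    if not (model_data_url.startswith("s3://") and model_data_url.endswith(".tar.gz")):
        raise ValueError("model_data_url must be an s3:// URI ending in .tar.gz")
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,              # docker image with the framework and HTTP front end
            "ModelDataUrl": model_data_url,  # model.tar.gz location in S3
        },
        "ExecutionRoleArn": role_arn,        # IAM role SageMaker assumes
    }


def create_model(request):
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without AWS setup

    return boto3.client("sagemaker").create_model(**request)


# Placeholder values, to be replaced with your own (all hypothetical).
request = build_create_model_request(
    model_name="my-sklearn-model",
    image_uri="<ACCOUNT>.dkr.ecr.<REGION>.amazonaws.com/<IMAGE>:<TAG>",
    model_data_url="s3://YOUR_BUCKET/YOUR_FOLDER/model.tar.gz",
    role_arn="arn:aws:iam::<ACCOUNT>:role/<ROLE>",
)
# create_model(request)  # uncomment once the placeholders are filled in
```

Creating the model resource this way does not start an endpoint; an endpoint configuration and endpoint still have to be created, which is part of what the Python SDK described below does for you.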

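Docker image

Obviously, to get inferences you need the framework you used to train your model, e.g. scikit-learn, TensorFlow, PyTorch, etc. You need a docker image that contains the framework and an HTTP front end to respond to prediction calls. See SageMaker Inference Toolkit and Using the SageMaker Training and Inference Toolkits.

Building the image is not easy. Hence AWS provides pre-built images called AWS Deep Learning Containers, and the available images are listed on GitHub.

If your framework and version are listed there, you can use that image. Otherwise you need to build one yourself. See Building a docker container for training/deploying our classifier.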

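SageMaker Python SDK for Frameworks

Creating a SageMaker Model yourself through the API is hard. Hence the AWS SageMaker Python SDK provides utilities to create SageMaker models for several frameworks. See Frameworks for the available ones. If yours is not there, you may still be able to use sagemaker.model.FrameworkModel and Model to load your trained model. For your case, see Using Scikit-learn with the SageMaker Python SDK.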
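For the scikit-learn case, a minimal sketch under stated assumptions: the traceback in Edit 1 of the question comes from `estimator.py`, because `SKLearn` is an estimator class and its `deploy` expects a training job. The SDK also ships `sagemaker.sklearn.model.SKLearnModel`, which wraps an existing model.tar.gz. The helper names, bucket, role ARN and instance type below are hypothetical, and the `framework_version` is simply the one from the question and must match a container version that actually exists.

```python
def sklearn_model_kwargs(model_data, role, entry_point="inference.py",
                         source_dir="code", framework_version="0.20.0"):
    """Collect the constructor arguments for SKLearnModel."""
    return {
        "model_data": model_data,                # s3://.../model.tar.gz
        "role": role,                            # IAM role for SageMaker
        "entry_point": entry_point,              # script defining model_fn etc.
        "source_dir": source_dir,                # local folder holding entry_point
        "framework_version": framework_version,  # selects the scikit-learn container
    }


def deploy_sklearn_model(kwargs, instance_type="ml.m5.xlarge"):
    """Create the model from existing artifacts and deploy it to an endpoint."""
    # Imported lazily: requires the sagemaker SDK and AWS credentials.
    from sagemaker.sklearn.model import SKLearnModel

    model = SKLearnModel(**kwargs)  # a Model, not an Estimator: no training job involved
    return model.deploy(initial_instance_count=1, instance_type=instance_type)


# Placeholder values (hypothetical); replace with your own bucket and role.
kwargs = sklearn_model_kwargs(
    model_data="s3://YOUR_BUCKET/YOUR_FOLDER/model.tar.gz",
    role="arn:aws:iam::<ACCOUNT>:role/<ROLE>",
)
# predictor = deploy_sklearn_model(kwargs)
```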

model.tar.gz

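For instance, suppose you used PyTorch and saved the model as model.pth. To load the model together with the inference code that gets predictions from it, you need to create a model.tar.gz file. The structure inside model.tar.gz is explained in Model Directory Structure. If you use Windows, beware of CRLF versus LF line endings, because AWS SageMaker runs in a *NIX environment. See Create the directory structure for your model files.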

|- model.pth        # model file is inside / directory.
|- code/            # Code artefacts must be inside /code
  |- inference.py   # Your inference code for the framework
  |- requirements.txt  # only for versions 1.3.1 and higher. Name must be "requirements.txt"

Save the tar.gz file in S3. Make sure the IAM role can access the S3 bucket and the objects.
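The layout above can be produced with Python's standard tarfile module. This is only a sketch: the function name and the paths passed to it are hypothetical, and it assumes every file directly under the code folder is a text file, since it rewrites CRLF line endings to LF while packing.

```python
import io
import tarfile
from pathlib import Path


def build_model_archive(model_file, code_dir, archive_path="model.tar.gz"):
    """Pack a model file at the archive root and code files under code/."""
    model_file = Path(model_file)
    with tarfile.open(archive_path, "w:gz") as tar:
        # The model file sits at the root of the archive.
        tar.add(model_file, arcname=model_file.name)
        for path in sorted(Path(code_dir).iterdir()):
            if not path.is_file():
                continue
            # Normalise Windows line endings so the scripts run on *NIX.
            data = path.read_bytes().replace(b"\r\n", b"\n")
            info = tarfile.TarInfo(name=f"code/{path.name}")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return archive_path
```

Calling `build_model_archive("model.joblib", "code")` would write model.tar.gz to the current directory; uploading it to S3 is a separate step.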

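Loading the model and getting inferences

See Create a PyTorchModel object. When you instantiate the PyTorchModel class, SageMaker automatically selects the AWS Deep Learning Container image for PyTorch that matches the version given in framework_version. If no image exists for that version, it fails. This is not documented by AWS, but you need to be aware of it. SageMaker then internally calls the CreateModel API with the S3 model file location and the AWS Deep Learning Container image URL.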

import sagemaker
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorchModel
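# SageMaker Python SDK v1 names. SDK v2 renames them to sagemaker.predictor.Predictor,
# sagemaker.serializers.JSONSerializer and sagemaker.deserializers.JSONDeserializer.
from sagemaker.predictor import RealTimePredictor, json_serializer, json_deserializer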

role = sagemaker.get_execution_role()  # IAM role to run SageMaker, access S3 and ECR
model_file = "s3://YOUR_BUCKET/YOUR_FOLDER/model.tar.gz"   # Must be ".tar.gz" suffix


class AnalysisClass(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super().__init__(
            endpoint_name,
            sagemaker_session=sagemaker_session,
            serializer=json_serializer,
            deserializer=json_deserializer,   # To be able to use JSON serialization
            content_type='application/json'   # To be able to send JSON as HTTP body
        )

model = PyTorchModel(
    model_data=model_file,
    name='YOUR_MODEL_NAME_WHATEVER',
    role=role,
    entry_point='inference.py',
    source_dir='code',              # Location of the inference code
    framework_version='1.5.0',      # Available AWS Deep Learning PyTorch container version must be specified
    predictor_cls=AnalysisClass     # To specify the HTTP request body format (application/json)
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge'
)

test_data = {"body": "YOUR PREDICTION REQUEST"}
prediction = predictor.predict(test_data)

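By default, SageMaker uses NumPy as the serialization format. To use JSON, you need to specify the serializer and the content_type. Instead of subclassing RealTimePredictor, you can set them on the predictor.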

predictor.serializer = json_serializer
predictor.predict(test_data)

Or

import json

predictor.serializer = None  # As the serializer is None, predictor won't serialize the data
serialized_test_data = json.dumps(test_data)
predictor.predict(serialized_test_data)

Inference code sample

See Process Model Input, Get Predictions from a PyTorch Model and Process Model Output. In this example the prediction request is sent as JSON in the HTTP request body.

import os
import sys
import datetime
import json
import torch
import numpy as np

CONTENT_TYPE_JSON = 'application/json'

def model_fn(model_dir):
    # SageMaker automatically loads the model.tar.gz from S3 and
    # mounts the folders inside the docker container. 'model_dir'
    # points to the root of the extracted tar.gz file.

    model_path = f'{model_dir}/'
    
    # Load the model
    # You can load whatever from the Internet, S3, wherever <--- Answer to your Question
    # NO Need to use the model in tar.gz. You can place a dummy model file.
    ...

    return model


def predict_fn(input_data, model):
    # Do your inference
    ...

def input_fn(serialized_input_data, content_type=CONTENT_TYPE_JSON):
    input_data = json.loads(serialized_input_data)
    return input_data


def output_fn(prediction_output, accept=CONTENT_TYPE_JSON):
    if accept == CONTENT_TYPE_JSON:
        return json.dumps(prediction_output), accept
    raise Exception('Unsupported content type')
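The handlers can be exercised locally before packaging. The sketch below chains them in the order the serving container is expected to call them: model_fn once at start-up, then input_fn, predict_fn and output_fn per request. `SumModel`, `handle_request` and the "instances" key in the request are hypothetical stand-ins for a real model and payload format.

```python
import json

CONTENT_TYPE_JSON = "application/json"


class SumModel:
    """Hypothetical stand-in for a trained model: sums each input row."""

    def predict(self, rows):
        return [sum(row) for row in rows]


def model_fn(model_dir):
    # A real implementation would load the model from model_dir here.
    return SumModel()


def input_fn(serialized_input_data, content_type=CONTENT_TYPE_JSON):
    if content_type != CONTENT_TYPE_JSON:
        raise ValueError(f"Unsupported content type: {content_type}")
    return json.loads(serialized_input_data)


def predict_fn(input_data, model):
    return model.predict(input_data["instances"])


def output_fn(prediction_output, accept=CONTENT_TYPE_JSON):
    if accept == CONTENT_TYPE_JSON:
        return json.dumps(prediction_output), accept
    raise ValueError(f"Unsupported accept type: {accept}")


def handle_request(body, model_dir="."):
    """Run one request through the handlers, as a local smoke test."""
    model = model_fn(model_dir)
    data = input_fn(body, CONTENT_TYPE_JSON)
    prediction = predict_fn(data, model)
    return output_fn(prediction, CONTENT_TYPE_JSON)


response_body, response_type = handle_request(
    json.dumps({"instances": [[1, 2], [3, 4]]})
)
```

A check like this only validates the handler logic; it says nothing about the container image, dependencies or IAM permissions.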

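Note

The SageMaker team keeps changing the implementation, and the documentation is frequently out of date. When you are sure you followed the documentation and it still does not work, outdated documentation is quite likely the cause. In that case, clarify with AWS support or open an issue on GitHub.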

Source: the original question "AWS SageMaker - How to load a trained sklearn model for inference?" on Stack Overflow: https://stackoverflow.com/questions/65168915/