google-cloud-ml - PyTorch 模型在 AI Platform 中的部署

我在 Google Cloud AI Platform 中部署 Pytorch 模型时出现以下错误:

ERROR: (gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: Model requires more memory than allowed. Please try to decrease the model size and re-deploy. If you continue to have error, please contact Cloud ML.

配置:

设置.py

from setuptools import setup

REQUIRED_PACKAGES = ['torch']

setup(
    name="iris-custom-model",
    version="0.1",
    scripts=["model.py"],
    install_requires=REQUIRED_PACKAGES
)

模型版本创建

MODEL_VERSION='v1'
RUNTIME_VERSION='1.15'
MODEL_CLASS='model.PyTorchIrisClassifier'

!gcloud beta ai-platform versions create {MODEL_VERSION} --model={MODEL_NAME} \
            --origin=gs://{BUCKET}/{GCS_MODEL_DIR} \
            --python-version=3.7 \
            --runtime-version={RUNTIME_VERSION} \
            --package-uris=gs://{BUCKET}/{GCS_PACKAGE_URI} \
            --prediction-class={MODEL_CLASS}

最佳答案

需要使用兼容云AI平台的Pytorch编译包信息here

This bucket包含与 Cloud AI Platform 预测兼容的 PyTorch 编译包。这些文件是从 https://download.pytorch.org/whl/cpu/torch_stable.html 的官方构建镜像而来的。

来自文档

In order to deploy a PyTorch model on Cloud AI Platform Online Predictions, you must add one of these packages to the packageURIs field on the version you deploy. Pick the package matching your Python and PyTorch version. The package names follow this template:

Package name = torch-{TORCH_VERSION_NUMBER}-{PYTHON_VERSION}-linux_x86_64.whl where PYTHON_VERSION = cp35-cp35m for Python 3 with runtime versions < 1.15, cp37-cp37m for Python 3 with runtime versions >= 1.15

For example, if I were to deploy a PyTorch model based on PyTorch 1.1.0 and Python 3, my gcloud command would look like:
gcloud beta ai-platform versions create {VERSION_NAME} --model {MODEL_NAME} 
 ...
--package-uris=gs://{MY_PACKAGE_BUCKET}/my_package-0.1.tar.gz,gs://cloud->ai-pytorch/torch-1.1.0-cp35-cp35m-linux_x86_64.whl

总结:

1) 从 setup.py

中的 install_requires 依赖项中删除 torch

2) 在创建版本模型时包含 torch 包。

!gcloud beta ai-platform versions create {VERSION_NAME} --model {MODEL_NAME} \
 --origin=gs://{BUCKET}/{MODEL_DIR}/ \
 --python-version=3.7 \
 --runtime-version={RUNTIME_VERSION} \
 --package-uris=gs://{BUCKET}/{PACKAGES_DIR}/text_classification-0.1.tar.gz,gs://cloud-ai-pytorch/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl \
 --prediction-class=model_prediction.CustomModelPrediction

关于google-cloud-ml - PyTorch 模型在 AI Platform 中的部署，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/60423140/

google-cloud-ml - PyTorch 模型在 AI Platform 中的部署

上一篇：tensorflow - Keras 未导入 : TypeError: can only concatenate str (not "list") to str?

下一篇：Angular Material 表动态列