I get the following error when deploying a PyTorch model on Google Cloud AI Platform:
ERROR: (gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: Model requires more memory than allowed. Please try to decrease the model size and re-deploy. If you continue to have error, please contact Cloud ML.
Configuration:
setup.py
from setuptools import setup
REQUIRED_PACKAGES = ['torch']
setup(
    name="iris-custom-model",
    version="0.1",
    scripts=["model.py"],
    install_requires=REQUIRED_PACKAGES
)
Model version creation:
MODEL_VERSION='v1'
RUNTIME_VERSION='1.15'
MODEL_CLASS='model.PyTorchIrisClassifier'
!gcloud beta ai-platform versions create {MODEL_VERSION} --model={MODEL_NAME} \
--origin=gs://{BUCKET}/{GCS_MODEL_DIR} \
--python-version=3.7 \
--runtime-version={RUNTIME_VERSION} \
--package-uris=gs://{BUCKET}/{GCS_PACKAGE_URI} \
--prediction-class={MODEL_CLASS}
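Best answer
You need to use a compiled PyTorch package that is compatible with Cloud AI Platform. Package information is here.
This bucket contains compiled PyTorch packages that are compatible with Cloud AI Platform Prediction. The files are mirrored from the official builds at https://download.pytorch.org/whl/cpu/torch_stable.html.
From the documentation: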
In order to deploy a PyTorch model on Cloud AI Platform Online Predictions, you must add one of these packages to the packageURIs field on the version you deploy. Pick the package matching your Python and PyTorch version. The package names follow this template:
Package name = torch-{TORCH_VERSION_NUMBER}-{PYTHON_VERSION}-linux_x86_64.whl
where PYTHON_VERSION = cp35-cp35m for Python 3 with runtime versions < 1.15, cp37-cp37m for Python 3 with runtime versions >= 1.15
For example, if I were to deploy a PyTorch model based on PyTorch 1.1.0 and Python 3, my gcloud command would look like:
gcloud beta ai-platform versions create {VERSION_NAME} --model {MODEL_NAME} ... --package-uris=gs://{MY_PACKAGE_BUCKET}/my_package-0.1.tar.gz,gs://cloud-ai-pytorch/torch-1.1.0-cp35-cp35m-linux_x86_64.whl
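The naming template quoted above can be sketched as a small Python helper. This is only an illustration: the function names torch_wheel_name and torch_wheel_uri are hypothetical, the runtime version is assumed to be a dotted string such as "1.15", and nothing here checks that the resulting file actually exists in the bucket.

```python
def torch_wheel_name(torch_version: str, runtime_version: str) -> str:
    """Build a wheel file name from the template quoted in the documentation."""
    # Compare runtime versions numerically so that "1.9" sorts before "1.15".
    parsed = tuple(int(part) for part in runtime_version.split("."))
    # cp37-cp37m for runtime versions >= 1.15, cp35-cp35m otherwise.
    python_tag = "cp37-cp37m" if parsed >= (1, 15) else "cp35-cp35m"
    return f"torch-{torch_version}-{python_tag}-linux_x86_64.whl"


def torch_wheel_uri(torch_version: str, runtime_version: str) -> str:
    """Prefix the wheel name with the bucket used in the example commands."""
    return "gs://cloud-ai-pytorch/" + torch_wheel_name(torch_version, runtime_version)


if __name__ == "__main__":
    print(torch_wheel_uri("1.1.0", "1.14"))
    print(torch_wheel_uri("1.3.1+cpu", "1.15"))
```

Note that the torch version string is passed through unchanged, so a build suffix such as "+cpu" has to be included by the caller when the file in the bucket carries one.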
Summary:
1) Remove torch from the install_requires dependencies in setup.py.
2) Include the torch package when creating the model version.
!gcloud beta ai-platform versions create {VERSION_NAME} --model {MODEL_NAME} \
--origin=gs://{BUCKET}/{MODEL_DIR}/ \
--python-version=3.7 \
--runtime-version={RUNTIME_VERSION} \
--package-uris=gs://{BUCKET}/{PACKAGES_DIR}/text_classification-0.1.tar.gz,gs://cloud-ai-pytorch/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl \
--prediction-class=model_prediction.CustomModelPrediction
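For scripted deployments, the same command could be assembled in Python. The sketch below only builds the argument list and prints it; it uses just the flags that appear in the command above. The function name build_create_version_command, the version and model names, and the my-bucket paths are hypothetical placeholders, not values from the original post.

```python
import shlex


def build_create_version_command(version_name, model_name, origin,
                                 runtime_version, custom_package_uri,
                                 torch_wheel_uri, prediction_class,
                                 python_version="3.7"):
    """Return the gcloud invocation as an argument list."""
    # Both packages go into one comma-separated --package-uris value:
    # the custom prediction package and the prebuilt torch wheel.
    package_uris = ",".join([custom_package_uri, torch_wheel_uri])
    return [
        "gcloud", "beta", "ai-platform", "versions", "create", version_name,
        f"--model={model_name}",
        f"--origin={origin}",
        f"--python-version={python_version}",
        f"--runtime-version={runtime_version}",
        f"--package-uris={package_uris}",
        f"--prediction-class={prediction_class}",
    ]


# Placeholder values for illustration only.
command = build_create_version_command(
    version_name="v1",
    model_name="my_model",
    origin="gs://my-bucket/model/",
    runtime_version="1.15",
    custom_package_uri="gs://my-bucket/packages/text_classification-0.1.tar.gz",
    torch_wheel_uri="gs://cloud-ai-pytorch/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl",
    prediction_class="model_prediction.CustomModelPrediction",
)

# Print a shell-quoted form for inspection before running it.
print(" ".join(shlex.quote(arg) for arg in command))
```

The list can be passed to subprocess.run(command, check=True) once the placeholders are replaced with real values.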