amazon-web-services - Dask AWS cluster error on initialization: User data is limited to 16384 bytes

Tags: amazon-web-services amazon-ec2 conda dask dask-distributed

I am following the guide here: https://cloudprovider.dask.org/en/latest/packer.html#ec2cluster-with-rapids
Specifically, I built my instance image with Packer, and I am now trying to run the final piece of code:

from dask.distributed import Client
from dask_cloudprovider.aws import EC2Cluster

cluster = EC2Cluster(
    ami=pack_ami,  # AMI ID provided by Packer
    region="eu-west-2",
    docker_image="rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04-py3.8",
    instance_type="p3.2xlarge",
    bootstrap=False,
    filesystem_size=120,
)
cluster.scale(1)
client = Client(cluster)
Note that I had to add the region to avoid being prompted for it. Unfortunately, I now get this error:
botocore.exceptions.ClientError: An error occurred (InvalidParameterValue) when calling the RunInstances operation: User data is limited to 16384 bytes
Creating scheduler instance .
The full traceback is here:
Creating scheduler instance
Traceback (most recent call last):
  File "tpotmodel.py", line 124, in <module>
    main()
  File "tpotmodel.py", line 83, in main
    bootstrap=False,
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/aws/ec2.py", line 474, in __init__
    super().__init__(**kwargs)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/generic/vmcluster.py", line 284, in __init__
    super().__init__(**kwargs, security=self.security)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/deploy/spec.py", line 281, in __init__
    self.sync(self._start)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/deploy/cluster.py", line 189, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/utils.py", line 340, in sync
    raise exc.with_traceback(tb)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/utils.py", line 324, in f
    result[0] = yield future
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/generic/vmcluster.py", line 324, in _start
    await super()._start()
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/deploy/spec.py", line 309, in _start
    self.scheduler = await self.scheduler
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/generic/vmcluster.py", line 86, in start
    ip = await self.create_vm()
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/aws/ec2.py", line 139, in create_vm
    response = await client.run_instances(**vm_kwargs)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/aiobotocore/client.py", line 154, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidParameterValue) when calling the RunInstances operation: User data is limited to 16384 bytes
In case it makes any difference, I am running this in a conda environment.

Best answer

The Dask community is tracking this issue at github.com/dask/dask-cloudprovider/issues/249, with a potential fix in github.com/dask/distributed/pull/4465. PR 4465 should resolve the problem.
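Until that fix is available in your environment, it can help to confirm how large a user-data payload actually is. Below is a minimal sketch, assuming you have the user-data script as a plain string and that the limit applies to the raw, unencoded script. The names `EC2_USER_DATA_LIMIT` and `check_user_data_size` are hypothetical helpers for illustration and are not part of dask-cloudprovider or boto.

```python
# Hypothetical helper: not part of dask-cloudprovider or botocore.
# The limit value is the one quoted in the RunInstances error message.
EC2_USER_DATA_LIMIT = 16384  # bytes


def check_user_data_size(user_data: str, limit: int = EC2_USER_DATA_LIMIT) -> int:
    """Return the size of user_data in bytes; raise ValueError if it exceeds limit."""
    # Measure the encoded byte length, since non-ASCII characters
    # take more than one byte each in UTF-8.
    size = len(user_data.encode("utf-8"))
    if size > limit:
        raise ValueError(
            f"User data is {size} bytes, which exceeds the {limit}-byte limit"
        )
    return size


if __name__ == "__main__":
    # A tiny example script, well under the limit.
    small_script = "#!/bin/bash\necho hello\n"
    print(check_user_data_size(small_script))
```

Running a check like this on the generated script before launching the cluster would show how far over the limit the payload is, and which part of it accounts for the bulk.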

A similar question about this Dask AWS cluster error on initialization ("User data is limited to 16384 bytes") can be found on Stack Overflow: https://stackoverflow.com/questions/65982439/
