python - Trying to build an automation script on AWS Data Pipeline

I am trying to use the AWS Data Pipeline service in the following way:

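1. Set the activity type to Shell Command Activity, set the script URI (pointing to an S3 bucket), and set stage input to true.
2. Set the resource type for the activity to EC2.
3. Use S3 as the data node.
4. For the EC2 resource, set the instance type to t2.medium and the instance ID to a custom AMI that I created.
5. Schedule the pipeline to run every day at 10 pm.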
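
The configuration described in the steps above can be sketched as a pipeline definition. This is only an illustration, not the asker's actual pipeline: the function name, object ids, S3 paths, AMI id, start date, and `terminateAfter` value are hypothetical, and the field names (`scriptUri`, `stage`, `runsOn`, `imageId`, `directoryPath`, `period`) follow the Data Pipeline definition syntax as best I recall, so check them against the AWS documentation before use.

```python
import json


def build_pipeline_definition(script_uri, input_path, image_id):
    """Build a pipeline definition dict for a daily shell-command run.

    All ids and literal values below are illustrative assumptions.
    """
    schedule_ref = {"ref": "DailySchedule"}
    return {
        "objects": [
            {
                # Runs once a day; the start time fixes the 10 pm slot.
                "id": "DailySchedule",
                "type": "Schedule",
                "period": "1 day",
                "startDateTime": "2015-01-30T22:00:00",
            },
            {
                # S3 data node that gets staged onto the instance.
                "id": "InputData",
                "type": "S3DataNode",
                "directoryPath": input_path,
                "schedule": schedule_ref,
            },
            {
                # EC2 resource launched from the custom AMI.
                "id": "WorkerInstance",
                "type": "Ec2Resource",
                "instanceType": "t2.medium",
                "imageId": image_id,
                "terminateAfter": "2 Hours",
                "schedule": schedule_ref,
            },
            {
                # Shell command activity with staging enabled.
                "id": "RunScript",
                "type": "ShellCommandActivity",
                "scriptUri": script_uri,
                "stage": "true",
                "input": {"ref": "InputData"},
                "runsOn": {"ref": "WorkerInstance"},
                "schedule": schedule_ref,
            },
        ]
    }


if __name__ == "__main__":
    definition = build_pipeline_definition(
        "s3://example-bucket/scripts/run.sh",  # hypothetical script location
        "s3://example-bucket/input/",          # hypothetical input prefix
        "ami-EXAMPLE",                         # placeholder for the custom AMI id
    )
    print(json.dumps(definition, indent=2))
```

Writing the definition out this way makes it easier to see that the activity only starts once Task Runner on the `Ec2Resource` instance reports in, which is the step that depends on the AMI.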

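The script specified in step 1 (that is, the script URI of the activity) has 2 lines: 1. Copy the S3 bucket data to the instance. 2. Run a python command to execute my program. The AMI I created is based on an Ubuntu EC2 instance, and it contains some python software along with the code I want to run.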

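Now, when I activate the pipeline, I can see that the EC2 instance is indeed created and that the S3 data is copied and made available to the instance, but the python command is not run. The instance stays in the running state and the pipeline stays in the "waiting for runner" state for a while, and then the pipeline fails with the message: "Resource stalled".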

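Can someone tell me whether I am doing something wrong, why my python code is not being executed, or why I am getting the Resource stalled error? The code works fine if I run it manually without the pipeline.

Thanks in advance!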

Best answer

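"Resource stalled" almost always means there is a problem with the setup of the custom AMI. The requirements are documented in the AWS Data Pipeline documentation. In brief: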

A custom AMI must meet the following requirements for AWS Data Pipeline to use it successfully for Task Runner:

Create the AMI in the same region that the instances will run in.

Ensure that the virtualization type of the AMI is supported by the instance type you plan to use. For example, the I2 and G2 instance types require an HVM AMI and the T1, C1, M1, and M2 instance types require a PV AMI.

Install the following software:

Linux

Bash

wget

unzip

Java 1.6 or newer

cloud-init

Create and configure a user account named ec2-user.
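
The software and user-account items in the list above can be checked from inside an instance launched from the AMI. The following is a rough sketch, not an official AWS tool: the function names are hypothetical, the executable names are assumed to match the listed package names, and it does not verify the Java version, the region, or the virtualization type.

```python
import pwd
import shutil

# Executables corresponding to the quoted requirements (names are assumptions).
REQUIRED_TOOLS = ["bash", "wget", "unzip", "java", "cloud-init"]
REQUIRED_USER = "ec2-user"


def user_exists(name):
    """Return True if a local account with this name exists."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def missing_requirements(which=shutil.which, has_user=user_exists):
    """Return a list of problems; an empty list means every check passed.

    `which` and `has_user` are injectable so the check can be exercised
    without touching the real system.
    """
    problems = [
        f"missing executable: {tool}"
        for tool in REQUIRED_TOOLS
        if which(tool) is None
    ]
    if not has_user(REQUIRED_USER):
        problems.append(f"missing user account: {REQUIRED_USER}")
    return problems


if __name__ == "__main__":
    issues = missing_requirements()
    for issue in issues:
        print(issue)
    print("OK" if not issues else f"{len(issues)} problem(s) found")
```

On an Ubuntu-based AMI like the one in the question, the `ec2-user` check is the one most likely to fail, since Ubuntu images normally ship with an `ubuntu` account instead.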

Regarding "python - Trying to build an automation script on AWS Data Pipeline", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/28228066/
