I'm trying to run a step on Amazon Elastic MapReduce (Hadoop) using the AWS SDK for Ruby. I can create the cluster and the step, but the step always fails, even though it does not fail when I set it up manually through the web interface.
emr = Aws::EMR::Client.new
cluster_id = "*******"
resp = emr.add_job_flow_steps({
  job_flow_id: cluster_id, # required
  steps: [ # required
    {
      name: "TestStep", # required
      action_on_failure: "CANCEL_AND_WAIT", # accepts TERMINATE_JOB_FLOW, TERMINATE_CLUSTER, CANCEL_AND_WAIT, CONTINUE
      hadoop_jar_step: { # required
        jar: "command-runner.jar",
        args: [
          "-files",
          "s3://source123/mapper.py,s3://source123/source_reducer.py",
          "-mapper",
          "mapper.py",
          "-reducer",
          "source_reducer.py",
          "-input",
          "s3://source123/input/",
          "-output",
          "s3://source123/output/"
        ]
      },
    },
  ],
})
The error I get is this:
Cannot run program "-files" (in directory "."): error=2, No such file or directory
Any clues?
Best Answer
It seems that adding hadoop-streaming as the first argument makes it work, like this:
emr = Aws::EMR::Client.new
cluster_id = "*******"
resp = emr.add_job_flow_steps({
  job_flow_id: cluster_id, # required
  steps: [ # required
    {
      name: "TestStep", # required
      action_on_failure: "CANCEL_AND_WAIT", # accepts TERMINATE_JOB_FLOW, TERMINATE_CLUSTER, CANCEL_AND_WAIT, CONTINUE
      hadoop_jar_step: { # required
        jar: "command-runner.jar",
        args: [
          "hadoop-streaming",
          "-files",
          "s3://source123/mapper.py,s3://source123/source_reducer.py",
          "-mapper",
          "mapper.py",
          "-reducer",
          "source_reducer.py",
          "-input",
          "s3://source123/input/",
          "-output",
          "s3://source123/output/"
        ]
      },
    },
  ],
})
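The fix works because command-runner.jar executes its first argument as a command; in the original step, it tried to run "-files" as a program, which produced the "No such file or directory" error. Prepending "hadoop-streaming" makes the remaining flags arguments to the streaming command instead. A minimal sketch of building such an argument vector (the helper name streaming_args is hypothetical, not part of the AWS SDK):

```ruby
# Build the args array for a command-runner.jar streaming step.
# command-runner.jar runs args[0] as a command, so "hadoop-streaming"
# must come first; everything after it is passed to that command.
def streaming_args(mapper:, reducer:, files:, input:, output:)
  [
    "hadoop-streaming",
    "-files", files.join(","),
    "-mapper", mapper,
    "-reducer", reducer,
    "-input", input,
    "-output", output
  ]
end

args = streaming_args(
  mapper: "mapper.py",
  reducer: "source_reducer.py",
  files: ["s3://source123/mapper.py", "s3://source123/source_reducer.py"],
  input: "s3://source123/input/",
  output: "s3://source123/output/"
)
```

This array can then be dropped into hadoop_jar_step as shown above.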
Regarding ruby-on-rails - Cannot run program "-files" (in directory "."): error=2, No such file or directory, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36899395/