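javascript - Amazon EMR - Cannot run program "<path>/./mapper.js": error=2, No such file or directory

Tags: javascript node.js amazon-emr

I am running an Amazon EMR job with Node.js. I have already converted the files to UNIX line endings, but the job still fails. This is the error: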

2016-11-27 09:16:53,794 INFO [main] org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/mnt1/yarn/usercache/hadoop/appcache/application_1480232881564_0005/container_1480232881564_0005_01_000002/./mapper.js]
2016-11-27 09:16:53,803 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.work.output.dir is deprecated. Instead, use mapreduce.task.output.dir
2016-11-27 09:16:53,804 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.start is deprecated. Instead, use mapreduce.map.input.start
2016-11-27 09:16:53,804 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: job.local.dir is deprecated. Instead, use mapreduce.job.local.dir
2016-11-27 09:16:53,804 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
2016-11-27 09:16:53,805 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2016-11-27 09:16:53,805 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
2016-11-27 09:16:53,805 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir
2016-11-27 09:16:53,806 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.file is deprecated. Instead, use mapreduce.map.input.file
2016-11-27 09:16:53,806 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
2016-11-27 09:16:53,806 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.length is deprecated. Instead, use mapreduce.map.input.length
2016-11-27 09:16:53,806 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.cache.localFiles is deprecated. Instead, use mapreduce.job.cache.local.files
2016-11-27 09:16:53,807 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
2016-11-27 09:16:53,807 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
2016-11-27 09:16:53,816 ERROR [main] org.apache.hadoop.streaming.PipeMapRed: configuration exception
java.io.IOException: Cannot run program "/mnt1/yarn/usercache/hadoop/appcache/application_1480232881564_0005/container_1480232881564_0005_01_000002/./mapper.js": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)

This is my cluster command:

aws emr create-cluster \
  --auto-scaling-role EMR_AutoScaling_DefaultRole \
  --applications Name=Hadoop --bootstrap-actions '[{"Path":"s3://ccvikas/installNode.sh","Name":"Custom action"}]' \
  --ec2-attributes '{"InstanceProfile":"EMR_EC2_DefaultRole","SubnetId":"subnet-9db906c6","EmrManagedSlaveSecurityGroup":"sg-d9ee70a4","EmrManagedMasterSecurityGroup":"sg-deee70a3"}' \
  --service-role EMR_DefaultRole \
  --release-label emr-5.2.0 \
  --steps '[{"Args":["hadoop-streaming","-files","s3://ccvikas/js/mapper.js","-mapper","mapper.js","-reducer","mapper.js","-input","s3://commoncrawl/crawl-data/CC-MAIN-2016-40/segments/1474738659496.36/warc/CC-MAIN-20160924173739-00000-ip-10-143-35-109.ec2.internal.warc.gz","-output","s3://ccvikas/out8"],"Type":"CUSTOM_JAR","ActionOnFailure":"CANCEL_AND_WAIT","Jar":"command-runner.jar","Properties":"","Name":"Streaming program"}]' \
  --name 'My cluster' --instance-groups '[{"InstanceCount":1,"InstanceGroupType":"MASTER","InstanceType":"m1.xlarge","Name":"Master - 1"},{"InstanceCount":1,"InstanceGroupType":"CORE","InstanceType":"m1.xlarge","Name":"Core - 2"}]' \
  --scale-down-behavior TERMINATE_AT_INSTANCE_HOUR \
  --region us-east-1

This is my step command:

hadoop-streaming \
  -files s3://ccvikas/js/mapper.js,s3://ccvikas/js/reducer.js \
  -mapper mapper.js \
  -reducer reducer.js \
  -input s3://commoncrawl/crawl-data/CC-MAIN-2016-40/segments/1474738659496.36/warc/CC-MAIN-20160924173739-00000-ip-10-143-35-109.ec2.internal.warc.gz \
  -output s3://ccvikas/out

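Best answer

The problem was that my bootstrap action was not installing Node.js correctly, so the mapper's interpreter was missing on the cluster nodes. I changed the bootstrap action as follows to install the latest Node.js.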

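#!/bin/bash
# Bootstrap action: install Node.js on Amazon Linux (kernel release contains "amzn1.x86_64").
is_aml=$(uname -r | grep -c 'amzn1.x86_64')

# Compare the value of the variable; the original test `[ is_aml=1 ]` was always true.
if [ "$is_aml" -eq 1 ]; then

   # Register the NodeSource repository for Node.js 7.x.
   curl --silent --location https://rpm.nodesource.com/setup_7.x | sudo bash -

   sudo yum -y install nodejs

else
   echo "Unsupported OS"
   exit 1
fi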

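Other possible causes of this kind of error:
- the mapper and reducer files do not start with a correct shebang line, and
- the mapper and reducer files were saved on Windows with Windows (CRLF) line endings; converting them to UNIX line endings fixes the problem.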
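
Both of these causes can be checked locally before uploading the scripts to S3. The following is only a minimal sketch in Node.js, not something from the original answer: `diagnoseScript` and `toUnixLineEndings` are hypothetical helper names, and the sample script text is made up for illustration. The idea is that an `error=2` on a file that does exist usually points at the interpreter named in its shebang line, so the sketch looks at that line and at the line endings.

```javascript
'use strict';

// Hypothetical helper (not part of Hadoop or EMR): inspects the text of a
// streaming script and lists problems that commonly break launching it.
function diagnoseScript(source) {
  const problems = [];
  const firstLine = source.split('\n', 1)[0];

  if (!firstLine.startsWith('#!')) {
    problems.push('missing shebang line');
  } else if (firstLine.endsWith('\r')) {
    // The carriage return becomes part of the interpreter path or its argument.
    problems.push('shebang line ends with a carriage return');
  }

  if (source.includes('\r\n')) {
    problems.push('file uses Windows (CRLF) line endings');
  }
  return problems;
}

// Converts CRLF line endings to LF, similar in effect to running dos2unix.
function toUnixLineEndings(source) {
  return source.replace(/\r\n/g, '\n');
}

// Made-up sample: a mapper saved with Windows line endings.
const windowsMapper = '#!/usr/bin/env node\r\nconsole.log("x");\r\n';
console.log(diagnoseScript(windowsMapper));
console.log(diagnoseScript(toUnixLineEndings(windowsMapper)));
```

In practice the text would come from `fs.readFileSync('mapper.js', 'utf8')`. A clean result from this check does not rule out the cause found in this question, namely that the interpreter itself was never installed on the nodes.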

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/40828052/
