hadoop - Unable to run a Spark step in EMR

Tags: hadoop apache-spark emr

I would be grateful for any insight.

I'm having trouble running a word-count MapReduce job as a Spark step in Amazon EMR. However, I can ssh into the master node and run the word-count logic in spark-shell without any problem.

The error says that __spark_conf_xx.zip does not exist on the master's HDFS, even though the upload reported no error:

16/04/05 07:20:21 INFO yarn.Client: Uploading resource file:/mnt/tmp/spark-1d701ab0-7990-4ca2-bee2-099aed8e8e6b/__spark_conf__9006968814682693730.zip -> hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/__spark_conf__9006968814682693730.zip

The full log:

16/04/05 07:20:16 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-26-247.ap-northeast-1.compute.internal/172.31.26.247:8032
16/04/05 07:20:16 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
16/04/05 07:20:16 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (11520 MB per container)
16/04/05 07:20:16 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/04/05 07:20:16 INFO yarn.Client: Setting up container launch context for our AM
16/04/05 07:20:16 INFO yarn.Client: Setting up the launch environment for our AM container
16/04/05 07:20:16 INFO yarn.Client: Preparing resources for our AM container
16/04/05 07:20:17 INFO yarn.Client: Uploading resource file:/usr/lib/spark/lib/spark-assembly-1.6.1-hadoop2.7.2-amzn-0.jar -> hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/spark-assembly-1.6.1-hadoop2.7.2-amzn-0.jar
16/04/05 07:20:18 INFO metrics.MetricsSaver: MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: false maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1459839695291 
16/04/05 07:20:18 INFO metrics.MetricsSaver: Created MetricsSaver j-3AZL0AH5ALBBL:i-96753119:SparkSubmit:11699 period:60 /mnt/var/em/raw/i-96753119_20160405_SparkSubmit_11699_raw.bin
16/04/05 07:20:19 INFO metrics.MetricsSaver: 1 aggregated HDFSWriteDelay 2327 raw values into 1 aggregated values, total 1
16/04/05 07:20:20 INFO fs.EmrFileSystem: Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as filesystem implementation
16/04/05 07:20:20 INFO yarn.Client: Uploading resource s3://gda-test/logic/wordCount.jar -> hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/wordCount.jar
16/04/05 07:20:20 INFO s3n.S3NativeFileSystem: Opening 's3://gda-test/logic/wordCount.jar' for reading
16/04/05 07:20:20 INFO metrics.MetricsSaver: Thread 1 created MetricsLockFreeSaver 1
16/04/05 07:20:21 INFO metrics.MetricsSaver: 1 MetricsLockFreeSaver 1 comitted 33 matured S3ReadDelay values
16/04/05 07:20:21 INFO yarn.Client: Uploading resource file:/mnt/tmp/spark-1d701ab0-7990-4ca2-bee2-099aed8e8e6b/__spark_conf__9006968814682693730.zip -> hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/__spark_conf__9006968814682693730.zip
16/04/05 07:20:21 INFO spark.SecurityManager: Changing view acls to: hadoop
16/04/05 07:20:21 INFO spark.SecurityManager: Changing modify acls to: hadoop
16/04/05 07:20:21 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
16/04/05 07:20:21 INFO yarn.Client: Submitting application 1 to ResourceManager
16/04/05 07:20:21 INFO impl.YarnClientImpl: Submitted application application_1459839685827_0001
16/04/05 07:20:22 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:22 INFO yarn.Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1459840821323
     final status: UNDEFINED
     tracking URL: http://ip-172-31-26-247.ap-northeast-1.compute.internal:20888/proxy/application_1459839685827_0001/
     user: hadoop
16/04/05 07:20:23 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:24 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:25 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:26 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:27 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:28 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:29 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:30 INFO yarn.Client: Application report for application_1459839685827_0001 (state: ACCEPTED)
16/04/05 07:20:31 INFO yarn.Client: Application report for application_1459839685827_0001 (state: FAILED)
16/04/05 07:20:31 INFO yarn.Client: 
     client token: N/A
     diagnostics: Application application_1459839685827_0001 failed 2 times due to AM Container for appattempt_1459839685827_0001_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://ip-172-31-26-247.ap-northeast-1.compute.internal:8088/cluster/app/application_1459839685827_0001Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/__spark_conf__9006968814682693730.zip
java.io.FileNotFoundException: File does not exist: hdfs://ip-172-31-26-247.ap-northeast-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1459839685827_0001/__spark_conf__9006968814682693730.zip
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1459840821323
     final status: FAILED
     tracking URL: http://ip-172-31-26-247.ap-northeast-1.compute.internal:8088/cluster/app/application_1459839685827_0001
     user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1459839685827_0001 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/05 07:20:31 INFO util.ShutdownHookManager: Shutdown hook called
16/04/05 07:20:31 INFO util.ShutdownHookManager: Deleting directory /mnt/tmp/spark-1d701ab0-7990-4ca2-bee2-099aed8e8e6b
Command exiting with ret '1'

Best Answer

I found the solution.

It was caused by a Java version mismatch: my logic and jar were built with Java 8, while the EMR cluster uses Java 7 by default.
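One way to confirm a mismatch like this (a sketch of my own, not from the original post) is to read the class-file major version out of the jar and compare it with `java -version` on the master node. Major version 52 corresponds to Java 8 and 51 to Java 7; the jar path below is a placeholder:

```python
import struct
import zipfile

def class_major_version(data: bytes) -> int:
    """Return the class-file major version: 52 = Java 8, 51 = Java 7."""
    # Class files start with magic 0xCAFEBABE, then minor and major version,
    # all big-endian (see the JVM class file format specification).
    magic, _minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major

def jar_major_version(jar_path: str) -> int:
    """Read the first .class entry in a jar and report its major version."""
    with zipfile.ZipFile(jar_path) as jar:
        name = next(n for n in jar.namelist() if n.endswith(".class"))
        return class_major_version(jar.read(name))
```

Running `jar_major_version("wordCount.jar")` on the jar from the failing step would report 52 for a Java 8 build, while the cluster JVM on that EMR release defaults to 1.7.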

In my Spark & Hadoop case, I needed to customize the environment with the Advanced Options when creating the cluster, as described here: http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-configure-apps.html#configuring-java8
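The linked AWS page sets JAVA_HOME through application configuration classifications. A minimal sketch of the configuration JSON, adapted from that page (the JVM path assumes the stock Amazon Linux image on that EMR release):

```json
[
  {
    "Classification": "hadoop-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
        }
      }
    ]
  },
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
        }
      }
    ]
  }
]
```

This JSON can be pasted into the "Edit software settings" box under Advanced Options in the console, or passed via `aws emr create-cluster --configurations`.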

I hope this information is useful to anyone who runs into the same problem.

Regarding "hadoop - Unable to run a Spark step in EMR", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36421761/
