java - Apache Spark 2.3.0 launched with master as yarn fails with the error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

Tags: java apache-spark hadoop hadoop2 hadoop2.7.3

I have installed Apache Hadoop 2.7.5 and Apache Spark 2.3.0.
When I submit a job with --master local[*], it runs fine. But when I run it with --master yarn, the error in the web UI logs says:

Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

Here is the command I ran:
spark-submit --class com.spark.SparkTest --master yarn --deploy-mode cluster /root/Downloads/SimpleSpark-0.0.1-SNAPSHOT.jar

The console shows:
[root@localhost sbin]# spark-submit --class com.spark.SparkTest --master yarn --deploy-mode cluster /root/Downloads/SimpleSpark-0.0.1-SNAPSHOT.jar
2018-05-12 17:24:37 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-05-12 17:24:39 INFO  RMProxy:98 - Connecting to ResourceManager at /0.0.0.0:8032
2018-05-12 17:24:40 INFO  Client:54 - Requesting a new application from cluster with 1 NodeManagers
2018-05-12 17:24:40 INFO  Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
2018-05-12 17:24:40 INFO  Client:54 - Will allocate AM container, with 1408 MB memory including 384 MB overhead
2018-05-12 17:24:40 INFO  Client:54 - Setting up container launch context for our AM
2018-05-12 17:24:40 INFO  Client:54 - Setting up the launch environment for our AM container
2018-05-12 17:24:40 INFO  Client:54 - Preparing resources for our AM container
2018-05-12 17:24:43 INFO  Client:54 - Uploading resource file:/opt/spark-2.3.0/yarn/spark-2.3.0-yarn-shuffle.jar -> hdfs://localhost:9000/user/root/.sparkStaging/application_1526143826498_0001/spark-2.3.0-yarn-shuffle.jar
2018-05-12 17:24:45 INFO  Client:54 - Uploading resource file:/root/Downloads/SimpleSpark-0.0.1-SNAPSHOT.jar -> hdfs://localhost:9000/user/root/.sparkStaging/application_1526143826498_0001/SimpleSpark-0.0.1-SNAPSHOT.jar
2018-05-12 17:24:45 WARN  DFSClient:611 - Caught exception
java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1252)
        at java.lang.Thread.join(Thread.java:1326)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:609)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:370)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:546)
2018-05-12 17:24:45 WARN  Client:66 - Same name resource file:/opt/spark-2.3.0/yarn/spark-2.3.0-yarn-shuffle.jar added multiple times to distributed cache
2018-05-12 17:24:45 INFO  Client:54 - Uploading resource file:/tmp/spark-6db13382-d02d-4e8a-b5bf-5aafd535ba1e/__spark_conf__789951835863303071.zip -> hdfs://localhost:9000/user/root/.sparkStaging/application_1526143826498_0001/__spark_conf__.zip
2018-05-12 17:24:46 WARN  DFSClient:611 - Caught exception
java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1252)
        at java.lang.Thread.join(Thread.java:1326)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:609)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:370)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:546)
2018-05-12 17:24:46 INFO  SecurityManager:54 - Changing view acls to: root
2018-05-12 17:24:46 INFO  SecurityManager:54 - Changing modify acls to: root
2018-05-12 17:24:46 INFO  SecurityManager:54 - Changing view acls groups to:
2018-05-12 17:24:46 INFO  SecurityManager:54 - Changing modify acls groups to:
2018-05-12 17:24:46 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2018-05-12 17:24:46 INFO  Client:54 - Submitting application application_1526143826498_0001 to ResourceManager
2018-05-12 17:24:46 INFO  YarnClientImpl:273 - Submitted application application_1526143826498_0001
2018-05-12 17:24:47 INFO  Client:54 - Application report for application_1526143826498_0001 (state: ACCEPTED)
2018-05-12 17:24:47 INFO  Client:54 -
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1526145886541
         final status: UNDEFINED
         tracking URL: http://localhost.localdomain:8088/proxy/application_1526143826498_0001/
         user: root
2018-05-12 17:24:48 INFO  Client:54 - Application report for application_1526143826498_0001 (state: ACCEPTED)
2018-05-12 17:24:49 INFO  Client:54 - Application report for application_1526143826498_0001 (state: ACCEPTED)
2018-05-12 17:24:50 INFO  Client:54 - Application report for application_1526143826498_0001 (state: ACCEPTED)
2018-05-12 17:24:51 INFO  Client:54 - Application report for application_1526143826498_0001 (state: ACCEPTED)
2018-05-12 17:24:52 INFO  Client:54 - Application report for application_1526143826498_0001 (state: ACCEPTED)
2018-05-12 17:24:53 INFO  Client:54 - Application report for application_1526143826498_0001 (state: ACCEPTED)
2018-05-12 17:24:54 INFO  Client:54 - Application report for application_1526143826498_0001 (state: FAILED)
2018-05-12 17:24:54 INFO  Client:54 -
         client token: N/A
         diagnostics: Application application_1526143826498_0001 failed 2 times due to AM Container for appattempt_1526143826498_0001_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://localhost.localdomain:8088/cluster/app/application_1526143826498_0001Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1526143826498_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
        at org.apache.hadoop.util.Shell.run(Shell.java:482)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1526145886541
         final status: FAILED
         tracking URL: http://localhost.localdomain:8088/cluster/app/application_1526143826498_0001
         user: root
2018-05-12 17:24:54 INFO  Client:54 - Deleted staging directory hdfs://localhost:9000/user/root/.sparkStaging/application_1526143826498_0001
Exception in thread "main" org.apache.spark.SparkException: Application application_1526143826498_0001 finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1159)
        at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1518)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-05-12 17:24:55 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-05-12 17:24:55 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-6db13382-d02d-4e8a-b5bf-5aafd535ba1e
2018-05-12 17:24:55 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-1218ca67-7fae-4c0b-b678-002963a1cf08

The diagnostics are:
Application application_1526143826498_0001 failed 2 times due to AM Container for appattempt_1526143826498_0001_000002 exited with exitCode: 1
For more detailed output, check application tracking page: http://localhost.localdomain:8088/cluster/app/application_1526143826498_0001 Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1526143826498_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1: (same trace as in the console output above)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

When I click through to the logs for details, I see:
Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
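
For reference, the same container logs can be pulled from the command line with the standard YARN CLI, using the application ID from the output above (provided YARN log aggregation is enabled):

yarn logs -applicationId application_1526143826498_0001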

Here is my spark-defaults.conf:
spark.master                     spark://localhost.localdomain:7077
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://localhost.localdomain:8021/user/spark/logs
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.driver.memory              1g
spark.executor.memory            1g
spark.yarn.dist.jars             /opt/spark-2.3.0/yarn/spark-2.3.0-yarn-shuffle.jar
spark.yarn.jars                  /opt/spark-2.3.0/yarn/spark-2.3.0-yarn-shuffle.jar
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
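
Note that spark.yarn.jars is documented to take the full list of jars containing Spark code to ship to YARN containers, with globs allowed, rather than a single jar. A minimal sketch of what it usually looks like, assuming the standard Spark 2.3.0 layout under /opt/spark-2.3.0/jars (the commented HDFS variant is a hypothetical path and assumes the jars were uploaded there beforehand):

# every jar from the Spark install, including spark-yarn_2.11-2.3.0.jar
spark.yarn.jars    local:/opt/spark-2.3.0/jars/*
# hypothetical HDFS variant, avoids re-uploading the jars on every submit:
# spark.yarn.jars  hdfs://localhost:9000/spark/jars/*.jar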

My spark-env.sh:
SPARK_MASTER_HOST=localhost.localdomain
SPARK_MASTER_PORT=7077
SPARK_LOCAL_IP=localhost.localdomain
SPARK_CONF_DIR=${SPARK_HOME}/conf
HADOOP_CONF_DIR=/opt/hadoop-2.7.5/etc/hadoop
YARN_CONF_DIR=/opt/hadoop-2.7.5/etc/hadoop
SPARK_EXECUTOR_CORES=2
SPARK_EXECUTOR_MEMORY=500M
SPARK_DRIVER_MEMORY=500M
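
Since --master yarn resolves the cluster from HADOOP_CONF_DIR / YARN_CONF_DIR, a quick sanity check that the client is reading the intended files (paths taken from the spark-env.sh above):

ls /opt/hadoop-2.7.5/etc/hadoop/core-site.xml /opt/hadoop-2.7.5/etc/hadoop/yarn-site.xml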

And my yarn-site.xml:
<configuration>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.nodemanager.vmem-check-enabled</name>
                <value>false</value>
        </property>
        <property>
                <name>yarn.application.classpath</name>
                <value>
                /opt/hadoop-2.7.5/etc/hadoop,
                /opt/hadoop-2.7.5/*,
                /opt/hadoop-2.7.5/lib/*,
                /opt/hadoop-2.7.5/share/hadoop/common/*,
/opt/hadoop-2.7.5/share/hadoop/common/lib/*,
                /opt/hadoop-2.7.5/share/hadoop/hdfs/*,
                /opt/hadoop-2.7.5/share/hadoop/hdfs/lib/*,
                /opt/hadoop-2.7.5/share/hadoop/mapreduce/*,
                /opt/hadoop-2.7.5/share/hadoop/mapreduce/lib/*,
                /opt/hadoop-2.7.5/share/hadoop/tools/lib/*,
                /opt/hadoop-2.7.5/share/hadoop/yarn/*,
                /opt/hadoop-2.7.5/share/hadoop/yarn/lib/*
                </value>
        </property>
</configuration>
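
Each <value> entry in yarn.application.classpath must be comma-separated for YARN to split it correctly. A quick check that the wildcard directories actually contain jars on this node (path taken from the list above):

ls /opt/hadoop-2.7.5/share/hadoop/yarn/*.jar | head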

I have copied spark-yarn_2.11-2.3.0.jar to /opt/hadoop-2.7.5/share/hadoop/yarn/.
I have gone through several Stack Overflow solutions that mention passing --conf "spark.driver.extraJavaOptions=-Diop.version=4.1.0.0", but that did not apply to my case.
Some solutions say a logging jar is missing, but I am not sure which jar; one way to check is sketched below.
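
A hypothetical way to find which jar provides the missing class, assuming the Spark 2.3.0 install lives under /opt/spark-2.3.0 and the JDK's jar tool is on PATH:

# scan every Spark jar for the ApplicationMaster class (illustrative sketch)
for j in /opt/spark-2.3.0/jars/*.jar; do
  jar tf "$j" | grep -q 'org/apache/spark/deploy/yarn/ApplicationMaster' && echo "$j"
done
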
Am I missing any configuration?

Best Answer

Try running the command with the extra jars added via --jars:

spark-submit --class com.spark.SparkTest --master yarn  --jars /fullpath/first.jar,/fullpath/second.jar --deploy-mode cluster /root/Downloads/SimpleSpark-0.0.1-SNAPSHOT.jar 

Or add the jars in conf/spark-defaults.conf by adding lines like these:
spark.driver.extraClassPath /fullpath/yarn-jar.jar:/fullpath/second.jar
spark.executor.extraClassPath /fullpath/yarn-jar.jar:/fullpath/second.jar
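
Putting the two together for this question: once a jar containing ApplicationMaster has been located (the path below is an assumption based on the standard Spark 2.3.0 layout, where spark-yarn_2.11-2.3.0.jar lives under /opt/spark-2.3.0/jars), it can be shipped with --jars:

spark-submit --class com.spark.SparkTest --master yarn --deploy-mode cluster \
  --jars /opt/spark-2.3.0/jars/spark-yarn_2.11-2.3.0.jar \
  /root/Downloads/SimpleSpark-0.0.1-SNAPSHOT.jar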

Regarding "java - Apache Spark 2.3.0 launched with master as yarn fails with the error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50309511/
