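I am new to Spark. I searched but could not find a suitable solution. I have installed hadoop 2.7.2 on two machines (one master node and one worker node). I set up the cluster by following this link: http://javadev.org/docs/hadoop/centos/6/installation/multi-node-installation-on-centos-6-non-sucure-mode/ I am running the hadoop and Spark applications as the root user to test the cluster.
I have installed Spark on the master node, and Spark starts without any errors. However, when I submit a job using Spark Submit, I get a File Not Found exception even though the file exists on the master node at the same location given in the error. I am executing the Spark Submit command below; the log output follows the command.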
/bin/spark-submit --class com.test.Engine --master yarn --deploy-mode cluster /app/spark-test.jar
16/04/21 19:16:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/21 19:16:13 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/21 19:16:14 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/04/21 19:16:14 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/04/21 19:16:14 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/04/21 19:16:14 INFO Client: Setting up container launch context for our AM
16/04/21 19:16:14 INFO Client: Setting up the launch environment for our AM container
16/04/21 19:16:14 INFO Client: Preparing resources for our AM container
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/mi/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/app/spark-test.jar
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-120aeddc-0f87-4411-9400-22ba01096249/__spark_conf__5619348744221830008.zip
16/04/21 19:16:14 INFO SecurityManager: Changing view acls to: root
16/04/21 19:16:14 INFO SecurityManager: Changing modify acls to: root
16/04/21 19:16:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/04/21 19:16:15 INFO Client: Submitting application 1 to ResourceManager
16/04/21 19:16:15 INFO YarnClientImpl: Submitted application application_1461246306015_0001
16/04/21 19:16:16 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:16 INFO Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461246375622
     final status: UNDEFINED
     tracking URL: http://sparkcluster01.testing.com:8088/proxy/application_1461246306015_0001/
     user: root
16/04/21 19:16:17 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:18 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:19 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:20 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:21 INFO Client: Application report for application_1461246306015_0001 (state: FAILED)
16/04/21 19:16:21 INFO Client:
     client token: N/A
     diagnostics: Application application_1461246306015_0001 failed 2 times due to AM Container for appattempt_1461246306015_0001_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://sparkcluster01.testing.com:8088/cluster/app/application_1461246306015_0001Then, click on links to logs of each attempt.
Diagnostics: java.io.FileNotFoundException: File file:/app/spark-test.jar does not exist
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461246375622
     final status: FAILED
     tracking URL: http://sparkcluster01.testing.com:8088/cluster/app/application_1461246306015_0001
     user: root
Exception in thread "main" org.apache.spark.SparkException: Application application_1461246306015_0001 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I also tried running Spark against HDFS by placing my application jar on HDFS and giving the HDFS path in the Spark Submit command. Even then it throws a File Not Found exception, this time on a Spark conf file. I am executing the Spark Submit command below; the log output follows the command.
./bin/spark-submit --class com.test.Engine --master yarn --deploy-mode cluster hdfs://sparkcluster01.testing.com:9000/beacon/job/spark-test.jar
16/04/21 18:11:45 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/21 18:11:46 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/04/21 18:11:46 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/04/21 18:11:46 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/04/21 18:11:46 INFO Client: Setting up container launch context for our AM
16/04/21 18:11:46 INFO Client: Setting up the launch environment for our AM container
16/04/21 18:11:46 INFO Client: Preparing resources for our AM container
16/04/21 18:11:46 INFO Client: Source and destination file systems are the same. Not copying file:/mi/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar
16/04/21 18:11:47 INFO Client: Uploading resource hdfs://sparkcluster01.testing.com:9000/beacon/job/spark-test.jar -> file:/root/.sparkStaging/application_1461234217994_0017/spark-test.jar
16/04/21 18:11:49 INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip
16/04/21 18:11:50 INFO SecurityManager: Changing view acls to: root
16/04/21 18:11:50 INFO SecurityManager: Changing modify acls to: root
16/04/21 18:11:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/04/21 18:11:50 INFO Client: Submitting application 17 to ResourceManager
16/04/21 18:11:50 INFO YarnClientImpl: Submitted application application_1461234217994_0017
16/04/21 18:11:51 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:51 INFO Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461242510849
     final status: UNDEFINED
     tracking URL: http://sparkcluster01.testing.com:8088/proxy/application_1461234217994_0017/
     user: root
16/04/21 18:11:52 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:53 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:54 INFO Client: Application report for application_1461234217994_0017 (state: FAILED)
16/04/21 18:11:54 INFO Client:
     client token: N/A
     diagnostics: Application application_1461234217994_0017 failed 2 times due to AM Container for appattempt_1461234217994_0017_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://sparkcluster01.testing.com:8088/cluster/app/application_1461234217994_0017Then, click on links to logs of each attempt.
Diagnostics: File file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip does not exist
java.io.FileNotFoundException: File file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461242510849
     final status: FAILED
     tracking URL: http://sparkcluster01.testing.com:8088/cluster/app/application_1461234217994_0017
     user: root
Exception in thread "main" org.apache.spark.SparkException: Application application_1461234217994_0017 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/21 18:11:55 INFO ShutdownHookManager: Shutdown hook called
16/04/21 18:11:55 INFO ShutdownHookManager: Deleting directory /tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21
Best answer
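The Spark configuration was not pointing to the correct hadoop configuration directory. For hadoop 2.7.2, the configuration lives under hadoop2.7.2/etc/hadoop/ rather than /root/hadoop2.7.2/conf. When I set HADOOP_CONF_DIR=/root/hadoop2.7.2/etc/hadoop/ in spark-env.sh, spark submit started working and the File Not Found exception disappeared. Earlier it was pointing to /root/hadoop2.7.2/conf (which does not exist). If Spark is not pointed at the correct hadoop configuration directory, it may produce similar errors. I think this is probably a bug in Spark: it should handle this case gracefully instead of throwing ambiguous error messages.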
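Since the failure only shows up after submission, it can help to check the directory before running spark-submit. Below is a minimal sketch of such a check in Python. It is not part of Spark or Hadoop; the function names and the list of required files are my own assumptions. It verifies that the directory named by HADOOP_CONF_DIR exists and contains core-site.xml and yarn-site.xml, then prints fs.defaultFS so you can see whether the client would resolve paths against HDFS. In the logs above the staging paths resolve to file:/..., which is consistent with the cluster configuration not having been loaded.

```python
import os
import xml.etree.ElementTree as ET

# Assumption: these two files are the minimum needed for a YARN submit.
REQUIRED_FILES = ("core-site.xml", "yarn-site.xml")


def check_hadoop_conf_dir(conf_dir):
    """Return a list of problems; an empty list means the directory looks usable."""
    if not conf_dir:
        return ["HADOOP_CONF_DIR is not set"]
    if not os.path.isdir(conf_dir):
        return [f"{conf_dir} is not a directory"]
    problems = []
    for name in REQUIRED_FILES:
        if not os.path.isfile(os.path.join(conf_dir, name)):
            problems.append(f"{name} is missing from {conf_dir}")
    return problems


def read_default_fs(conf_dir):
    """Return the fs.defaultFS value from core-site.xml, or None if it is not set."""
    root = ET.parse(os.path.join(conf_dir, "core-site.xml")).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name") == "fs.defaultFS":
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    conf_dir = os.environ.get("HADOOP_CONF_DIR", "")
    problems = check_hadoop_conf_dir(conf_dir)
    for problem in problems:
        print("problem:", problem)
    if not problems:
        print("fs.defaultFS =", read_default_fs(conf_dir))
```

With the original setting (/root/hadoop2.7.2/conf) this would report that the path is not a directory. Note that the script reads HADOOP_CONF_DIR from the environment, so if the variable is only set in spark-env.sh you would need to export it in your shell first.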
Regarding hadoop - Spark job on a YARN cluster throws java.io.FileNotFoundException: File does not exist, even though the file exists on the master node: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36753546/