My Spark job is failing because the user does not have permission to access the directory where Spark tries to write its staging/temporary data:
2017-03-10 10:25:47,0928 ERROR JniCommon fs/client/fileclient/cc/jni_MapRClient.cc:2072 Thread: 26413 mkdirs failed for /user/cxpdiprod/.sparkStaging/application_1488190062017_14041, error 13
Exception in thread "main" org.apache.hadoop.security.AccessControlException: User cxpdiprod(user id 99871) has been denied access to create application_1488190062017_14041
	at com.mapr.fs.MapRFileSystem.makeDir(MapRFileSystem.java:1250)
	at com.mapr.fs.MapRFileSystem.mkdirs(MapRFileSystem.java:1270)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1913)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:634)
	at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:356)
	at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
	at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1021)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:742)
The user "cxpdiprod" can access other directories in the cluster, e.g. /mapr/ui/abc. Is there a property that lets me set a different directory for temporary and staging files?
Best Answer
Add the property spark.yarn.stagingDir to spark-defaults.conf, set to the desired staging location. By default, the staging location is the current user's home directory in the HDFS filesystem, /user/username/.
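For example, a minimal sketch of the change (the path maprfs:///tmp/spark-staging is a hypothetical example; it must point to a directory the submitting user can actually write to):

```properties
# spark-defaults.conf
# Hypothetical staging path -- replace with a directory writable by the job's user
spark.yarn.stagingDir    maprfs:///tmp/spark-staging
```

The same property can also be set per job on the command line, e.g. `spark-submit --conf spark.yarn.stagingDir=maprfs:///tmp/spark-staging ...`, which avoids editing the cluster-wide defaults.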
A similar question, "hadoop - How to set the Spark job staging location", was found on Stack Overflow: https://stackoverflow.com/questions/42723604/