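(I have since solved this by adding the dependency listed at the end of this post, but I would like to know whether there is a better option or whether I am missing something important.)
When trying to run a MapReduce job, the line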
JobClient.runJob(conf)
produces the following stack trace:
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:119)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:81)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:74)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:465)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:444)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:826)
My setup is as follows:
public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(Reduce.class);
    conf.set("mapreduce.framework.name", "yarn");
    conf.set("mapreduce.jobhistory.address", "s17.myserver.com:10020");
    conf.set("mapreduce.jobhistory.webapp.address", "s17.myserver.com:19888");
    conf.set("yarn.resourcemanager.address", "s6.myserver.com:8032");
    conf.set("yarn.resourcemanager.scheduler.address", "s6.myserver.com:8030");
    conf.set("yarn.resourcemanager.resource-tracker.address", "s6.myserver.com:8031");
    conf.set("yarn.resourcemanager.admin.address", "s6.myserver.com:8033");
    conf.set("yarn.resourcemanager.webapp.address", "s6.myserver.com:8088");
    // error on the following line
    JobClient.runJob(conf);
}
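After spending a lot of time checking and re-checking my configuration, I managed to solve the problem by adding the following dependency to my project: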
hadoop-mapreduce-client-jobclient
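Am I missing something, or is the error message particularly misleading?
Best answer
This kind of configuration should be done by your cluster administrator and provided as part of yarn-site. It should not need to be added by every job. That said, the error message is not particularly helpful and could be improved, but that is true of almost every error message in Hadoop...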
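A likely explanation for why adding hadoop-mapreduce-client-jobclient made the error go away: as far as I can tell from the Hadoop 2.x sources, Cluster.initialize discovers ClientProtocolProvider implementations through java.util.ServiceLoader, and the YARN one (org.apache.hadoop.mapred.YarnClientProtocolProvider) ships in that artifact. If no provider on the classpath accepts mapreduce.framework.name=yarn, the same "Cannot initialize Cluster" exception is thrown no matter how correct the addresses are. The sketch below is a small classpath check under that assumption; the ProviderCheck class and isOnClasspath helper are hypothetical names, and the provider class names should be verified against your Hadoop version.

```java
// Hypothetical diagnostic: reports whether the MapReduce client protocol
// providers can be loaded from the current classpath. Uses only the JDK,
// so it also runs (and reports MISSING) when Hadoop is not present.
public class ProviderCheck {

    // Returns true if the named class can be loaded without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ProviderCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class names, taken from the Hadoop 2.x source tree.
        String[] providers = {
            "org.apache.hadoop.mapred.YarnClientProtocolProvider",  // expected in hadoop-mapreduce-client-jobclient
            "org.apache.hadoop.mapred.LocalClientProtocolProvider"  // expected in hadoop-mapreduce-client-common
        };
        for (String provider : providers) {
            String status = isOnClasspath(provider) ? "found" : "MISSING";
            System.out.println(provider + " -> " + status);
        }
    }
}
```

Running this with the same classpath as the job should show whether the YARN provider is missing before JobClient.runJob is ever called.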
Source: Stack Overflow, "java - Hadoop JobClient.runJob : Cannot initialize cluster - Misleading error message(?) and suggested solution": https://stackoverflow.com/questions/20659417/