hadoop - Apache Giraph - Cannot run in split master / worker mode since there is only 1 task at a time

Tags: hadoop mapreduce giraph

I am running the PageRank Benchmark example with Giraph 1.0.0 and Hadoop 2.2.0.

The run fails with this error:

Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, must have only one worker since only 1 task at a time!
    at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:151)
    at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:225)
    at org.apache.giraph.benchmark.GiraphBenchmark.run(GiraphBenchmark.java:90)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.giraph.benchmark.PageRankBenchmark.main(PageRankBenchmark.java:71)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

When I change the number of workers to 1, I get:

Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run in split master / worker mode since there is only 1 task at a time!
    at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:157)
    at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:225)
    at org.apache.giraph.benchmark.GiraphBenchmark.run(GiraphBenchmark.java:90)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.giraph.benchmark.PageRankBenchmark.main(PageRankBenchmark.java:71)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Is there a way to fix this?

Best answer

Hi, I assume you are not running on a cluster? I get the same error when I run in our demo VM.

You can disable the split master / worker behavior in giraph-site.xml:

giraph.SplitMasterWorker=false
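
As a rough sketch of what that setting might look like on disk, the script below writes a minimal giraph-site.xml. It assumes the file uses the usual Hadoop-style property layout (a configuration element containing name/value pairs). The GIRAPH_CONF_DIR variable and its ./giraph-conf default are hypothetical; point it at whichever configuration directory your installation actually reads.

```shell
#!/bin/bash
# Hypothetical location: replace with the conf directory your Giraph setup reads.
GIRAPH_CONF_DIR="${GIRAPH_CONF_DIR:-./giraph-conf}"
SITE_FILE="$GIRAPH_CONF_DIR/giraph-site.xml"

mkdir -p "$GIRAPH_CONF_DIR"

if [ -e "$SITE_FILE" ]; then
  # Do not clobber an existing configuration; merge the property by hand instead.
  echo "$SITE_FILE already exists, add the property manually" >&2
else
  # Assumed Hadoop-style layout: one <property> per setting.
  cat > "$SITE_FILE" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>giraph.SplitMasterWorker</name>
    <value>false</value>
  </property>
</configuration>
EOF
fi
```

If a giraph-site.xml is already present, the script leaves it untouched, so the property has to be added to the existing configuration element by hand.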

If you only want to disable it for a single run, you can also pass it to your program as a command-line argument:

-ca giraph.SplitMasterWorker=false

For example, I ran a demo like this for my big data lecture:

#!/bin/bash

yarn jar /root/giraph-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  at.jku.tk.steinbauer.bigdata.giraph.MaxInDegreeComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/hue/graph/tinygraph.txt \
  -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hue/graph/degree \
  -w 1 \
  -ca giraph.SplitMasterWorker=false

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/26175116/
