java - How to access JobCounters and FileSystemCounters in Hadoop 2.6?

Tags: java hadoop mapreduce counter

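In the Reducer of my MapReduce program, I want to read the JobCounter and FileSystemCounter values. When I run the command mapred job -status <job id>, the counters I need are listed by their display names: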

...
File System Counters
    FILE: Number of bytes read=148874
    FILE: Number of bytes written=22010065
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=135823
    HDFS: Number of bytes written=44423504133
    HDFS: Number of read operations=2185
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=1316
Job Counters 
    Launched map tasks=1
    Launched reduce tasks=200
    Rack-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=5293
    Total time spent by all reduces in occupied slots (ms)=972893
    Total time spent by all map tasks (ms)=5293
    Total time spent by all reduce tasks (ms)=972893
    Total vcore-seconds taken by all map tasks=5293
    Total vcore-seconds taken by all reduce tasks=972893
    Total megabyte-seconds taken by all map tasks=5420032
    Total megabyte-seconds taken by all reduce tasks=996242432
...

How can I access these counters at runtime from within the Reducer code?

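Searching Google, I could not find any useful information on how to access these counters. A direct attempt with Context.getCounter(String groupName, String counterName) fails to retrieve a Counter instance, so a NullPointerException is thrown when getValue() is called: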
long bytes = context.getCounter(
    FileSystemCounter.class.getName(),
    FileSystemCounter.BYTES_WRITTEN.name()
).getValue();
long milliseconds = context.getCounter(
    JobCounter.class.getName(),
    JobCounter.MILLIS_REDUCES.name()
).getValue();

Best answer

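import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.CounterGroup;
import org.apache.hadoop.mapreduce.Counters;

// "job" is assumed to be the org.apache.hadoop.mapreduce.Job instance for the job.
Counters counters = job.getCounters();

// Walk every counter group and print each counter's display name, internal name, and value.
for (CounterGroup group : counters) {
    System.out.println("* Counter Group: " + group.getDisplayName() + " (" + group.getName() + ")");
    System.out.println("  number of counters in this group: " + group.size());
    for (Counter counter : group) {
        System.out.println("  - " + counter.getDisplayName() + ": " + counter.getName() + ": " + counter.getValue());
    }
}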

I think this will help print all counters along with their values.
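When only the text printed by mapred job -status is at hand (for example, in a script that runs outside the job), the same group, name, and value structure can be recovered by parsing that listing. The following is a minimal sketch in plain Java, not part of the Hadoop API: the class name StatusCounterParser and its parse method are hypothetical, and it assumes the format shown in the question, where a counter line ends in =<integer> and any other non-blank line starts a new group.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): parses a counter listing in the
// format shown in the question into group -> (display name -> value).
public class StatusCounterParser {

    public static Map<String, Map<String, Long>> parse(String listing) {
        Map<String, Map<String, Long>> groups = new LinkedHashMap<>();
        Map<String, Long> current = null;
        for (String raw : listing.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            // A counter line looks like "<display name>=<integer>".
            int eq = line.lastIndexOf('=');
            Long value = eq < 0 ? null : tryParseLong(line.substring(eq + 1).trim());
            if (value != null && current != null) {
                current.put(line.substring(0, eq).trim(), value);
            } else {
                // Assumption: anything else is treated as a group header.
                current = new LinkedHashMap<>();
                groups.put(line, current);
            }
        }
        // Drop "headers" that never received a counter (e.g. "..." or status lines).
        groups.values().removeIf(Map::isEmpty);
        return groups;
    }

    // Returns the parsed value, or null if the text is not a valid long.
    private static Long tryParseLong(String text) {
        try {
            return Long.parseLong(text);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "File System Counters",
            "    FILE: Number of bytes written=22010065",
            "    HDFS: Number of bytes written=44423504133",
            "Job Counters ",
            "    Launched reduce tasks=200",
            "    Total time spent by all reduce tasks (ms)=972893");
        Map<String, Map<String, Long>> counters = parse(sample);
        // Look up one counter by group and display name; prints 972893.
        System.out.println(counters.get("Job Counters")
            .get("Total time spent by all reduce tasks (ms)"));
    }
}
```

Keys are display names, so lookups depend on the exact wording Hadoop prints. Inside Java code that already holds a Counters object, iterating it as in the answer avoids that dependency.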

Regarding "java - How to access JobCounters and FileSystemCounters in Hadoop 2.6?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/29449006/
