hadoop - Writing output to an HBase table and a file using MultipleOutputs

Is there a way to use MultipleOutputs to write MapReduce output to both an HBase table and a file?

I get output in the file, but the HBase table is empty. Here is the code.

    FileInputFormat.setInputPaths(job, new Path("/path/to/input.txt"));
    FileOutputFormat.setOutputPath(job, new Path("/path/to/output"));
    job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "test");
    MultipleOutputs.addNamedOutput(job, "file", FileOutputFormat.class,
            Text.class, Text.class);
    MultipleOutputs.addNamedOutput(job, "table", TableOutputFormat.class,
            ImmutableBytesWritable.class, Put.class);

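Best answer

Yes, I was able to write to both an HBase table and a file with MultipleOutputs. MultipleOutputs is meant for writing outputs in addition to the job's default output. So set TableOutputFormat as the job's default output format, and use MultipleOutputs to define an additional named output that writes to a file. The snippets below show this. Here is the driver code.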

    Configuration conf = HBaseConfiguration.create();
    // Job.getInstance(conf) replaces the constructor new Job(conf),
    // which is deprecated in Hadoop 2.x and later.
    Job job = Job.getInstance(conf);
    job.setMapperClass(HBaseMapper.class);
    job.setReducerClass(HBaseReducer.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    // Output directory for the file-based named output defined below.
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // Default output: records passed to context.write() go to the HBase table "table".
    job.setOutputFormatClass(TableOutputFormat.class);
    job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "table");
    job.setJarByClass(TableDriver.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setOutputKeyClass(WritableComparable.class);
    job.setOutputValueClass(Writable.class);

    // Additional named output "text", written with TextOutputFormat.
    MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class,
            WritableComparable.class, Writable.class);

    job.waitForCompletion(true);

In the Reducer:

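private MultipleOutputs mos;

protected void setup(Context context) throws java.io.IOException,
        InterruptedException {
    // Bind MultipleOutputs to this task so named outputs can be written.
    mos = new MultipleOutputs(context);
}

protected void reduce(WritableComparable key, Iterable<Writable> values,
        Context context) throws java.io.IOException, InterruptedException {
    // Sum the counts emitted by the mapper for this key.
    int sum = 0;
    IntWritable result = new IntWritable();
    for (Writable val : values) {
        sum += ((IntWritable) val).get();
    }
    result.set(sum);

    // Bytes.toBytes(String) always encodes as UTF-8, unlike String.getBytes(),
    // which depends on the platform default charset.
    byte[] row = Bytes.toBytes(key.toString());
    // Row = key, column family "C", qualifier "result", value = sum as a string.
    KeyValue kv = new KeyValue(row, Bytes.toBytes("C"),
            Bytes.toBytes("result"), Bytes.toBytes(String.valueOf(sum)));
    Put put = new Put(row);
    put.add(kv);

    // Named output "text": written as a file with base name "text_file".
    mos.write("text", key, result, "text_file");

    // Default output: TableOutputFormat writes the Put to the HBase table.
    context.write(new ImmutableBytesWritable(row), put);
}

protected void cleanup(Context context) throws java.io.IOException,
        InterruptedException {
    // Close MultipleOutputs so the named output files are flushed.
    mos.close();
}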
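
To make it easier to find the file half of the output: the reducer above passes "text_file" as the base output path, and file output formats append the task type and a zero-padded partition number to that base name, so the text records should land in files such as text_file-r-00000 under the directory given to FileOutputFormat.setOutputPath. The sketch below only reproduces that naming pattern with plain Java; it does not call Hadoop, and the class NamedOutputFileName, its method reduceOutputFile, and the directory /path/to/output are hypothetical names used for illustration.

```java
import java.util.Locale;

// Stdlib-only illustration of the reduce-side file naming pattern.
// No Hadoop classes are used; all names here are hypothetical.
class NamedOutputFileName {

    // Builds "<base>-r-<5-digit partition>", e.g. "text_file-r-00000".
    static String reduceOutputFile(String baseOutputPath, int partition) {
        return String.format(Locale.ROOT, "%s-r-%05d", baseOutputPath, partition);
    }

    public static void main(String[] args) {
        // Assumed output directory (args[1] in the driver above).
        String outputDir = "/path/to/output";
        // One file per reduce task; three partitions shown as an example.
        for (int partition = 0; partition < 3; partition++) {
            System.out.println(outputDir + "/" + reduceOutputFile("text_file", partition));
        }
    }
}
```

With a single reducer, only the partition 0 file would be expected; the records written through context.write do not appear in these files because they go to the HBase table.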

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/20186606/
