java - Reducer in a Hadoop MapReduce Java implementation

Tags: java, hadoop

I am writing a Java program on top of the Hadoop MapReduce framework, in which I have a class named CombinePatternReduce.class. To debug the reducer in Eclipse, I wrote a main() function as follows:

@SuppressWarnings("unchecked")
public static void main(String[] args) throws IOException, InterruptedException{
    Text key = new Text("key2:::key1:::_ performs better than _");
    IntWritable count5 = new IntWritable(5);
    IntWritable count3 = new IntWritable(3);
    IntWritable count8 = new IntWritable(8);
    List<IntWritable> values = new ArrayList<IntWritable>();
    values.add(count5);
    values.add(count3);
    values.add(count8);
    CombinePatternReduce reducer = new CombinePatternReduce();
    Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable,
            KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(
            reducer, key, count3); // here is the problem
    reducer.reduce(key, values, dcontext);
}

DebugTools.DebugReducerContext is a class I wrote to make the debugging process easier, shown below:

public static class DebugReducerContext<KIN, VIN, KOUT, VOUT> extends Reducer<KIN, VIN, KOUT, VOUT>.Context {
    DebugTools dtools = new DebugTools();
    DataOutput out = dtools.new DebugDataOutputStream(System.out);

    public DebugReducerContext(Reducer<KIN, VIN, KOUT, VOUT> reducer, Class<KIN> keyClass, Class<VIN> valueClass) throws IOException, InterruptedException{
        reducer.super(new Configuration(), new TaskAttemptID(), new DebugRawKeyValueIterator(), null, null, 
                null, null, null, null, keyClass, valueClass);
    }

    @Override
    public void write(Object key, Object value) throws IOException, InterruptedException {
        writeKeyValue(key, value, out);
    }

    @Override
    public void setStatus(String status) {
        System.err.println(status);
    }
}
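
Note the reducer.super(...) call above: Reducer.Context is a non-static inner class of Reducer, so a subclass constructor must supply an enclosing Reducer instance through a qualified superclass constructor invocation. A minimal, self-contained illustration of that Java syntax (the class names here are made up for the example):

// Toy illustration (not from the question) of a qualified superclass
// constructor call: when a class extends an *inner* class, its
// constructor must supply an enclosing instance of the outer class
// via outerInstance.super(...).
class Outer {
    class Inner {
        Inner(String msg) { System.out.println("Inner: " + msg); }
    }
}

class Sub extends Outer.Inner {
    // The Outer instance passed in becomes the enclosing instance
    // of the inherited Inner part of this object.
    Sub(Outer outer, String msg) {
        outer.super(msg);
    }
}

public class QualifiedSuperDemo {
    public static void main(String[] args) {
        new Sub(new Outer(), "hello"); // prints "Inner: hello"
    }
}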

The problem is in the first part of the code, i.e. in main(). When I write

Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, key, count3);

I get the error

The constructor DebugTools.DebugReducerContext<Text,IntWritable,KeyPairWritableComparable,WrapperDoubleOrPatternWithWeightWritable>(CombinePatternReduce, Text, IntWritable) is undefined.

and when I write

Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, key, values);

I get the error

The constructor DebugTools.DebugReducerContext<Text,IntWritable,KeyPairWritableComparable,WrapperDoubleOrPatternWithWeightWritable>(CombinePatternReduce, Text, List<IntWritable>) is undefined.

Since the documentation for Reducer.Context is

public Reducer.Context(Configuration conf,
                       TaskAttemptID taskid,
                       RawKeyValueIterator input,
                       Counter inputKeyCounter,
                       Counter inputValueCounter,
                       RecordWriter<KEYOUT,VALUEOUT> output,
                       OutputCommitter committer,
                       StatusReporter reporter,
                       RawComparator<KEYIN> comparator,
                       Class<KEYIN> keyClass,
                       Class<VALUEIN> valueClass)
                throws IOException,
                       InterruptedException

I need to pass in a Class<KEYIN> keyClass and a Class<VALUEIN> valueClass. So how should I write the main() function (especially the line that produces the error) so that I can debug the reducer class?

Best Answer

It is clear from the error messages that the constructor takes three parameters: an instance of the reducer, the class of the key, and the class of the value.

Instead of passing the actual key and value instances, you need to give it the Class references:

Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, Text.class, IntWritable.class);

Essentially, this restates the types of keys and values that the context should expect to handle during the reduce.
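
Putting the fix into the main() from the question, a corrected sketch (assuming the same DebugTools, CombinePatternReduce, and output classes as above):

@SuppressWarnings("unchecked")
public static void main(String[] args) throws IOException, InterruptedException {
    Text key = new Text("key2:::key1:::_ performs better than _");
    List<IntWritable> values = new ArrayList<IntWritable>();
    values.add(new IntWritable(5));
    values.add(new IntWritable(3));
    values.add(new IntWritable(8));
    CombinePatternReduce reducer = new CombinePatternReduce();
    // Pass the input key/value classes, not instances of them:
    Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable,
            KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(
            reducer, Text.class, IntWritable.class);
    reducer.reduce(key, values, dcontext);
}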

Regarding "java - Reducer in a Hadoop MapReduce Java implementation", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13108796/
