java - Why does my MapReduce program output Text when I specified IntWritable?

My test data set is:

Onida|Lucid|18|Uttar Pradesh|232401|16200
Akai|Decent|16|Kerala|922401|12200
Lava|Attention|20|Assam|454601|24200
Zen|Super|14|Maharashtra|619082|9200
Samsung|Optima|14|Madhya Pradesh|132401|14200

My mapper class:

public class UnitsSoldPerCompanyMapper extends Mapper<LongWritable,Text,Text,Text>{

    public void map(LongWritable inputKey, Text inputValue,Context context) throws IOException, InterruptedException{
        String[] lineArray= inputValue.toString().split("\\|");
        Text companyName = new Text(lineArray[0]);
        Text productName = new Text(lineArray[1]);
        context.write(companyName,productName);
    }
}

Reducer class:

public class UnitsSoldPerCompanyReducer extends Reducer<Text,Iterable<Text>,Text,IntWritable>{

    public void reduce(Text companyKey,Iterable<Text> productName,Context context) throws IOException, InterruptedException{

        IntWritable counter1= new IntWritable();
        int counter =0;

        for(Text values : productName ){
            System.out.println(values);
            counter++;
        }
        counter1.set(counter);
        //IntWritable sum= new IntWritable(counter);
        context.write(companyKey, new IntWritable(1));
    }
}

Driver class:

public class UnitsSoldPerCompanyDriver {

public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

    Configuration conf = new Configuration();// To set job related
                                                // configuration

    // @SuppressWarnings("deprecation")
    @SuppressWarnings("deprecation")
    Job job = new Job(conf, "TaskofJob");
    job.setJarByClass(UnitsSoldPerCompanyDriver.class);

    // Job job = new Job(conf,"TvSalesAcrossLocations");

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    job.setMapperClass(UnitsSoldPerCompanyMapper.class);
    job.setReducerClass(UnitsSoldPerCompanyReducer.class);

    BasicConfigurator.configure();
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // job.setInputFormatClass(TextInputFormat.class);
    // job.setOutputFormatClass(TextOutputFormat.class);

    job.waitForCompletion(true);

}
}

My output is:

Akai    Decent
Lava    Attention
Lava    Attention
Lava    Attention
NA  Lucid
Onida   NA
Onida   Decent
Onida   Lucid
Onida   Lucid
Samsung Super
Samsung Super
Samsung Super
Samsung Decent
Samsung Optima
Samsung Optima
Samsung Optima

However, I am trying to find the number of units sold per company.

Best answer

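I believe the output is being produced by the identity (default) reducer, which simply writes the mapper's keys and values separated by a tab, rather than by yours. I'm not sure why that happens, but I suspect BasicConfigurator.configure();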

In the ResourceManager UI you can verify mapred.reducer.class: go to the job, and in the left-hand menu you can see the job properties that were actually used.
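
One other thing worth checking is the reducer's generic parameters. Hadoop's Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT> declares reduce(KEYIN key, Iterable<VALUEIN> values, Context context), so with VALUEIN written as Iterable<Text> the reduce(Text, Iterable<Text>, Context) method in the question would be an overload rather than an override, and the inherited identity reduce would run instead. The sketch below reproduces that effect without Hadoop. MiniReducer, WrongReducer, FixedReducer and runJob are hypothetical stand-ins invented for illustration and are not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OverloadPitfallDemo {

    // Hypothetical, simplified stand-in for Hadoop's Reducer (not the real API).
    // Like the real class, its default reduce() passes every value through unchanged.
    static class MiniReducer<KIN, VIN> {
        final List<String> out = new ArrayList<>();

        protected void reduce(KIN key, Iterable<VIN> values) {
            for (VIN value : values) {
                out.add(key + "\t" + value);
            }
        }

        // The framework always calls reduce() through this base-class signature.
        final void run(KIN key, Iterable<VIN> values) {
            reduce(key, values);
        }
    }

    // Mirrors the question: VIN is Iterable<String>, so the inherited method expects
    // Iterable<Iterable<String>>. The method below is an overload that run() never calls.
    static class WrongReducer extends MiniReducer<String, Iterable<String>> {
        public void reduce(String key, Iterable<String> values) {
            int count = 0;
            for (String ignored : values) {
                count++;
            }
            out.add(key + "\t" + count);
        }
    }

    // VIN is the element type, so this method really overrides the base one.
    static class FixedReducer extends MiniReducer<String, String> {
        @Override
        protected void reduce(String key, Iterable<String> values) {
            int count = 0;
            for (String ignored : values) {
                count++;
            }
            out.add(key + "\t" + count);
        }
    }

    // Simulates the framework, which hands values to the reducer without knowing
    // the subclass's generic parameters (hence the raw type).
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String> runJob(MiniReducer reducer, String key, List<String> values) {
        reducer.run(key, values);
        return reducer.out;
    }

    public static void main(String[] args) {
        List<String> products = Arrays.asList("Attention", "Attention", "Attention");
        // Identity behaviour: one "key<TAB>value" entry per input value.
        System.out.println(runJob(new WrongReducer(), "Lava", products));
        // Overridden behaviour: a single "key<TAB>count" entry.
        System.out.println(runJob(new FixedReducer(), "Lava", products));
    }
}
```

If this is what is happening in the question, declaring the class as extends Reducer<Text,Text,Text,IntWritable> and annotating reduce with @Override should make the custom reducer run. With @Override in place, the original declaration would have failed to compile instead of silently falling back to the identity reducer. The reducer would also need to write counter1 rather than new IntWritable(1) to emit the actual count.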

Regarding "java - Why does my MapReduce program output Text when I specified IntWritable?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46074413/
