hadoop - Sorting word counts with Hadoop MapReduce

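I'm new to MapReduce, and I've completed a Hadoop word count example.

That example produces a file of word counts (as key-value pairs) that is not sorted by count. Is it possible to sort it by the number of occurrences of each word by chaining another MapReduce job after the first one?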

Best answer

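In the simple word count MapReduce program, the output we get is sorted by word. Sample output might be:
apple 1
boy 30
cat 2
frog 20
zebra 1
If you want the output sorted by the number of occurrences of each word, i.e. in the following format
1 apple
1 zebra
2 cat
20 frog
30 boy
you can create another MR program using the mapper and reducer below, whose input is the output of the simple word count program. The mapper swaps each pair so that the count becomes the key, and the framework's sort on keys during the shuffle then orders the records by count.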

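import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Reads "word count" lines and emits (count, word), so the shuffle sorts by count.
class Map1 extends MapReduceBase implements Mapper<Object, Text, IntWritable, Text>
{
    public void map(Object key, Text value, OutputCollector<IntWritable, Text> collector, Reporter reporter) throws IOException
    {
        StringTokenizer tokenizer = new StringTokenizer(value.toString());

        // Skip lines that do not contain both a word and a count,
        // rather than emitting placeholder values for them.
        if (tokenizer.countTokens() < 2)
        {
            return;
        }

        String word = tokenizer.nextToken();
        int number = Integer.parseInt(tokenizer.nextToken());

        collector.collect(new IntWritable(number), new Text(word));
    }
}

// Identity reducer: writes every (count, word) pair in the order the sorted keys arrive.
class Reduce1 extends MapReduceBase implements Reducer<IntWritable, Text, IntWritable, Text>
{
    public void reduce(IntWritable key, Iterator<Text> values, OutputCollector<IntWritable, Text> collector, Reporter reporter) throws IOException
    {
        while (values.hasNext())
        {
            collector.collect(key, values.next());
        }
    }
}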
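
To see what the second job does without a cluster, here is a minimal plain-Java sketch of the same idea. It is only an illustration, not Hadoop code: the class name CountSortSketch and the method sortByCount are hypothetical, and a TreeMap stands in for the key sort that the shuffle performs between the mapper and the reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CountSortSketch {
    // Mimics the second job: turn each "word count" line into (count, word),
    // group the words by count in ascending key order, then emit "count word" lines.
    static List<String> sortByCount(List<String> wordCountLines) {
        // TreeMap keeps its integer keys in ascending order,
        // standing in for the sort on IntWritable keys during the shuffle.
        Map<Integer, List<String>> byCount = new TreeMap<>();
        for (String line : wordCountLines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 2) {
                continue; // skip lines without both a word and a count
            }
            int count = Integer.parseInt(parts[1]);
            byCount.computeIfAbsent(count, k -> new ArrayList<>()).add(parts[0]);
        }

        // "Reduce" step: write every word under its count, keys in sorted order.
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> entry : byCount.entrySet()) {
            for (String word : entry.getValue()) {
                out.add(entry.getKey() + " " + word);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of("apple 1", "boy 30", "cat 2", "frog 20", "zebra 1");
        sortByCount(input).forEach(System.out::println);
    }
}
```

In this sketch, words that share a count keep their input order. In a real job the order of values within one key is not guaranteed, and with more than one reducer each output file is sorted on its own rather than globally, so running the second job with a single reducer (for example via JobConf.setNumReduceTasks(1)) is the simplest way to get one fully sorted file.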

A similar question about sorting word counts with Hadoop MapReduce can be found on Stack Overflow: https://stackoverflow.com/questions/2550784/
