Meaning of the Hadoop Mapper parameters

I'm new to Hadoop and have a question about the Mapper's type parameters. Take the word count example; see the code snippet below:

public static class TokenizerMapper
   extends Mapper<LongWritable, Text, Text, IntWritable> {


   public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException 





The input key is of type LongWritable because the wordcount program reads its input with TextInputFormat.

According to the Javadoc for TextInputFormat:

An InputFormat for plain text files. Files are broken into lines. Either linefeed or carriage-return are used to signal end of line. Keys are the position in the file, and values are the line of text..

For example, suppose the input file contains these three lines:


We are fine.
How are you?
All are fine.


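Key: 0 Value: We are fine. (offsets are counted from 0, so the first line starts at byte offset 0)

Key: 13 Value: How are you? (the first line is 13 bytes long including the line feed, so the second line starts at byte offset 13)

Key: 26 Value: All are fine. (the second line is also 13 bytes long including the line feed, so the third line starts at byte offset 26 from the beginning of the file)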
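
To check the arithmetic, here is a minimal plain-Java sketch that computes the byte offset at which each line starts, which is what TextInputFormat hands to the mapper as the key. It does not use Hadoop at all; the class name `LineOffsets` and the method `lineStartOffsets` are hypothetical names chosen for illustration. It assumes the file is UTF-8 and that every line ends with a single `\n`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: mimics how line keys are derived.
class LineOffsets {

    /** Returns the zero-based byte offset at which each line of the text starts. */
    static List<Long> lineStartOffsets(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        List<Long> offsets = new ArrayList<>();
        long lineStart = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '\n') {
                offsets.add(lineStart); // record where the line that just ended began
                lineStart = i + 1;      // the next line starts right after the line feed
            }
        }
        if (lineStart < bytes.length) {
            offsets.add(lineStart);     // last line without a trailing line feed
        }
        return offsets;
    }

    public static void main(String[] args) {
        String text = "We are fine.\nHow are you?\nAll are fine.\n";
        String[] lines = text.split("\n");
        List<Long> offsets = lineStartOffsets(text);
        for (int i = 0; i < lines.length; i++) {
            System.out.println("Key: " + offsets.get(i) + "  Value: " + lines[i]);
        }
    }
}
```

For the three-line sample this prints keys 0, 13 and 26. With Windows-style `\r\n` line endings each line would be one byte longer, so the keys would shift accordingly.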

