I am new to Hadoop. I got this code from the internet:
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
public class Gender {
    private static String genderCheck = "female";
    public static class Map extends MapReduceBase implements Mapper {
        private final static IntWritable one = new IntWritable(1);
        private Text locText = new Text();
        public void map(LongWritable key, Text value, OutputCollector output, Reporter reporter) throws IOException {
            String line = value.toString();
            String location = line.split(",")[14] + "," + line.split(",")[15];
            long male = 0L;
            long female = 0L;
            if (line.split(",")[17].matches("\d+") && line.split(",")[18].matches("\d+")) {
                male = Long.parseLong(line.split(",")[17]);
                female = Long.parseLong(line.split(",")[18]);
            }
            long diff = male - female;
            locText.set(location);
            if (Gender.genderCheck.toLowerCase().equals("female") && diff < 0) {
                output.collect(locText, new LongWritable(diff * -1L));
            }
            else if (Gender.genderCheck.toLowerCase().equals("male") && diff > 0) {
                output.collect(locText, new LongWritable(diff));
            }
        }
    }
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(Gender.class);
        conf.setJobName("gender");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);
        conf.setMapperClass(Map.class);
        if (args.length != 3) {
            System.out.println("Usage:");
            System.out.println("[male/female] /path/to/2kh/files /path/to/output");
            System.exit(1);
        }
        if (!args[0].equalsIgnoreCase("male") && !args[0].equalsIgnoreCase("female")) {
            System.out.println("first argument must be male or female");
            System.exit(1);
        }
        Gender.genderCheck = args[0];
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[1]));
        FileOutputFormat.setOutputPath(conf, new Path(args[2]));
        JobClient.runJob(conf);
    }
}
When I compile this code with "javac -cp /usr/local/hadoop/hadoop-core-1.0.3.jar Gender.java",
I get the following error:
"Gender.Map is not abstract and does not override abstract method map(java.lang.Object,java.lang.Object,org.apache.hadoop.mapred.OutputCollector,org.apache.hadoop.mapred.Reporter) in org.apache.hadoop.mapred.Mapper public static class Map extends MapReduceBase implements Mapper "
How do I compile it correctly?
Best answer
Change the declaration of the Map class as follows:
public static class Map extends MapReduceBase implements Mapper<LongWritable,Text,Text, LongWritable>
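If you do not specify any type arguments, the raw Mapper interface expects Object parameters, so you would need a map function like this: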
@Override
public void map(Object arg0, Object arg1, OutputCollector arg2, Reporter arg3) throws IOException {
// TODO Auto-generated method stub
}
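The type arguments here denote the mapper's expected input key/value types and its output key/value types.
In your case, the input key/value pair is LongWritable-Text.
Then, judging from your output.collect method calls, your mapper's output key/value pair is Text-LongWritable.
Therefore, your Map class should implement Mapper<LongWritable,Text,Text, LongWritable>.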
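To illustrate why the type arguments matter, here is a minimal sketch that does not depend on Hadoop. The names SimpleMapper, Collector, LengthMapper and run are hypothetical stand-ins invented for this example; they only mimic the shape of Hadoop's Mapper and OutputCollector and are not part of the Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

public class GenericMapperSketch {
    // Hypothetical stand-in for org.apache.hadoop.mapred.OutputCollector.
    static class Collector<K, V> {
        final List<String> records = new ArrayList<>();

        void collect(K key, V value) {
            records.add(key + "\t" + value);
        }
    }

    // Hypothetical stand-in for org.apache.hadoop.mapred.Mapper.
    interface SimpleMapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, Collector<K2, V2> output);
    }

    // With type arguments, map(Long, String, ...) implements the interface method.
    // If this were declared with the raw type "implements SimpleMapper", the
    // compiler would expect map(Object, Object, Collector) instead and reject
    // the class as not overriding the abstract method.
    static class LengthMapper implements SimpleMapper<Long, String, String, Long> {
        @Override
        public void map(Long key, String value, Collector<String, Long> output) {
            output.collect(value, (long) value.length());
        }
    }

    // Runs the mapper on a single line and returns the collected records.
    public static List<String> run(String line) {
        Collector<String, Long> out = new Collector<>();
        new LengthMapper().map(0L, line, out);
        return out.records;
    }

    public static void main(String[] args) {
        System.out.println(run("hadoop"));
    }
}
```

Removing the type arguments from the implements clause in this sketch should reproduce the same kind of "is not abstract and does not override abstract method" error as in the question.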
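There is one more error in your code:
"\d+" will not compile, because \d is not a valid escape sequence in a Java string literal. A backslash must be followed by a recognized escape character, so I think you should use the following: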
line.split(",")[17].matches("\\d+")
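A quick way to check the escaping outside of Hadoop is a small standalone class. This is only a sketch: DigitCheck, isDigits and parseCount are hypothetical names, and splitting the line once instead of on every access is a stylistic suggestion, not something the original answer requires. Note that parseCount falls back to 0 per field, which is slightly different from the question's code, where both fields must be numeric.

```java
public class DigitCheck {
    // In Java source, "\\d+" denotes the regex \d+ (one or more digits).
    // String.matches succeeds only if the whole string matches.
    public static boolean isDigits(String s) {
        return s.matches("\\d+");
    }

    // Hypothetical helper: returns the parsed value of the field at index,
    // or 0 when the field is missing or is not made up entirely of digits.
    public static long parseCount(String[] fields, int index) {
        if (index < fields.length && isDigits(fields[index])) {
            return Long.parseLong(fields[index]);
        }
        return 0L;
    }

    public static void main(String[] args) {
        // Split once and reuse the array instead of calling split repeatedly.
        String[] fields = "a,b,120,95".split(",");
        long diff = parseCount(fields, 2) - parseCount(fields, 3);
        System.out.println(diff); // prints 25
    }
}
```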
For more on compiling Hadoop code, see the similar question on Stack Overflow: https://stackoverflow.com/questions/15872133/