java - Hadoop cannot find the mapper class

Tags: java hadoop mapreduce mapper

I'm new to Hadoop and want to run a MapReduce job. However, I get an error saying that Hadoop cannot find the mapper class. Here is the error:

INFO mapred.JobClient: Task Id : attempt_201608292140_0023_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: TransMapper1
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:718)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)

I checked the permissions on my jar file and they are fine. Here are the jar file's permissions:

-rwxrwxrwx.

This is the code that launches the MapReduce job:

import java.io.File;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class mp {

    public static void main(String[] args) throws Exception {

        Job job1 = new Job();
        job1.setJarByClass(mp.class);
        FileInputFormat.addInputPath(job1, new Path(args[0]));
        String oFolder = args[0] + "/output";
        FileOutputFormat.setOutputPath(job1, new Path(oFolder));
        job1.setMapperClass(TransMapper1.class);
        job1.setReducerClass(TransReducer1.class);
        job1.setMapOutputKeyClass(LongWritable.class);
        job1.setMapOutputValueClass(DnaWritable.class);
        job1.setOutputKeyClass(LongWritable.class);
        job1.setOutputValueClass(Text.class);
    }
}

This is the mapper class (TransMapper1):

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TransMapper1 extends Mapper<LongWritable, Text, LongWritable, DnaWritable> {

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        LongWritable bamWindow = new LongWritable(Long.parseLong(tokenizer.nextToken()));
        LongWritable read = new LongWritable(Long.parseLong(tokenizer.nextToken()));
        LongWritable refWindow = new LongWritable(Long.parseLong(tokenizer.nextToken()));
        IntWritable chr = new IntWritable(Integer.parseInt(tokenizer.nextToken()));
        DoubleWritable dist = new DoubleWritable(Double.parseDouble(tokenizer.nextToken()));
        DnaWritable dnaW = new DnaWritable(bamWindow,read,refWindow,chr,dist);
        context.write(bamWindow,dnaW);
    }
}

I compile and package the code with the following commands:

javac -classpath $MR_HADOOPJAR ${rootPath}mp/src/*.java
jar cvfm $mpJar $MR_MANIFEST ${rootPath}mp/src/*.class

This is the output of jar -tf mp/src/mp.jar:

META-INF/
META-INF/MANIFEST.MF
mnt/miczfs/tide/mp/src/DnaWritable.class
mnt/miczfs/tide/mp/src/mp.class
mnt/miczfs/tide/mp/src/TransMapper1.class
mnt/miczfs/tide/mp/src/TransMapper2.class
mnt/miczfs/tide/mp/src/TransReducer1.class
mnt/miczfs/tide/mp/src/TransReducer2.class

I run the job like this:

mpJar=${rootPath}mp/src/mp.jar
mp_exec=mp
export HADOOP_CLASSPATH=$mpJar
hadoop $mp_exec <input path>

I also tried this command:

hadoop jar $mp_exec <input path>

I then changed the way I create the jar file to the following command:

jar cf $mpJar $MR_MANIFEST ${rootPath}mp/src/*.class

With this change, the error became:

Exception in thread "main" java.lang.ClassNotFoundException: mp
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:153)

So previously the problem was that the program could not find the mapper class, and now it cannot find the main class! Any ideas?

Thanks, everyone.

Best answer

HADOOP_CLASSPATH must specify the folder where the JAR file is located; because it does not, the class definition cannot be found.
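
A further point worth checking, offered as a hedged observation rather than a confirmed diagnosis: in the jar -tf listing from the question, every class is stored under mnt/miczfs/tide/mp/src/, whereas a class with no package declaration, such as TransMapper1, is looked up at the JAR root as TransMapper1.class. The sketch below only models that name-to-entry mapping. JarEntryCheck, entryNameFor, and isLoadable are hypothetical names introduced for illustration, and the "at root" listing is an assumed layout, not output from the asker's machine.

```java
import java.util.Arrays;
import java.util.List;

public class JarEntryCheck {

    // Entry name a class loader requests inside a JAR: dots in the fully
    // qualified class name become slashes, and ".class" is appended.
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True if the listing contains exactly the entry the loader would request.
    static boolean isLoadable(String className, List<String> jarEntries) {
        return jarEntries.contains(entryNameFor(className));
    }

    public static void main(String[] args) {
        // Entries copied from the jar -tf output in the question.
        List<String> asBuilt = Arrays.asList(
            "META-INF/MANIFEST.MF",
            "mnt/miczfs/tide/mp/src/mp.class",
            "mnt/miczfs/tide/mp/src/TransMapper1.class");

        // Assumed (hypothetical) layout with the classes at the JAR root.
        List<String> atRoot = Arrays.asList(
            "META-INF/MANIFEST.MF",
            "mp.class",
            "TransMapper1.class");

        System.out.println(isLoadable("TransMapper1", asBuilt)); // false
        System.out.println(isLoadable("TransMapper1", atRoot));  // true
    }
}
```

If that is the cause, one way to get root-level entries is to run jar from inside the directory that holds the class files, for example (cd ${rootPath}mp/src && jar cvfm "$mpJar" "$MR_MANIFEST" *.class), assuming $mpJar and $MR_MANIFEST are absolute paths. Note also that the later jar cf command drops the m flag, so $MR_MANIFEST is added as an ordinary file instead of being used as the manifest.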

Regarding java - Hadoop cannot find the mapper class, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39359412/
