java - Hadoop ClassNotFoundException related to MapClass

Tags: java exception hadoop

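I see many questions related to ClassNotFoundException, "No job jar file set", and Hadoop. Most of them point to a missing setJarByClass call in the configuration (using either JobConf or Job). I'm a bit puzzled by the exception I'm hitting, because I do have that set. Here is everything I think is relevant (please let me know if I have omitted anything):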

 echo $CLASS_PATH
/root/javajars/mysql-connector-java-5.1.22/mysql-connector-java-5.1.22-bin.jar:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u5.jar:.

Code (mostly omitted):

import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.IntWritable;

import java.io.IOException;
import java.util.Iterator;
import java.lang.System;
import java.net.URL;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.ResultSet;

public class QueryTable extends Configured implements Tool {

    public static class MapClass extends Mapper<Object, Text, Text, IntWritable>{

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            ...
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable>{
        private IntWritable result = new IntWritable();

        public void reduce (Text key, Iterable<IntWritable> values,
                            Context context) throws IOException, InterruptedException {
            ...
        }
    }

    public int run(String[] args) throws Exception {
        //Configuration conf = getConf();
        Configuration conf = new Configuration();

        Job job = new Job(conf, "QueryTable");
        job.setJarByClass(QueryTable.class);

        Path in =  new Path(args[0]);
        Path out = new Path(args[1]);
        FileInputFormat.setInputPaths(job, in);
        //FileInputFormat.addInputPath(job, in);
        FileOutputFormat.setOutputPath(job, out);

        job.setMapperClass(MapClass.class);
        job.setCombinerClass(Reduce.class); // new
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class); // must match the IntWritable values emitted by MapClass and Reduce

        // Return the status to ToolRunner; main() is the one that calls System.exit.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new QueryTable(), args);
        System.exit(res);
    }
}

Then I compile, create the jar, and run:

javac QueryTable.java -d QueryTable
jar -cvf QueryTable.jar -C QueryTable/ .
hadoop jar QueryTable.jar QueryTable input output

The exception:

13/01/14 17:09:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
**13/01/14 17:09:30 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).**
13/01/14 17:09:30 INFO input.FileInputFormat: Total input paths to process : 1
13/01/14 17:09:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/01/14 17:09:30 WARN snappy.LoadSnappy: Snappy native library not loaded
13/01/14 17:09:31 INFO mapred.JobClient: Running job: job_201301081120_0045
13/01/14 17:09:33 INFO mapred.JobClient:  map 0% reduce 0%
13/01/14 17:09:39 INFO mapred.JobClient: Task Id : attempt_201301081120_0045_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: QueryTable$MapClass
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1004)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:217)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:602)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
    at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: java.lang.ClassNotFoundException: QueryTable$MapClass
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadCl

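Sorry for the giant wall of text. I don't understand why I'm getting the warning about no job jar file being set, since I set it in my run method. Also, the warning is issued by JobClient, while my code uses Job, not JobClient. If you have any ideas or feedback, I'm very interested. Thanks for your time!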
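One way to investigate the "No job jar file set" warning: as far as I know, setJarByClass finds the job jar by asking the class loader where the class file came from, and it only accepts an answer that points inside a jar. If the class was resolved from a loose .class file in a classpath directory instead, no jar gets set. The sketch below is a small diagnostic that prints where a class was loaded from. ClassOrigin, locate, and loadedFromJar are hypothetical names, and nothing in this thread confirms that this is the cause here.

```java
import java.net.URL;

public class ClassOrigin {

    /** Returns the URL the class file was resolved from, or null if it cannot be found. */
    static URL locate(Class<?> cls) {
        // A binary name such as "a.b.Outer$Inner" maps to the resource "a/b/Outer$Inner.class".
        String resource = cls.getName().replace('.', '/') + ".class";
        ClassLoader loader = cls.getClassLoader();
        // Bootstrap classes (for example java.lang.String) report a null class loader.
        return loader == null
                ? ClassLoader.getSystemResource(resource)
                : loader.getResource(resource);
    }

    /** True when the class file was resolved from inside a jar archive. */
    static boolean loadedFromJar(Class<?> cls) {
        URL url = locate(cls);
        return url != null && "jar".equals(url.getProtocol());
    }

    public static void main(String[] args) {
        // Assumption: in the real driver you would pass QueryTable.class here.
        Class<?> cls = ClassOrigin.class;
        System.out.println("resolved from: " + locate(cls));
        System.out.println("from jar: " + loadedFromJar(cls));
    }
}
```

If the printed URL starts with file: rather than jar:, the class was picked up from a directory on the classpath, which would be worth ruling out before looking elsewhere.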

EDIT

Contents of the jar:

jar -tvf QueryTable.jar
    0 Tue Jan 15 14:40:46 EST 2013 META-INF/
   68 Tue Jan 15 14:40:46 EST 2013 META-INF/MANIFEST.MF
 3091 Tue Jan 15 14:40:10 EST 2013 QueryTable.class
 3173 Tue Jan 15 14:40:10 EST 2013 QueryTable$MapClass.class
 1699 Tue Jan 15 14:40:10 EST 2013 QueryTable$Reduce.class
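The same check as the listing above can be done programmatically, which is handy in a build script. This is a minimal sketch using only java.util.jar from the JDK; JarEntryCheck and missingEntries are hypothetical names, and the default jar path is assumed from the commands in the question.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.jar.JarFile;

public class JarEntryCheck {

    /** Returns the expected entry names that are NOT present in the jar. */
    static List<String> missingEntries(File jar, List<String> expected) throws IOException {
        List<String> missing = new ArrayList<>();
        try (JarFile jf = new JarFile(jar)) {
            for (String name : expected) {
                // getEntry returns null when the jar has no entry with this exact path.
                if (jf.getEntry(name) == null) {
                    missing.add(name);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the jar sits in the working directory unless a path is given.
        File jar = new File(args.length > 0 ? args[0] : "QueryTable.jar");
        List<String> expected = Arrays.asList(
                "QueryTable.class",
                "QueryTable$MapClass.class",
                "QueryTable$Reduce.class");
        System.out.println("missing entries: " + missingEntries(jar, expected));
    }
}
```

An empty list means every expected class file is in the jar, which matches the listing above: the nested classes were packaged, so the jar contents themselves do not explain the exception.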

Best answer

I was able to resolve this by declaring a package at the top of my source file.

package com.foo.hadoop;

Then I compiled, created the jar, and invoked hadoop explicitly with the package prepended to the class name:

hadoop jar QueryTable.jar com.foo.hadoop.QueryTable input output

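I know this is what most people would have done from the start, but I assumed it would still work without specifying a package. It is definitely better practice, though, and it let me move on.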
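For reference, once a package is declared, both the name passed to hadoop and the path of the class file inside the jar change. The sketch below shows the mapping from a binary class name to its jar entry path. ClassEntryName and jarEntryFor are hypothetical names, and the nested MapClass here is only a stand-in for the mapper in the question.

```java
public class ClassEntryName {

    /** Maps a binary class name to the path its .class file has inside a jar. */
    static String jarEntryFor(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    // Hypothetical stand-in for a nested mapper class such as QueryTable.MapClass.
    static class MapClass {}

    public static void main(String[] args) {
        // Nested classes use '$' in their binary name; this is the form seen in the stack trace.
        System.out.println(MapClass.class.getName());
        // prints com/foo/hadoop/QueryTable$MapClass.class
        System.out.println(jarEntryFor("com.foo.hadoop.QueryTable$MapClass"));
    }
}
```

With the package in place, running jar -tvf on the rebuilt jar should list the classes under com/foo/hadoop/ rather than at the root.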

Source: the Stack Overflow question "java - Hadoop ClassNotFoundException related to MapClass": https://stackoverflow.com/questions/14339356/
