java - Hadoop NoClassDefFoundError when adding external Jars

Tags: java hadoop jar noclassdeffounderror distributed-cache

I am trying to run a MapReduce job on Hadoop that uses external Jars. I have added the Jars to HDFS under /user/hduser/lib/ with copyFromLocal.

In my main method I add the Jars to the DistributedCache. However, when I run the MapReduce program I get a NoClassDefFoundError in the Mapper class. I have tried many of the solutions posted on SO by others with a similar error, but I have not been able to resolve the issue. Any guidance is appreciated.
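A NoClassDefFoundError whose cause is a ClassNotFoundException (as in the stack trace below) means the class simply is not on the runtime classpath of the JVM running the task. A hypothetical diagnostic helper like this (not from the original post) can be run in the driver, or logged from the mapper's setup, to confirm whether a dependency such as org.joda.time.format.DateTimeFormat is visible at all:

```java
// Hypothetical diagnostic helper, not part of the original job code.
public class ClasspathCheck {

    // Returns true if the fully-qualified class name can be loaded by this JVM.
    static boolean isOnClasspath(String fqcn) {
        try {
            Class.forName(fqcn);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A JDK class is always visible; the Joda class is only visible
        // if its jar actually made it onto this JVM's classpath.
        System.out.println("java.lang.String: "
                + isOnClasspath("java.lang.String"));
        System.out.println("org.joda.time.format.DateTimeFormat: "
                + isOnClasspath("org.joda.time.format.DateTimeFormat"));
    }
}
```

If the probe prints false inside the task but the jar looks correctly localized in the logs, the problem is how the jar is wired onto the task classpath, not the jar itself.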

From the main method:

Configuration conf = new Configuration();
String jarToAdd1 = "/user/hduser/lib/joda-time-2.9.1-no-tzdb.jar";
String jarToAdd2 = "/user/hduser/lib/joda-time-2.9.1.jar";
addJarToDistributedCache(jarToAdd1, conf);
addJarToDistributedCache(jarToAdd2, conf);
.........

Adding to the DistributedCache:
private static void addJarToDistributedCache(String jarToAdd, Configuration conf) throws IOException {
    Path hdfsJar = new Path(jarToAdd);
    DistributedCache.addFileToClassPath(hdfsJar, conf);
}

Mapper where the error occurs:
public static class Map1 extends Mapper<LongWritable, Text, IntWritable, UserData> {

    Map<IntWritable, UserData> userLog = new HashMap<IntWritable, UserData>();

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        DateTimeFormatter formatter = DateTimeFormat.forPattern("yyyyddmm HH:mm:ss");    // *********ERROR HAPPENS HERE **********
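A side note unrelated to the classpath error: in both Joda-Time and java.time pattern syntax, lowercase mm means minute-of-hour and uppercase MM means month-of-year, so the pattern "yyyyddmm HH:mm:ss" above almost certainly intends "yyyyMMdd HH:mm:ss". A self-contained sketch using the JDK's java.time (the sample timestamp is illustrative, not data from the post):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class PatternDemo {

    // Parses a "yyyyMMdd HH:mm:ss" timestamp; note MM (month) vs mm (minute).
    static LocalDateTime parse(String s) {
        DateTimeFormatter f = DateTimeFormatter.ofPattern("yyyyMMdd HH:mm:ss");
        return LocalDateTime.parse(s, f);
    }

    public static void main(String[] args) {
        LocalDateTime t = parse("20160130 20:45:15");
        System.out.println(t);  // 2016-01-30T20:45:15
    }
}
```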

Stack trace:
16/01/30 20:45:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/30 20:45:16 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/01/30 20:45:16 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/01/30 20:45:16 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/01/30 20:45:16 INFO input.FileInputFormat: Total input paths to process : 1
16/01/30 20:45:16 INFO input.FileInputFormat: Total input paths to process : 1
16/01/30 20:45:16 INFO mapreduce.JobSubmitter: number of splits:3
16/01/30 20:45:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1787244132_0001
16/01/30 20:45:17 INFO mapred.LocalDistributedCacheManager: Creating symlink: /app/hadoop/tmp/mapred/local/1454204717451/joda-time-2.9.1-no-tzdb.jar <- /usr/local/hadoop/sbin/joda-time-2.9.1-no-tzdb.jar
16/01/30 20:45:17 INFO mapred.LocalDistributedCacheManager: Localized hdfs://localhost:54310/user/hduser/lib/joda-time-2.9.1-no-tzdb.jar as file:/app/hadoop/tmp/mapred/local/1454204717451/joda-time-2.9.1-no-tzdb.jar
16/01/30 20:45:17 INFO mapred.LocalDistributedCacheManager: Creating symlink: /app/hadoop/tmp/mapred/local/1454204717452/joda-time-2.9.1.jar <- /usr/local/hadoop/sbin/joda-time-2.9.1.jar
16/01/30 20:45:17 INFO mapred.LocalDistributedCacheManager: Localized hdfs://localhost:54310/user/hduser/lib/joda-time-2.9.1.jar as file:/app/hadoop/tmp/mapred/local/1454204717452/joda-time-2.9.1.jar
16/01/30 20:45:17 INFO mapred.LocalDistributedCacheManager: file:/app/hadoop/tmp/mapred/local/1454204717451/joda-time-2.9.1-no-tzdb.jar
16/01/30 20:45:17 INFO mapred.LocalDistributedCacheManager: file:/app/hadoop/tmp/mapred/local/1454204717452/joda-time-2.9.1.jar
16/01/30 20:45:17 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/01/30 20:45:17 INFO mapreduce.Job: Running job: job_local1787244132_0001
16/01/30 20:45:17 INFO mapred.LocalJobRunner: OutputCommitter set in config null
16/01/30 20:45:18 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
16/01/30 20:45:18 INFO mapred.LocalJobRunner: Waiting for map tasks
16/01/30 20:45:18 INFO mapred.LocalJobRunner: Starting task: attempt_local1787244132_0001_m_000000_0
16/01/30 20:45:18 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/01/30 20:45:18 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/input/sentimentFeedback7.csv:0+143748596
16/01/30 20:45:18 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/01/30 20:45:18 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/01/30 20:45:18 INFO mapred.MapTask: soft limit at 83886080
16/01/30 20:45:18 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/01/30 20:45:18 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/01/30 20:45:18 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/01/30 20:45:18 INFO mapred.MapTask: Starting flush of map output
16/01/30 20:45:18 INFO mapred.LocalJobRunner: Starting task: attempt_local1787244132_0001_m_000001_0
16/01/30 20:45:18 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/01/30 20:45:18 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/input/allergyConsumption7.csv:0+134217728
16/01/30 20:45:18 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/01/30 20:45:18 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/01/30 20:45:18 INFO mapred.MapTask: soft limit at 83886080
16/01/30 20:45:18 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/01/30 20:45:18 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/01/30 20:45:18 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/01/30 20:45:18 INFO mapred.MapTask: Starting flush of map output
16/01/30 20:45:18 INFO mapred.LocalJobRunner: Starting task: attempt_local1787244132_0001_m_000002_0
16/01/30 20:45:18 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/01/30 20:45:18 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/input/allergyConsumption7.csv:134217728+105486421
16/01/30 20:45:18 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/01/30 20:45:18 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/01/30 20:45:18 INFO mapred.MapTask: soft limit at 83886080
16/01/30 20:45:18 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/01/30 20:45:18 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/01/30 20:45:18 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/01/30 20:45:18 INFO mapred.MapTask: Starting flush of map output
16/01/30 20:45:18 INFO mapred.LocalJobRunner: map task executor complete.
16/01/30 20:45:18 WARN mapred.LocalJobRunner: job_local1787244132_0001
java.lang.Exception: java.lang.NoClassDefFoundError: org/joda/time/format/DateTimeFormat
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NoClassDefFoundError: org/joda/time/format/DateTimeFormat
    at org.peach.fooddiary.FoodDiaryMR$Map1.map(FoodDiaryMR.java:45)
    at org.peach.fooddiary.FoodDiaryMR$Map1.map(FoodDiaryMR.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapreduce.lib.input.DelegatingMapper.run(DelegatingMapper.java:55)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.joda.time.format.DateTimeFormat
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 12 more
16/01/30 20:45:18 INFO mapreduce.Job: Job job_local1787244132_0001 running in uber mode : false
16/01/30 20:45:18 INFO mapreduce.Job:  map 0% reduce 0%
16/01/30 20:45:18 INFO mapreduce.Job: Job job_local1787244132_0001 failed with state FAILED due to: NA
16/01/30 20:45:19 INFO mapreduce.Job: Counters: 0

Best answer

First, DistributedCache.addFileToClassPath is now deprecated; you can use instead:

    Configuration conf = getConf();
    Job job = Job.getInstance(conf, "My Job Name");
    job.addFileToClassPath(new Path("path"));

If you still hit the same problem, do it manually; that is, put the jar files on Hadoop's classpath by deploying them to the Hadoop installation itself, using the following steps:

1) Export the classes you need to run into a jar:
  • In the Project Explorer, right-click the project name and choose "Export...".
  • Under the "Java" section, choose "JAR file", then "Next".
  • Choose the directory in which to store the exported jar.

  2) Go to the server running Hadoop and run the following command to find out which directories Hadoop searches for classes; one example entry is "$HADOOP_HOME/share/hadoop/common/lib/*":
    $ hadoop classpath
    

    3) Copy the jar file exported in step 1 into one of the directories obtained in step 2, then test again.
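The list printed by hadoop classpath is a single ':'-separated string, which is hard to scan. Splitting it one entry per line makes it easy to pick a lib directory for step 3 and to confirm that directory really is on the classpath. A sketch using a hard-coded example string (the paths are hypothetical; on a real server, replace the variable with the actual output of hadoop classpath):

```shell
# Hypothetical example of what `hadoop classpath` might print (illustrative paths).
CP='/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*'

# One entry per line is easier to scan for a lib directory to copy the jar into.
echo "$CP" | tr ':' '\n'

# After copying the jar (step 3), grep the split output to confirm the target
# directory is on the classpath.
echo "$CP" | tr ':' '\n' | grep 'common/lib'
```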

    Tip: it is best to test the map reduce job directly on the server, running it there with the following command:
    $ hadoop jar JarFileName.jar PackageName.ClassName /path/to/input /path/to/output
    

    For a complete example with a detailed explanation, check this

    Regarding java - Hadoop NoClassDefFoundError when adding external Jars, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35109677/
