java - 从 eclipse 运行 hadoop(Cloudera-2.0.0-cdh4.4.0) 作业时出错？

您好，我正在从 eclipse 运行 hadoop wordcount 示例，但出现以下错误:-

13/11/24 22:17:08 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder sending #12
13/11/24 22:17:08 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder got value #12
13/11/24 22:17:08 DEBUG ipc.ProtobufRpcEngine: Call: delete took 11ms
13/11/24 22:17:08 WARN mapred.LocalJobRunner: job_local1690217234_0001
java.lang.AbstractMethodError
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:96)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:76)
    at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:91)
    at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:483)
    at org.hadoop.par.WordCount$Reduce.reduce(WordCount.java:34)
    at org.hadoop.par.WordCount$Reduce.reduce(WordCount.java:1)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
13/11/24 22:17:09 INFO mapred.JobClient:  map 100% reduce 0%
13/11/24 22:17:09 INFO mapred.JobClient: Job complete: job_local1690217234_0001
13/11/24 22:17:09 INFO mapred.JobClient: Counters: 26
13/11/24 22:17:09 INFO mapred.JobClient:   File System Counters
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of bytes read=172
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of bytes written=91974
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of write operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of bytes read=91
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of bytes written=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of read operations=5
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of write operations=1
13/11/24 22:17:09 INFO mapred.JobClient:   Map-Reduce Framework
13/11/24 22:17:09 INFO mapred.JobClient:     Map input records=15
13/11/24 22:17:09 INFO mapred.JobClient:     Map output records=17
13/11/24 22:17:09 INFO mapred.JobClient:     Map output bytes=152
13/11/24 22:17:09 INFO mapred.JobClient:     Input split bytes=112
13/11/24 22:17:09 INFO mapred.JobClient:     Combine input records=17
13/11/24 22:17:09 INFO mapred.JobClient:     Combine output records=13
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce input groups=1
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce input records=1
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce output records=0
13/11/24 22:17:09 INFO mapred.JobClient:     Spilled Records=13
13/11/24 22:17:09 INFO mapred.JobClient:     CPU time spent (ms)=0
13/11/24 22:17:09 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
13/11/24 22:17:09 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
13/11/24 22:17:09 INFO mapred.JobClient:     Total committed heap usage (bytes)=138477568
13/11/24 22:17:09 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/11/24 22:17:09 INFO mapred.JobClient:     BYTES_READ=91
13/11/24 22:17:09 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1372)
    at org.hadoop.par.WordCount.main(WordCount.java:62)
13/11/24 22:17:09 DEBUG hdfs.DFSClient: Waiting for ack for: -1
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder sending #13
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder got value #13
13/11/24 22:17:09 ERROR hdfs.DFSClient: Failed to close file /user/harinder/test_output/_temporary/_attempt_local1690217234_0001_r_000000_0/part-00000
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/harinder/test_output/_temporary/_attempt_local1690217234_0001_r_000000_0/part-00000: File does not exist. Holder DFSClient_NONMAPREDUCE_-559950586_1 does not have any open files.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2445)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2437)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2503)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2480)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:556)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:337)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44958)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

    at org.apache.hadoop.ipc.Client.call(Client.java:1237)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at com.sun.proxy.$Proxy9.complete(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy9.complete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:329)
    at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1769)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1756)
    at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:696)
    at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:713)
    at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:559)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2399)
    at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2415)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
13/11/24 22:17:09 DEBUG ipc.Client: Stopping client
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder: closed
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder: stopped, remaining connections 0

我已经在我的项目中添加了所有必需的 JAR。以下是我正在运行的代码:-

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

  public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        output.collect(word, one);
      }
    }
  }

  public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCount.class);

    conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);


    FileInputFormat.setInputPaths(conf, new Path("/user/harinder/test_data/"));
    FileOutputFormat.setOutputPath(conf, new Path("/user/harinder/test_output"));
    //FileInputFormat.setInputPaths(conf, new Path(args[0]));
    //FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}

以下是添加到项目中的 jar 文件:-

enter image description here

最佳答案

为您的 MapReduce 程序创建一个 JAR 并将其放入您的项目中。然后运行它。这个 JAR 在集群上提交

关于java - 从 eclipse 运行 hadoop(Cloudera-2.0.0-cdh4.4.0) 作业时出错？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/20177219/

java - 从 eclipse 运行 hadoop(Cloudera-2.0.0-cdh4.4.0) 作业时出错？

上一篇：hadoop - 更改 hadoop 中的 block 大小后会发生什么

下一篇：hadoop - Hadoop Reducer 中的结果是什么？