java - Running a Hadoop program on a single-node cluster setup of Hadoop 2.6 on Ubuntu 14.04

Tags: java hadoop mapreduce ubuntu-14.04

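I am new to the big data world and to Hadoop. I am trying to run some code I found through Google. It involves four steps, such as putting the data into the Hadoop file system, adding indexes to the data, and then creating a fuzzy join that uses map and reduce to reduce the data.

I was able to perform the first two steps.
The code uses an XML file to specify the locations.

The code I am using is from http://asterixdb.ics.uci.edu/fuzzyjoin/

When I run the last step, the fuzzy join, it gives me a series of errors.

I am attaching the trace below: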

hduser@ubuntu:/home/midhu/fuzzyjoin$ cd fuzzyjoin-hadoop
hduser@ubuntu:/home/midhu/fuzzyjoin/fuzzyjoin-hadoop$ hadoop jar target/fuzzyjoin-hadoop-0.0.2-SNAPSHOT.jar fuzzyjoin -conf src/main/resources/fuzzyjoin/dblp.quickstart.xml
16/04/03 13:55:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Complete-Job started: Sun Apr 03 13:55:42 IST 2016
Multi-Job started: Sun Apr 03 13:55:42 IST 2016
FuzzyJoinDriver(TokensBasic.phase1)
  Input Path:  {hdfs://localhost:54310/user/hduser/dblp-small/records-000}
  Output Path: hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000
  Map Jobs:    2
  Reduce Jobs: 1
  Properties:  {fuzzyjoin.similarity.name=Jaccard
                fuzzyjoin.similarity.threshold=.5
                fuzzyjoin.tokenizer=Word
                fuzzyjoin.tokens.package=Scalar
                fuzzyjoin.tokens.lengthstats=false
                fuzzyjoin.ridpairs.group.class=TokenIdentity
                fuzzyjoin.ridpairs.group.factor=1
                fuzzyjoin.data.tokens=
                fuzzyjoin.data.joinindex=}
Job started: Sun Apr 03 13:55:42 IST 2016
16/04/03 13:55:42 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/04/03 13:55:42 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/04/03 13:55:42 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/04/03 13:55:43 INFO mapred.FileInputFormat: Total input paths to process : 1
16/04/03 13:55:43 INFO mapreduce.JobSubmitter: number of splits:1
16/04/03 13:55:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1780986358_0001
16/04/03 13:55:44 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/04/03 13:55:44 INFO mapreduce.Job: Running job: job_local1780986358_0001
16/04/03 13:55:44 INFO mapred.LocalJobRunner: OutputCommitter set in config null
16/04/03 13:55:44 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
16/04/03 13:55:45 INFO mapred.LocalJobRunner: Waiting for map tasks
16/04/03 13:55:45 INFO mapred.LocalJobRunner: Starting task: attempt_local1780986358_0001_m_000000_0
16/04/03 13:55:46 INFO mapreduce.Job: Job job_local1780986358_0001 running in uber mode : false
16/04/03 13:55:46 INFO mapreduce.Job:  map 0% reduce 0%
16/04/03 13:55:46 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/04/03 13:55:46 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687
16/04/03 13:55:46 INFO mapred.MapTask: numReduceTasks: 1
16/04/03 13:55:49 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/04/03 13:55:49 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/04/03 13:55:49 INFO mapred.MapTask: soft limit at 83886080
16/04/03 13:55:49 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/04/03 13:55:49 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/04/03 13:55:49 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/04/03 13:55:52 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687 > map
16/04/03 13:55:54 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687 > map
16/04/03 13:55:54 INFO mapred.MapTask: Starting flush of map output
16/04/03 13:55:54 INFO mapred.MapTask: Spilling map output
16/04/03 13:55:54 INFO mapred.MapTask: bufstart = 0; bufend = 15588; bufvoid = 104857600
16/04/03 13:55:54 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26209408(104837632); length = 4989/6553600
16/04/03 13:55:54 INFO mapred.MapTask: Finished spill 0
16/04/03 13:55:54 INFO mapred.Task: Task:attempt_local1780986358_0001_m_000000_0 is done. And is in the process of committing
16/04/03 13:55:54 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687
16/04/03 13:55:54 INFO mapred.Task: Task 'attempt_local1780986358_0001_m_000000_0' done.
16/04/03 13:55:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1780986358_0001_m_000000_0
16/04/03 13:55:54 INFO mapred.LocalJobRunner: map task executor complete.
16/04/03 13:55:54 INFO mapred.LocalJobRunner: Waiting for reduce tasks
16/04/03 13:55:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1780986358_0001_r_000000_0
16/04/03 13:55:54 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/04/03 13:55:54 INFO mapreduce.Job:  map 100% reduce 0%
16/04/03 13:55:54 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@3209e0
16/04/03 13:55:54 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
16/04/03 13:55:54 INFO reduce.EventFetcher: attempt_local1780986358_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
16/04/03 13:55:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1780986358_0001_m_000000_0 decomp: 9062 len: 9066 to MEMORY
16/04/03 13:55:56 INFO reduce.InMemoryMapOutput: Read 9062 bytes from map-output for attempt_local1780986358_0001_m_000000_0
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 9062, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->9062
16/04/03 13:55:57 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
16/04/03 13:55:57 INFO mapred.LocalJobRunner: 1 / 1 copied.
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
16/04/03 13:55:57 INFO mapred.Merger: Merging 1 sorted segments
16/04/03 13:55:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 9056 bytes
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: Merged 1 segments, 9062 bytes to disk to satisfy reduce memory limit
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: Merging 1 files, 9066 bytes from disk
16/04/03 13:55:57 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
16/04/03 13:55:57 INFO mapred.Merger: Merging 1 sorted segments
16/04/03 13:55:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 9056 bytes
16/04/03 13:55:57 INFO mapred.LocalJobRunner: 1 / 1 copied.
16/04/03 13:56:00 INFO mapred.LocalJobRunner: reduce > reduce
16/04/03 13:56:00 INFO mapreduce.Job:  map 100% reduce 100%
16/04/03 13:56:01 INFO mapred.Task: Task:attempt_local1780986358_0001_r_000000_0 is done. And is in the process of committing
16/04/03 13:56:01 INFO mapred.LocalJobRunner: reduce > reduce
16/04/03 13:56:01 INFO mapred.Task: Task attempt_local1780986358_0001_r_000000_0 is allowed to commit now
16/04/03 13:56:02 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1780986358_0001_r_000000_0' to hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000/_temporary/0/task_local1780986358_0001_r_000000
16/04/03 13:56:02 INFO mapred.LocalJobRunner: reduce > reduce
16/04/03 13:56:02 INFO mapred.Task: Task 'attempt_local1780986358_0001_r_000000_0' done.
16/04/03 13:56:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1780986358_0001_r_000000_0
16/04/03 13:56:02 INFO mapred.LocalJobRunner: reduce task executor complete.
16/04/03 13:56:02 INFO mapreduce.Job: Job job_local1780986358_0001 completed successfully
16/04/03 13:56:03 INFO mapreduce.Job: Counters: 38
    File System Counters
        FILE: Number of bytes read=1080562
        FILE: Number of bytes written=1589660
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=73374
        HDFS: Number of bytes written=12847
        HDFS: Number of read operations=15
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=18
    Map-Reduce Framework
        Map input records=100
        Map output records=1248
        Map output bytes=15588
        Map output materialized bytes=9066
        Input split bytes=120
        Combine input records=1248
        Combine output records=597
        Reduce input groups=597
        Reduce shuffle bytes=9066
        Reduce input records=597
        Reduce output records=597
        Spilled Records=1194
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=176
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=241836032
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=36687
    File Output Format Counters 
        Bytes Written=12847
Job ended: Sun Apr 03 13:56:04 IST 2016
The job took 21.44 seconds.
FuzzyJoinDriver(TokensBasic.phase2)
  Input Path:  {hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000}
  Output Path: hdfs://localhost:54310/user/hduser/dblp-small/tokens-000
  Map Jobs:    2
  Reduce Jobs: 1
  Properties:  {fuzzyjoin.similarity.name=Jaccard
                fuzzyjoin.similarity.threshold=.5
                fuzzyjoin.tokenizer=Word
                fuzzyjoin.tokens.package=Scalar
                fuzzyjoin.tokens.lengthstats=false
                fuzzyjoin.ridpairs.group.class=TokenIdentity
                fuzzyjoin.ridpairs.group.factor=1
                fuzzyjoin.data.tokens=
                fuzzyjoin.data.joinindex=}
Job started: Sun Apr 03 13:56:04 IST 2016
16/04/03 13:56:04 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/04/03 13:56:04 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/04/03 13:56:05 INFO mapred.FileInputFormat: Total input paths to process : 1
16/04/03 13:56:05 INFO mapreduce.JobSubmitter: number of splits:1
16/04/03 13:56:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local954589393_0002
16/04/03 13:56:05 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/04/03 13:56:05 INFO mapreduce.Job: Running job: job_local954589393_0002
16/04/03 13:56:05 INFO mapred.LocalJobRunner: OutputCommitter set in config null
16/04/03 13:56:05 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
16/04/03 13:56:05 INFO mapred.LocalJobRunner: Waiting for map tasks
16/04/03 13:56:05 INFO mapred.LocalJobRunner: Starting task: attempt_local954589393_0002_m_000000_0
16/04/03 13:56:05 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/04/03 13:56:05 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000/part-00000:0+12847
16/04/03 13:56:05 INFO mapred.MapTask: numReduceTasks: 1
16/04/03 13:56:06 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/04/03 13:56:06 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/04/03 13:56:06 INFO mapred.MapTask: soft limit at 83886080
16/04/03 13:56:06 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/04/03 13:56:06 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/04/03 13:56:06 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/04/03 13:56:06 INFO mapred.LocalJobRunner: 
16/04/03 13:56:06 INFO mapred.MapTask: Starting flush of map output
16/04/03 13:56:06 INFO mapred.MapTask: Spilling map output
16/04/03 13:56:06 INFO mapred.MapTask: bufstart = 0; bufend = 7866; bufvoid = 104857600
16/04/03 13:56:06 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26212012(104848048); length = 2385/6553600
16/04/03 13:56:06 INFO mapred.MapTask: Finished spill 0
16/04/03 13:56:06 INFO mapred.Task: Task:attempt_local954589393_0002_m_000000_0 is done. And is in the process of committing
16/04/03 13:56:06 INFO mapred.LocalJobRunner: hdfs://localhost:54310/user/hduser/dblp-small/tokens.phase1-000/part-00000:0+12847
16/04/03 13:56:06 INFO mapred.Task: Task 'attempt_local954589393_0002_m_000000_0' done.
16/04/03 13:56:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local954589393_0002_m_000000_0
16/04/03 13:56:06 INFO mapred.LocalJobRunner: map task executor complete.
16/04/03 13:56:06 INFO mapred.LocalJobRunner: Waiting for reduce tasks
16/04/03 13:56:06 INFO mapred.LocalJobRunner: Starting task: attempt_local954589393_0002_r_000000_0
16/04/03 13:56:06 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/04/03 13:56:06 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@4950dd
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
16/04/03 13:56:06 INFO reduce.EventFetcher: attempt_local954589393_0002_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
16/04/03 13:56:06 INFO reduce.LocalFetcher: localfetcher#2 about to shuffle output of map attempt_local954589393_0002_m_000000_0 decomp: 9062 len: 9066 to MEMORY
16/04/03 13:56:06 INFO reduce.InMemoryMapOutput: Read 9062 bytes from map-output for attempt_local954589393_0002_m_000000_0
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 9062, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->9062
16/04/03 13:56:06 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
16/04/03 13:56:06 INFO mapred.LocalJobRunner: 1 / 1 copied.
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
16/04/03 13:56:06 INFO mapred.Merger: Merging 1 sorted segments
16/04/03 13:56:06 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 9056 bytes
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: Merged 1 segments, 9062 bytes to disk to satisfy reduce memory limit
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: Merging 1 files, 9066 bytes from disk
16/04/03 13:56:06 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
16/04/03 13:56:06 INFO mapred.Merger: Merging 1 sorted segments
16/04/03 13:56:06 INFO mapreduce.Job: Job job_local954589393_0002 running in uber mode : false
16/04/03 13:56:06 INFO mapreduce.Job:  map 100% reduce 0%
16/04/03 13:56:06 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 9056 bytes
16/04/03 13:56:06 INFO mapred.LocalJobRunner: 1 / 1 copied.
16/04/03 13:56:06 INFO mapred.Task: Task:attempt_local954589393_0002_r_000000_0 is done. And is in the process of committing
16/04/03 13:56:06 INFO mapred.LocalJobRunner: 1 / 1 copied.
16/04/03 13:56:06 INFO mapred.Task: Task attempt_local954589393_0002_r_000000_0 is allowed to commit now
16/04/03 13:56:06 INFO output.FileOutputCommitter: Saved output of task 'attempt_local954589393_0002_r_000000_0' to hdfs://localhost:54310/user/hduser/dblp-small/tokens-000/_temporary/0/task_local954589393_0002_r_000000
16/04/03 13:56:06 INFO mapred.LocalJobRunner: reduce > reduce
16/04/03 13:56:06 INFO mapred.Task: Task 'attempt_local954589393_0002_r_000000_0' done.
16/04/03 13:56:06 INFO mapred.LocalJobRunner: Finishing task: attempt_local954589393_0002_r_000000_0
16/04/03 13:56:06 INFO mapred.LocalJobRunner: reduce task executor complete.
16/04/03 13:56:07 INFO mapreduce.Job:  map 100% reduce 100%
16/04/03 13:56:07 INFO mapreduce.Job: Job job_local954589393_0002 completed successfully
16/04/03 13:56:07 INFO mapreduce.Job: Counters: 38
    File System Counters
        FILE: Number of bytes read=2179300
        FILE: Number of bytes written=3182466
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=99068
        HDFS: Number of bytes written=31172
        HDFS: Number of read operations=45
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=30
    Map-Reduce Framework
        Map input records=597
        Map output records=597
        Map output bytes=7866
        Map output materialized bytes=9066
        Input split bytes=126
        Combine input records=0
        Combine output records=0
        Reduce input groups=18
        Reduce shuffle bytes=9066
        Reduce input records=597
        Reduce output records=597
        Spilled Records=1194
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=488
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=336207872
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=12847
    File Output Format Counters 
        Bytes Written=5478
Job ended: Sun Apr 03 13:56:07 IST 2016
The job took 3.563 seconds.
Multi-Job ended: Sun Apr 03 13:56:07 IST 2016
The multi-job took 25.128 seconds.
FuzzyJoinDriver(RIDPairsImproved)
  Input Path:  {hdfs://localhost:54310/user/hduser/dblp-small/records-000}
  Output Path: hdfs://localhost:54310/user/hduser/dblp-small/ridpairs-000
  Map Jobs:    2
  Reduce Jobs: 1
  Properties:  {fuzzyjoin.similarity.name=Jaccard
                fuzzyjoin.similarity.threshold=.5
                fuzzyjoin.tokenizer=Word
                fuzzyjoin.tokens.package=Scalar
                fuzzyjoin.tokens.lengthstats=false
                fuzzyjoin.ridpairs.group.class=TokenIdentity
                fuzzyjoin.ridpairs.group.factor=1
                fuzzyjoin.data.tokens=dblp-small/tokens-000/part-00000
                fuzzyjoin.data.joinindex=}
Job started: Sun Apr 03 13:56:08 IST 2016
16/04/03 13:56:08 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/04/03 13:56:08 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/04/03 13:56:09 INFO mapred.FileInputFormat: Total input paths to process : 1
16/04/03 13:56:09 INFO mapreduce.JobSubmitter: number of splits:1
16/04/03 13:56:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1951342027_0003
16/04/03 13:56:16 INFO mapred.LocalDistributedCacheManager: Creating symlink: /tmp/mapred/local/1459671970648/part-00000 <- /home/midhu/fuzzyjoin/fuzzyjoin-hadoop/part-00000
16/04/03 13:56:16 INFO mapred.LocalDistributedCacheManager: Localized hdfs://localhost:54310/user/hduser/dblp-small/tokens-000/part-00000 as file:/tmp/mapred/local/1459671970648/part-00000
16/04/03 13:56:17 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
16/04/03 13:56:17 INFO mapreduce.Job: Running job: job_local1951342027_0003
16/04/03 13:56:17 INFO mapred.LocalJobRunner: OutputCommitter set in config null
16/04/03 13:56:17 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
16/04/03 13:56:17 INFO mapred.LocalJobRunner: Waiting for map tasks
16/04/03 13:56:17 INFO mapred.LocalJobRunner: Starting task: attempt_local1951342027_0003_m_000000_0
16/04/03 13:56:17 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
16/04/03 13:56:17 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/hduser/dblp-small/records-000/part-00000:0+36687
16/04/03 13:56:17 INFO mapred.MapTask: numReduceTasks: 1
16/04/03 13:56:17 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
16/04/03 13:56:17 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
16/04/03 13:56:17 INFO mapred.MapTask: soft limit at 83886080
16/04/03 13:56:17 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
16/04/03 13:56:17 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
16/04/03 13:56:17 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
16/04/03 13:56:17 INFO mapred.LocalJobRunner: map task executor complete.
16/04/03 13:56:17 WARN mapred.LocalJobRunner: job_local1951342027_0003
java.lang.Exception: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 10 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 15 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 18 more
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: file:/tmp/mapred/local/1459671970648/part-00000 (No such file or directory)
    at edu.uci.ics.fuzzyjoin.tokenorder.TokenLoad.loadTokenRank(TokenLoad.java:60)
    at edu.uci.ics.fuzzyjoin.tokenorder.TokenLoad.loadTokenRank(TokenLoad.java:40)
    at edu.uci.ics.fuzzyjoin.hadoop.ridpairs.token.MapSelfJoin.configure(MapSelfJoin.java:98)
    ... 23 more
Caused by: java.io.FileNotFoundException: file:/tmp/mapred/local/1459671970648/part-00000 (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at java.io.FileInputStream.<init>(FileInputStream.java:101)
    at edu.uci.ics.fuzzyjoin.tokenorder.TokenLoad.loadTokenRank(TokenLoad.java:45)
    ... 25 more
16/04/03 13:56:18 INFO mapreduce.Job: Job job_local1951342027_0003 running in uber mode : false
16/04/03 13:56:18 INFO mapreduce.Job:  map 0% reduce 0%
16/04/03 13:56:18 INFO mapreduce.Job: Job job_local1951342027_0003 failed with state FAILED due to: NA
16/04/03 13:56:18 INFO mapreduce.Job: Counters: 0
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
    at edu.uci.ics.fuzzyjoin.hadoop.FuzzyJoinDriver.run(FuzzyJoinDriver.java:179)
    at edu.uci.ics.fuzzyjoin.hadoop.ridpairs.RIDPairsImproved.main(RIDPairsImproved.java:108)
    at edu.uci.ics.fuzzyjoin.hadoop.FuzzyJoin.bib(FuzzyJoin.java:39)
    at edu.uci.ics.fuzzyjoin.hadoop.FuzzyJoin.main(FuzzyJoin.java:86)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
    at edu.uci.ics.fuzzyjoin.hadoop.FuzzyJoinDriver.main(FuzzyJoinDriver.java:121)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

I think this is a Hadoop configuration error on Ubuntu. I used the configuration from this tutorial:
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
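
A note on the trace above: the innermost exception message shows the string handed to `FileInputStream` as `file:/tmp/mapred/local/1459671970648/part-00000`, that is, with a `file:` scheme prefix. `java.io.FileInputStream(String)` treats its argument as a plain path and does not interpret URI schemes, so a string of that form would probably fail to open even when the localized file exists. This is only a reading of the log and has not been checked against the fuzzyjoin source. The sketch below reproduces the behaviour with a temporary file; the class and method names are hypothetical, and a Unix-like file system is assumed.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileUriDemo {

    // Creates an empty temporary file and returns a "file:/..." string for it,
    // the same shape as the path in the trace (assumes a Unix-like absolute path).
    public static String newTempFileUri() {
        try {
            Path tmp = Files.createTempFile("part-", ".txt");
            tmp.toFile().deleteOnExit();
            return "file:" + tmp.toAbsolutePath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Passes the string straight to FileInputStream, which reads it as a plain path.
    public static boolean opensAsPlainPath(String name) {
        try (InputStream in = new FileInputStream(name)) {
            return true;
        } catch (FileNotFoundException e) {
            System.out.println("FileNotFoundException: " + e.getMessage());
            return false;
        } catch (IOException e) {
            return false;
        }
    }

    // Converts the file: URI to a java.io.File first, then opens it.
    public static boolean opensViaUri(String uri) {
        try (InputStream in = new FileInputStream(new File(URI.create(uri)))) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String uri = newTempFileUri();
        System.out.println("as plain path: " + opensAsPlainPath(uri));
        System.out.println("via URI:       " + opensViaUri(uri));
    }
}
```

On a Unix-like system the first call fails although the file exists, and the second succeeds. Whether this is the whole story for the failing job is an assumption; it only shows why a `file:`-prefixed string and `FileInputStream` do not mix.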

Best answer

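In the end I got the code to run and resolved the error. The error was caused by running the MapReduce program locally on the machine; I changed it to run on YARN, and the code then worked for all types of data.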
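
For context, the `job_local...` job IDs and the `mapred.LocalJobRunner` lines in the trace indicate that the jobs ran with the local framework, which is the default value of the Hadoop property `mapreduce.framework.name`. Switching to YARN is usually done by setting that property to `yarn` in `mapred-site.xml` and starting the YARN daemons; the answer does not say exactly which change was made, so the following is a sketch under that assumption. It embeds an example `mapred-site.xml` body as a string and reads the property back with the JDK's XML parser, so it runs without Hadoop on the classpath. The class name and helper methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class FrameworkNameCheck {

    // Example mapred-site.xml content (an assumption, not taken from the answer):
    // this single property selects YARN instead of the local job runner.
    public static final String MAPRED_SITE_YARN =
          "<configuration>\n"
        + "  <property>\n"
        + "    <name>mapreduce.framework.name</name>\n"
        + "    <value>yarn</value>\n"
        + "  </property>\n"
        + "</configuration>\n";

    // Returns the configured framework name, or "local" (Hadoop's default) when unset.
    public static String frameworkName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                if ("mapreduce.framework.name".equals(text(prop, "name"))) {
                    return text(prop, "value");
                }
            }
            return "local";
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag, or null.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        System.out.println(frameworkName(MAPRED_SITE_YARN));    // prints "yarn"
        System.out.println(frameworkName("<configuration/>"));  // prints "local"
    }
}
```

With the property set to `yarn`, job IDs would normally no longer carry the `job_local` prefix, which gives a quick way to confirm from the console output which runner was used.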

Regarding java - Running a Hadoop program on a single-node cluster setup of Hadoop 2.6 on Ubuntu 14.04, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36383516/
