java - Why is the hadoop output file part-r-00000 empty?

Tags: java hadoop mapreduce

My MR job log is:

[root@sicongli hadoop-2.4.1]# hadoop jar flowcount.jar   
cn.itheima.bigdata.hadoop.mr.flowcount.FlowCount /data/join.txt /out
16/04/13 23:32:20 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
16/04/13 23:32:22 INFO client.RMProxy: Connecting to ResourceManager at sicongli/192.168.218.111:8032

16/04/13 23:32:28 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/04/13 23:32:35 INFO input.FileInputFormat: Total input paths to process : 1
16/04/13 23:32:38 INFO mapreduce.JobSubmitter: number of splits:1
16/04/13 23:32:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460601112521_0002
16/04/13 23:32:47 INFO impl.YarnClientImpl: Submitted application application_1460601112521_0002
16/04/13 23:32:47 INFO mapreduce.Job: The url to track the job: http://sicongli:8088/proxy/application_1460601112521_0002/
16/04/13 23:32:47 INFO mapreduce.Job: Running job: job_1460601112521_0002
16/04/13 23:35:20 INFO mapreduce.Job: Job job_1460601112521_0002 running in uber mode : false
16/04/13 23:35:28 INFO mapreduce.Job:  map 0% reduce 0%
16/04/13 23:36:47 INFO mapreduce.Job:  map 100% reduce 0%
16/04/13 23:37:25 INFO mapreduce.Job:  map 100% reduce 100%
16/04/13 23:37:48 INFO mapreduce.Job: Job job_1460601112521_0002 completed successfully
16/04/13 23:38:16 INFO mapreduce.Job: Counters: 49
    File System Counters
            FILE: Number of bytes read=6
            FILE: Number of bytes written=186579
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=399
            HDFS: Number of bytes written=0
            HDFS: Number of read operations=6
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
    Job Counters 
            Launched map tasks=1
            Launched reduce tasks=1
            Data-local map tasks=1
            Total time spent by all maps in occupied slots (ms)=17296
            Total time spent by all reduces in occupied slots (ms)=36727
            Total time spent by all map tasks (ms)=17296
            Total time spent by all reduce tasks (ms)=36727
            Total vcore-seconds taken by all map tasks=17296
            Total vcore-seconds taken by all reduce tasks=36727
            Total megabyte-seconds taken by all map tasks=17711104
            Total megabyte-seconds taken by all reduce tasks=37608448
    Map-Reduce Framework
            Map input records=23
            Map output records=0
            Map output bytes=0
            Map output materialized bytes=6
            Input split bytes=99
            Combine input records=0
            Combine output records=0
            Reduce input groups=0
            Reduce shuffle bytes=6
            Reduce input records=0
            Reduce output records=0
            Spilled Records=0
            Shuffled Maps =1
            Failed Shuffles=0
            Merged Map outputs=1
            GC time elapsed (ms)=217
            CPU time spent (ms)=1150
            Physical memory (bytes) snapshot=277962752
            Virtual memory (bytes) snapshot=1689296896
            Total committed heap usage (bytes)=127127552
    Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
    File Input Format Counters 
            Bytes Read=300
    File Output Format Counters 
            Bytes Written=0
16/04/13 23:38:18 INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/04/13 23:38:19 INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/04/13 23:38:20 INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/04/13 23:38:23 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server

The output is:

[root@sicongli ~]# hadoop fs -ls /out
16/04/14 00:00:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   3 root supergroup          0 2016-04-13 23:37 /out/_SUCCESS
-rw-r--r--   3 root supergroup          0 2016-04-13 23:37 /out/part-r-00000

I have two questions:

One: why is the output file part-r-00000 empty?

Two: why do these warnings appear: INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)

Best Answer

Question 1 - look at the counters:

Map input records=23

Map output records=0

part-r-00000 is empty because your map task did not emit anything. If you add the code of your map task to your post, we can probably tell you why.

Question 2 - read the answers to this question; they may help you. In short, the retries typically happen because the job has already finished, so the ApplicationMaster the client was talking to is gone, and the client then redirects to the job history server, as the last log line shows; they are harmless here.

Regarding "java - Why is the hadoop output file part-r-00000 empty?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36615938/
