hadoop - Hadoop MapReduce word count fails on the first run?

Tags: hadoop mapreduce

When I run the Hadoop word count example, it fails the first time. Here is what I am doing:

  • Format the NameNode: $HADOOP_HOME/bin/hdfs namenode -format
  • Start HDFS / YARN:
    $HADOOP_HOME/sbin/start-dfs.sh
    $HADOOP_HOME/sbin/start-yarn.sh
    $HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager
    
  • Run word count: hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount input output

  • (Assume the input folder already exists in HDFS; I am not listing every command here.)

    Output:
    16/07/17 01:04:34 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/172.20.0.2:8032
    16/07/17 01:04:35 INFO input.FileInputFormat: Total input paths to process : 2
    16/07/17 01:04:35 INFO mapreduce.JobSubmitter: number of splits:2
    16/07/17 01:04:36 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468688654488_0001
    16/07/17 01:04:36 INFO impl.YarnClientImpl: Submitted application application_1468688654488_0001
    16/07/17 01:04:36 INFO mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1468688654488_0001/
    16/07/17 01:04:36 INFO mapreduce.Job: Running job: job_1468688654488_0001
    16/07/17 01:04:46 INFO mapreduce.Job: Job job_1468688654488_0001 running in uber mode : false
    16/07/17 01:04:46 INFO mapreduce.Job:  map 0% reduce 0%
    Terminated
    

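    After that, HDFS crashes, so I cannot reach http://localhost:50070/
    I then restart everything (repeating step 2) and rerun the example, and everything works.

    How can I fix this for the first run? My HDFS obviously holds no data the first time; maybe that is the problem?

    Update:

    Running an even simpler example also fails: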
    hadoop@8f98bf86ceba:~$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples*.jar pi 3 3
    
    Number of Maps  = 3
    Samples per Map = 3
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Starting Job
    16/07/17 03:21:28 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/172.20.0.3:8032
    16/07/17 03:21:29 INFO input.FileInputFormat: Total input paths to process : 3
    16/07/17 03:21:29 INFO mapreduce.JobSubmitter: number of splits:3
    16/07/17 03:21:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468696855031_0001
    16/07/17 03:21:31 INFO impl.YarnClientImpl: Submitted application application_1468696855031_0001
    16/07/17 03:21:31 INFO mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1468696855031_0001/
    16/07/17 03:21:31 INFO mapreduce.Job: Running job: job_1468696855031_0001
    16/07/17 03:21:43 INFO mapreduce.Job: Job job_1468696855031_0001 running in uber mode : false
    16/07/17 03:21:43 INFO mapreduce.Job:  map 0% reduce 0%
    

    Same problem: HDFS terminates.
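To see which process actually died, one option is to compare the `jps` listing before and after submitting the job. The sketch below assumes `jps` prints one `<pid> <MainClass>` pair per line. The function name `missing_daemons` is hypothetical and not part of Hadoop. It only reports which expected daemons are absent from the listing and does not explain why they stopped.

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop tool): given jps-style output and a list
# of expected daemon class names, print the names that are NOT present.
missing_daemons() {
  jps_output=$1
  shift
  missing=""
  for name in "$@"; do
    # Assumes each jps line looks like "<pid> <MainClass>"; match the whole line.
    if ! printf '%s\n' "$jps_output" | grep -Eq "^[0-9]+ ${name}\$"; then
      missing="${missing:+$missing }$name"
    fi
  done
  printf '%s\n' "$missing"
}
```

On the master node this could be called as `missing_daemons "$(jps)" NameNode DataNode ResourceManager NodeManager`, once before `hadoop jar ...` and once after the `Terminated` message. Which daemons run on which host depends on the cluster layout, so the list of names is an assumption to adjust.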

    Best answer

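    Your post looks incomplete, so the cause cannot be inferred from it. My guess is that hadoop-mapreduce-examples-2.7.2-sources.jar is not what you want. You more likely need hadoop-mapreduce-examples-2.7.2.jar, which contains the .class files rather than the source files.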
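If the jar is picked with a glob, as in the `pi` run above, a small filter can make sure a `-sources` jar is never passed to `hadoop jar`. This is a minimal sketch: the helper name `pick_examples_jar` is hypothetical, and it only inspects the file names it is given.

```shell
#!/bin/sh
# Hypothetical helper: print the first hadoop-mapreduce-examples jar among the
# arguments that is not a -sources or -tests jar; return 1 if none qualifies.
pick_examples_jar() {
  for jar in "$@"; do
    case "$jar" in
      *-sources.jar|*-tests.jar)
        continue ;;                      # source/test archives have no runnable driver
      *hadoop-mapreduce-examples-*.jar)
        printf '%s\n' "$jar"
        return 0 ;;
    esac
  done
  return 1
}
```

A possible call is `pick_examples_jar "$HADOOP_HOME"/share/hadoop/mapreduce/hadoop-mapreduce-examples*.jar`, with the printed path then passed to `hadoop jar`.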

    Source: a similar question on Stack Overflow, "Hadoop MapReduce word count fails on the first run?": https://stackoverflow.com/questions/38410680/
