Hadoop 数据节点启动失败

标签 hadoop nodes startup failovercluster

我有一个包含 11 个节点的 Hadoop 集群。一个节点作为主节点,10 个从节点运行 DATANODE & TASKTRACKERS。
TASK TRACKER 在所有从节点上启动。 DATANODE 只启动了 10 个节点中的 6 个 以下是来自 /hadoop/logs/...Datanode....log 文件的日志。

2014-12-03 17:55:05,057 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 

/************************************************************

STARTUP_MSG: Starting DataNode

STARTUP_MSG:   host = trans9/192.168.0.16

STARTUP_MSG:   args = []

STARTUP_MSG:   version = 1.2.1

STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 
1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013

STARTUP_MSG:   java = 1.7.0_65

************************************************************/

2014-12-03 17:55:05,371 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties

2014-12-03 17:55:05,384 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.

2014-12-03 17:55:05,385 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).

2014-12-03 17:55:05,385 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started

2014-12-03 17:55:05,776 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.

2014-12-03 17:55:05,789 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!

2014-12-03 17:55:08,850 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean

2014-12-03 17:55:08,865 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened data transfer server at 50010

2014-12-03 17:55:08,867 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s

2014-12-03 17:55:08,876 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library

2014-12-03 17:55:08,962 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog

2014-12-03 17:55:09,055 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)

2014-12-03 17:55:09,068 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false

2014-12-03 17:55:09,068 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075

2014-12-03 17:55:09,068 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075

2014-12-03 17:55:09,068 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075

2014-12-03 17:55:09,068 INFO org.mortbay.log: jetty-6.1.26

2014-12-03 17:55:09,804 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075

2014-12-03 17:55:09,812 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.

2014-12-03 17:55:09,813 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source DataNode registered.

2014-12-03 17:55:09,893 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort50020 registered.

2014-12-03 17:55:09,894 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort50020 registered.

2014-12-03 17:55:09,895 INFO org.apache.hadoop.ipc.Server: Starting SocketReader

2014-12-03 17:55:09,903 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(slave9:50010, storageID=DS-551911532-192.168.0.31-50010-1417617118848, infoPort=50075, ipcPort=50020)

2014-12-03 17:55:09,914 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Finished generating blocks being written report for 1 volumes in 0 seconds

2014-12-03 17:55:09,933 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Finished asynchronous block report scan in 5ms

2014-12-03 17:55:09,933 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.16:50010, storageID=DS-551911532-192.168.0.31-50010-1417617118848, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/home/ubuntu/hadoop/hadoop-data/dfs/data/current'}

2014-12-03 17:55:09,945 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting

2014-12-03 17:55:09,946 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting

2014-12-03 17:55:09,946 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting

2014-12-03 17:55:09,955 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting

2014-12-03 17:55:09,955 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec

2014-12-03 17:55:09,959 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting

2014-12-03 17:55:10,140 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 192.168.0.16:50010 is attempting to report storage ID DS-551911532-192.168.0.31-50010-1417617118848. Node 192.168.0.14:50010 is expected to serve this storage.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDatanode(FSNamesystem.java:5049)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processReport(FSNamesystem.java:3939)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReport(NameNode.java:1095)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

    at org.apache.hadoop.ipc.Client.call(Client.java:1113)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at com.sun.proxy.$Proxy3.blockReport(Unknown Source)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1084)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1588)
    at java.lang.Thread.run(Thread.java:745)


2014-12-03 17:55:10,144 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:50075

2014-12-03 17:55:10,147 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020

2014-12-03 17:55:10,147 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: exiting

2014-12-03 17:55:10,147 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: exiting

2014-12-03 17:55:10,147 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: exiting

2014-12-03 17:55:10,148 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 50020

2014-12-03 17:55:10,148 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder

2014-12-03 17:55:10,149 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down

2014-12-03 17:55:10,149 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.16:50010, storageID=DS-551911532-192.168.0.31-50010-1417617118848, infoPort=50075, ipcPort=50020):DataXceiveServer:java.nio.channels.AsynchronousCloseException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:205)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:248)
    at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:100)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:132)
    at java.lang.Thread.run(Thread.java:745)


2014-12-03 17:55:10,149 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting DataXceiveServer

2014-12-03 17:55:10,149 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0

2014-12-03 17:55:10,150 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...

2014-12-03 17:55:10,151 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down

2014-12-03 17:55:10,151 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.0.16:50010, storageID=DS-551911532-192.168.0.31-50010-1417617118848, infoPort=50075, ipcPort=50020):Finishing DataNode in: FSDataset{dirpath='/home/ubuntu/hadoop/hadoop-data/dfs/data/current'}

2014-12-03 17:55:10,152 WARN org.apache.hadoop.metrics2.util.MBeans: Hadoop:service=DataNode,name=DataNodeInfo
javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=DataNodeInfo
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.unRegisterMXBean(DataNode.java:586)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:855)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1601)
    at java.lang.Thread.run(Thread.java:745)

2014-12-03 17:55:10,152 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020

2014-12-03 17:55:10,152 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down

2014-12-03 17:55:10,153 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0

2014-12-03 17:55:10,153 WARN org.apache.hadoop.metrics2.util.MBeans: Hadoop:service=DataNode,name=FSDatasetState-DS-551911532-192.168.0.31-50010-1417617118848
javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=FSDatasetState-DS-551911532-192.168.0.31-50010-1417617118848
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.shutdown(FSDataset.java:2093)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:917)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1601)
    at java.lang.Thread.run(Thread.java:745)

2014-12-03 17:55:10,159 WARN org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: AsyncDiskService has already shut down.

2014-12-03 17:55:10,159 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode

2014-12-03 17:55:10,166 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
/************************************************************

SHUTDOWN_MSG: Shutting down DataNode at trans9/192.168.0.16
************************************************************/

最佳答案

哦,一个系统用于名称节点,其余系统分配给数据节点是不推荐的。我为 Name node 推荐一个系统,为 Job Tracker 推荐另一个系统,为 secondary name node 推荐另一个系统。其余8个系统should data node强烈推荐。

回答你的问题。格式化名称节点以解决此问题。也重新打开终端。有些时候网络也是原因之一。

关于Hadoop 数据节点启动失败,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/27278590/

相关文章:

hadoop - Hadoop 上的代理框架

java - 使用维基百科数据时如何提高性能?网页?

algorithm - 寻找二叉树中的最长路径

algorithm - 在图中找到距离至少为 D(常数) 的两条路径

android - 启动没有 Activity 的android应用程序

c# - UWP 应用程序在启动时自动启动

java - 如何更改 org.apache.commons.logging.Log.info ("massage") 将写入日志文件

hadoop - 确保在Kafka Connect中交付一次

c - 如何仅使用一个变量跟踪链表中的所有节点?

ios - 如何改进我的 Adob​​e Air iOS 应用程序启动时间?