hadoop - Hadoop 2.6 multi-node cluster

Tags: hadoop, high-availability

I have 2 NameNodes and 4 DataNodes: the NameNodes themselves also run as DataNodes, and there are 2 further dedicated DataNodes.

lsof -i:2181 -s

On namenode1 it shows namenode1/x.x.x.x->namenode1, and on namenode2 it shows namenode2/x.x.x.x->namenode2.
When I test automatic failover by killing the PID of the active NameNode, both web UIs go down. The error log from `hdfs dfsadmin -report` is:

on namenode2 ==>>

    java.net.ConnectException: Call From namenode2/10.7.1.65 to namenode2:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.getFsStats(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats(ClientNamenodeProtocolTranslatorPB.java:593)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getStats(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDiskStatus(DFSClient.java:2360)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getStatus(DistributedFileSystem.java:937)
    at org.apache.hadoop.fs.FileSystem.getStatus(FileSystem.java:2254)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.report(DFSAdmin.java:431)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1763)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:1941)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:740)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 19 more
15/09/01 10:57:03 WARN retry.RetryInvocationHandler: Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats over namenode1/10.7.1.73:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
java.net.ConnectException: Call From namenode2/10.7.1.65 to namenode1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.getFsStats(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats(ClientNamenodeProtocolTranslatorPB.java:593)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getStats(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDiskStatus(DFSClient.java:2360)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getStatus(DistributedFileSystem.java:937)
    at org.apache.hadoop.fs.FileSystem.getStatus(FileSystem.java:2254)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.report(DFSAdmin.java:431)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1763)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:1941)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:740)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 19 more
report: Call From namenode2/10.7.1.65 to namenode1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
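"Connection refused" in these logs simply means nothing is listening on port 8020 of the target host, i.e. the NameNode process there is down. A quick way to confirm that, independent of the Hadoop CLI, is a plain TCP connect check (the hostnames come from the logs above and will only resolve on the cluster; this is a minimal sketch):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds, i.e. something is listening."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and DNS failures alike.
        return False

# Hostnames/ports taken from the logs above; these only resolve on the cluster:
# print(port_open("namenode1", 8020), port_open("namenode2", 8020))
```

If this returns False for both NameNodes after the kill, the standby never transitioned to active, which points at the ZKFC/ZooKeeper setup rather than at the client.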


**on namenode1** ==>>

java.net.ConnectException: Call From namenode1/10.7.1.73 to namenode2:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.getFsStats(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats(ClientNamenodeProtocolTranslatorPB.java:593)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getStats(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDiskStatus(DFSClient.java:2360)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getStatus(DistributedFileSystem.java:937)
    at org.apache.hadoop.fs.FileSystem.getStatus(FileSystem.java:2254)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.report(DFSAdmin.java:431)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1763)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:1941)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:740)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 19 more
15/09/01 11:32:51 INFO retry.RetryInvocationHandler: Exception while invoking getStats of class ClientNamenodeProtocolTranslatorPB over namenode1/10.7.1.73:8020 after 3 fail over attempts. Trying to fail over after sleeping for 3172ms.
java.net.ConnectException: Call From namenode1/10.7.1.73 to namenode1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.getFsStats(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats(ClientNamenodeProtocolTranslatorPB.java:593)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getStats(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDiskStatus(DFSClient.java:2360)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getStatus(DistributedFileSystem.java:937)
    at org.apache.hadoop.fs.FileSystem.getStatus(FileSystem.java:2254)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.report(DFSAdmin.java:431)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1763)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:1941)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:740)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 19 more
15/09/01 11:32:55 INFO retry.RetryInvocationHandler: Exception while invoking getStats of class ClientNamenodeProtocolTranslatorPB over namenode2/10.7.1.65:8020 after 4 fail over attempts. Trying to fail over after sleeping for 5263ms
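Since the goal is to test failover, it may also be worth exercising it through `hdfs haadmin` instead of killing PIDs, and checking the HA state of each NameNode first. These commands are environment-bound and the NameNode IDs `nn1`/`nn2` are assumptions — they must match `dfs.ha.namenodes.<nameservice>` in hdfs-site.xml:

```shell
# Report the HA state (active/standby) of each NameNode
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Trigger a graceful failover from nn1 to nn2
hdfs haadmin -failover nn1 nn2
```

If `-getServiceState` already fails before any kill, the HA configuration itself is broken.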


`jps` output on the NameNodes before failover ==>>

namenode1 :::

14187 NodeManager

13959 DFSZKFailoverController

12974 JournalNode

14051 ResourceManager

15270 Jps

12543 NameNode

13480 DataNode

namenode2 :::

9569 DFSZKFailoverController

9699 NodeManager

10731 Jps

8958 JournalNode

8946 NameNode

9278 DataNode


Please help me.

Best answer

I think you should check your configuration files.

core-site.xml

fs.defaultFS

hdfs://asdfg (the logical URI of the nameservice — note the hdfs:// scheme, not http://)

ha.zookeeper.quorum

h1.c1:2181,h1.c2:2181,h1.c3:2181 (the quorum list)

hdfs-site.xml

dfs.nameservices

asdfg

dfs.ha.namenodes.asdfg

nn1,nn2
....
....
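Put together, the HA-related properties the answer lists would look roughly like this as config fragments (the nameservice `asdfg`, NameNode IDs `nn1`/`nn2`, and quorum hosts are the answer's placeholders, not values from the question):

```xml
<!-- core-site.xml -->
<property>
  <name>fs.defaultFS</name>
  <!-- logical URI; must match dfs.nameservices -->
  <value>hdfs://asdfg</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>h1.c1:2181,h1.c2:2181,h1.c3:2181</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.nameservices</name>
  <value>asdfg</value>
</property>
<property>
  <name>dfs.ha.namenodes.asdfg</name>
  <value>nn1,nn2</value>
</property>
```

The key consistency requirement is that the nameservice name appears identically in `fs.defaultFS`, `dfs.nameservices`, and the `dfs.ha.namenodes.*` property suffix.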

see this link

Regarding hadoop - Hadoop 2.6 multi-node cluster, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32324760/
