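I have a cluster with 1 name node and 6 data nodes. After decommissioning 3 of the data nodes, our YARN service has been consistently unstable, and the NodeManager on one of the data nodes never seems to start successfully. I then tried restarting the NodeManager on that box. Here is the log: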
2014-08-01 11:19:08,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2014-08-01 11:19:08,217 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from box708.datafireball.com, Sending SHUTDOWN signal to the NodeManager.
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:185)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:197)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:352)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:398)
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from box708.datafireball.com, Sending SHUTDOWN signal to the NodeManager.
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:255)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:179)
... 6 more
I googled this error but could not find a solution. Can anyone offer some guidance?
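Best answer
Message from ResourceManager: Disallowed NodeManager
This message means that your NodeManager is either not in the list of allowed node managers or is in the exclude list.
Check the ResourceManager's configuration for the following properties: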
yarn.resourcemanager.nodes.include-path
yarn.resourcemanager.nodes.exclude-path
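As a rough way to run that check, the sketch below reads the two properties from a yarn-site.xml file and reports whether a given hostname appears in the files they point to. It is only an illustration under assumptions: the function names are made up here, the hosts files are assumed to hold whitespace-separated hostnames with `#` starting a comment, and an empty or unset include list is assumed to mean that every host is allowed. It matches on hostname only and does not look at IP addresses.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

INCLUDE_KEY = "yarn.resourcemanager.nodes.include-path"
EXCLUDE_KEY = "yarn.resourcemanager.nodes.exclude-path"


def read_property(yarn_site, name):
    """Return the value of a property in yarn-site.xml, or '' if it is unset."""
    root = ET.parse(str(yarn_site)).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return ""


def read_hosts(path):
    """Read a hosts file (assumed format: whitespace-separated names, '#' comments)."""
    hosts = set()
    if not path or not Path(path).is_file():
        return hosts  # property unset or file missing: treat as an empty list
    for line in Path(path).read_text().splitlines():
        hosts.update(line.split("#", 1)[0].split())
    return hosts


def node_status(yarn_site, hostname):
    """Classify a hostname as 'excluded', 'not included', or 'allowed'."""
    include = read_hosts(read_property(yarn_site, INCLUDE_KEY))
    exclude = read_hosts(read_property(yarn_site, EXCLUDE_KEY))
    if hostname in exclude:
        return "excluded"
    # Assumption: an empty include list places no restriction on hosts.
    if include and hostname not in include:
        return "not included"
    return "allowed"
```

Calling `node_status` with the path to the ResourceManager's yarn-site.xml and `"box708.datafireball.com"` would indicate which list to edit. After changing either file, the ResourceManager usually needs to reload them, for example with `yarn rmadmin -refreshNodes`, before the NodeManager can register again.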
Regarding hadoop - unable to start a NodeManager after some nodes were decommissioned, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25083601/