hadoop - Remote IOException

Tags: hadoop

I get the following exception when running the wordcount example.

2014-11-29 09:29:28,179 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1417232449434_0005_r_000000_3: Error: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /outputwords1/_temporary/1/_temporary/attempt_1417232449434_0005_r_000000_3/part-r-00000 could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)

Best answer

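This may be caused by the dfs.replication setting. On a single-DataNode setup, set it to 1. If that does not resolve the problem, check that all of the properties below are present in the corresponding configuration files.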

Set the following properties in hadoop/conf/hdfs-site.xml:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/var/hdfs</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/user17/mydata/hdfs/datanode</value>
</property>

Set the following property in hadoop/conf/core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000/</value>
</property>

Set the following property in hadoop/conf/mapred-site.xml:
<property>
   <name>mapred.job.tracker</name>
   <value>localhost:9001</value>
   <description>Host and port for jobtracker. As we use localhost,
    it will be single map and reduce task.</description>
</property>

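Then stop the daemons, reformat the NameNode, and start everything again. Note that formatting the NameNode erases the existing HDFS metadata, so only do this on a cluster whose data you can afford to lose: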
stop-all.sh
hadoop namenode -format
start-all.sh
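
Before reformatting, it may be worth checking whether the DataNode actually has usable space, since a full or missing data directory is one frequently reported cause of "could only be replicated to 0 nodes" even when a DataNode is running. The following is a minimal sketch, not a definitive procedure. It assumes a Hadoop 2.x install with `hdfs` on the PATH; `free_kb` and the `DATA_DIR` variable are hypothetical names introduced here, and the default path is simply the `dfs.datanode.data.dir` value from the configuration above.

```shell
#!/bin/sh
# Diagnostic sketch (assumption: Hadoop 2.x with `hdfs` on PATH).

# Print the free space, in KB, on the filesystem that holds directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical default: the dfs.datanode.data.dir path used in hdfs-site.xml above.
DATA_DIR="${DATA_DIR:-/home/user17/mydata/hdfs/datanode}"

if [ -d "$DATA_DIR" ]; then
    echo "Free space under $DATA_DIR: $(free_kb "$DATA_DIR") KB"
else
    echo "Data directory $DATA_DIR does not exist on this host"
fi

if command -v hdfs >/dev/null 2>&1; then
    # Capacity, remaining DFS space, and live DataNodes as the NameNode sees them.
    hdfs dfsadmin -report
else
    echo "hdfs not found on PATH; skipping dfsadmin report"
fi
```

If the report shows a live DataNode with little or no remaining DFS space, freeing disk space or pointing `dfs.datanode.data.dir` at a larger volume may be enough, without reformatting the NameNode.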

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/27199182/
