I am basically trying to append data to a file that already exists in HDFS. This is the exception I get:
03:49:54,456 WARN org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run:628 DataStreamer Exception
java.lang.NullPointerException
at com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336)
at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$UpdatePipelineRequestProto$Builder.addAllStorageIDs(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updatePipeline(ClientNamenodeProtocolTranslatorPB.java:842)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1238)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:532)
My replication factor is 1, and I am using Apache's Hadoop distribution, version 2.5.0. Here is the snippet I use to create the file if it does not exist, or open it in append mode if it does:
String url = getHadoopUrl() + fileName;
Path file = new Path(url);
try {
    if (append) {
        // Append if the file already exists, otherwise create it
        if (hadoopFileSystem.exists(file))
            fsDataOutputStream = hadoopFileSystem.append(file);
        else
            fsDataOutputStream = hadoopFileSystem.create(file);
    } else {
        fsDataOutputStream = hadoopFileSystem.create(file);
    }
} catch (IOException e) {
    e.printStackTrace();
}
I am not sure what is causing this exception. After reading various sources, I am also confused about whether HDFS supports appending at all. Let me know what I am missing here.
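To the question of whether HDFS supports append: it does in the Hadoop 2.x line, and in 2.5.0 it is enabled by default. On clusters where it was explicitly turned off, the flag below in hdfs-site.xml controls it. This is a sketch for reference; check your cluster's actual configuration rather than assuming it was changed:

```xml
<!-- hdfs-site.xml: append support flag. Defaults to true in Hadoop 2.x,
     so this only matters if someone disabled it explicitly. -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```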
Edit: adding the stack trace that I found in the datanode logs:
2015-10-30 16:19:54,435 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1012136337-192.168.123.103-1411103100884:blk_1073742239_1421 src: /127.0.0.1:54160 dest: /127.0.0.1:50010
2015-10-30 16:19:54,435 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Appending to FinalizedReplica, blk_1073742239_1421, FINALIZED
getNumBytes() = 812
getBytesOnDisk() = 812
getVisibleLength()= 812
getVolume() = /Users/niranjan/hadoop/hdfs/datanode/current
getBlockFile() = /Users/niranjan/hadoop/hdfs/datanode/current/BP-1012136337-192.168.123.103-1411103100884/current/finalized/blk_1073742239
unlinked = false
2015-10-30 16:19:54,461 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1012136337-192.168.123.103-1411103100884:blk_1073742239_1422
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:435)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:693)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:569)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:745)
Best Answer
From searching around, I found that adding the following to your hdfs-site.xml may help:
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>8192</value>
</property>
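Separately, since the replication factor here is 1, appends can also fail during pipeline recovery: the client tries to replace a failed datanode in the write pipeline, and on a single-node setup there is no other datanode to swap in. A commonly suggested client-side workaround, an additional suggestion beyond the answer above and worth verifying against your Hadoop version, is to relax the replace-datanode-on-failure policy:

```xml
<!-- Client-side settings: with fewer than ~3 datanodes, the default
     replace-datanode-on-failure behavior can make append/recovery fail,
     because there is no spare datanode to add to the pipeline. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
```

Setting the policy to NEVER is reasonable only on small or single-node clusters; on larger clusters the default exists to protect durability.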
Regarding this java question on an exception when appending to an existing file in HDFS, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/33434785/