hadoop - 扫描数据时Accumulo平板电脑服务器出现错误

我在Accumulo中有一堆表，其中一个主服务器和2个平板电脑服务器包含一堆存储数百万条记录的表。问题是，每当我扫描表以获取一些记录时，平板电脑服务器日志就会不断抛出此错误

2015-11-12 04:38:56,107 [hdfs.DFSClient] WARN : Failed to connect to /192.168.250.12:50010 for block, add to deadNodes and continue. java.io.IOException: Got error, status message opReadBlock BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 received exception java.io.IOException:  Offset 16320 and length 20 don't match block BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 ( blockLen 0 ), for OP_READ_BLOCK, self=/192.168.250.202:55915, remote=/192.168.250.12:50010, for file /accumulo/tables/1/default_tablet/F0000gne.rf, for pool BP-1881591466-192.168.1.111-1438767154643 block 1073773956_33167
java.io.IOException: Got error, status message opReadBlock BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 received exception java.io.IOException:  Offset 16320 and length 20 don't match block BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 ( blockLen 0 ), for OP_READ_BLOCK, self=/192.168.250.202:55915, remote=/192.168.250.12:50010, for file /accumulo/tables/1/default_tablet/F0000gne.rf, for pool BP-1881591466-192.168.1.111-1438767154643 block 1073773956_33167
 at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:456)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:424)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:818)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:697)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:697)
        at java.io.DataInputStream.readShort(DataInputStream.java:312)
        at org.apache.accumulo.core.file.rfile.bcfile.Utils$Version.<init>(Utils.java:264)
        at org.apache.accumulo.core.file.rfile.bcfile.BCFile$Reader.<init>(BCFile.java:823)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.init(CachableBlockFile.java:246)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:257)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$100(CachableBlockFile.java:137)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:209)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:313)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:368)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:137)
        at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:843)
        at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
        at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(DispatchingFileFactory.java:69)
        at org.apache.accumulo.tserver.tablet.Compactor.openMapDataFiles(Compactor.java:279)
        at org.apache.accumulo.tserver.tablet.Compactor.compactLocalityGroup(Compactor.java:322)
        at org.apache.accumulo.tserver.tablet.Compactor.call(Compactor.java:214)
        at org.apache.accumulo.tserver.tablet.Tablet._majorCompact(Tablet.java:1976)
        at org.apache.accumulo.tserver.tablet.Tablet.majorCompact(Tablet.java:2093)
        at org.apache.accumulo.tserver.tablet.CompactionRunner.run(CompactionRunner.java:44)
        at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
        at java.lang.Thread.run(Thread.java:745)

我认为与Accumulo相比，这更多是与HDFS相关的问题，因此我检查了datanode的日志并发现了相同的消息，

Offset 16320 and length 20 don't match block BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 ( blockLen 0 ), for OP_READ_BLOCK, self=/192.168.250.202:55915, remote=/192.168.250.12:50010, for file /accumulo/tables/1/default_tablet/F0000gne.rf, for pool BP-1881591466-192.168.1.111-1438767154643 block 1073773956_33167

但是作为INFO在日志中。我不明白的是，为什么我会收到此错误。

我可以看到我尝试访问的文件(BP-1881591466-192.168.1.111-1438767154643)的池名包含的IP地址(192.168.1.111)与任何服务器(自身)的IP地址都不匹配和远程)。实际上，192.168.1.111是Hadoop Master服务器的旧IP地址，但我已对其进行了更改。我使用域名而不是IP地址，因此进行更改的唯一位置是群集中计算机的主机文件。 Hadoop / Accumulo配置都不使用IP地址。有人知道这里的问题吗？我已经花了几天时间，但仍然无法弄清楚。

最佳答案

您收到的错误表明Accumulo无法从HDFS读取其文件之一的一部分。 NameNode报告一个块位于特定DataNode上(在您的情况下为192.168.250.12)。但是，当Accumulo尝试从该DataNode读取时，它将失败。

这可能表明HDFS中的损坏块或临时网络问题。您可以尝试运行hadoop fsck /(具体命令可能会有所不同，具体取决于版本)以执行HDFS的运行状况检查。

另外，DataNode中的IP地址不匹配似乎表明该DataNode对于作为其一部分的HDFS池感到困惑。仔细检查其配置，DNS和/etc/hosts是否存在异常之后，应重新启动该DataNode。

关于hadoop - 扫描数据时Accumulo平板电脑服务器出现错误，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/33682810/

hadoop - 扫描数据时Accumulo平板电脑服务器出现错误

上一篇：postgresql - Liquibase 'includeAll'标签在databasechangelog表中为同一变更集生成2行

下一篇：docker - Travis CI - docker 镜像未推送到 dockerhub