hadoop - HBase HFile corruption on AWS S3

Tags: hadoop amazon-s3 mapreduce hbase elastic-map-reduce

I am running HBase on an EMR cluster (emr-5.7.0) backed by S3.
We are using the ImportTsv and CompleteBulkLoad utilities to import data into HBase.
During this process we have observed intermittent failures reporting HFile corruption for some of the imported files. This happens sporadically, with no discernible pattern from which the cause could be inferred.

After a lot of research and trying many of the suggestions found in blog posts, I attempted the fixes below, but to no avail; we are still facing the discrepancy.

Tech Stack:

  • AWS EMR Cluster (emr-5.7.0 | r3.8xlarge | 15 nodes)

  • AWS S3

  • HBase 1.3.1



Data Volume:

  • ~960,000 lines (to be upserted) | ~7 GB TSV file


Commands used in sequence:

 1) hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="|"  -Dimporttsv.columns="<Column Names (472 Columns)>" -Dimporttsv.bulk.output="<HFiles Path on HDFS>" <Table Name> <TSV file path on HDFS> 
 2) hadoop fs -chmod 777 <HFiles Path on HDFS>
 3) hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <HFiles Path on HDFS> <Table Name>


Fixes Tried:

  1. Increasing S3 Max Connections:

    • We increased the property below, but it did not seem to resolve the issue. fs.s3.maxConnections: values tried -- 10000, 20000, 50000, 100000. (See the configuration sketch after this list.)
  2. HBase Repair:

    • Another approach was to execute the HBase repair command, but it did not seem to help either.
      Command: hbase hbck -repair
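
For reference, a minimal sketch of how the connection-pool setting from fix 1 can be expressed. On EMR, fs.s3.maxConnections belongs to the EMRFS configuration (the emrfs-site classification, materialized as emrfs-site.xml on the cluster nodes); the value 50000 below is just one of the values we tried:

  <!-- emrfs-site.xml (sketch): raise the EMRFS S3 connection pool limit -->
  <property>
    <name>fs.s3.maxConnections</name>
    <value>50000</value> <!-- values tried ranged from 10000 to 100000 -->
  </property>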


The error trace is as follows:

[LoadIncrementalHFiles-17] mapreduce.LoadIncrementalHFiles: Received a CorruptHFileException from region server: row '00218333246' on table 'WB_MASTER' at region=WB_MASTER,00218333246,1506304894610.f108f470c00356217d63396aa11cf0bc., hostname=ip-10-244-8-74.ec2.internal,16020,1507907710216, seqNum=198
org.apache.hadoop.hbase.io.hfile.CorruptHFileException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file s3://wbpoc-landingzone/emrfs_test/wb_hbase_compressed/data/default/WB_MASTER/f108f470c00356217d63396aa11cf0bc/cf/2a9ecdc5c3aa4ad8aca535f56c35a32d_SeqId_200_
    at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:497)
    at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:525)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1170)
    at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:259)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:427)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:528)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:518)
    at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:667)
    at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:659)
    at org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:799)
    at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:5574)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:2034)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34952)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
Caused by: java.io.FileNotFoundException: File not present on S3
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem$NativeS3FsInputStream.read(S3NativeFileSystem.java:203)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at java.io.DataInputStream.readFully(DataInputStream.java:195)
    at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:391)
    at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:482)



Any suggestions on finding the root cause of this discrepancy would be very helpful.

Appreciate your help! Thanks!

Best Answer

After a lot of research and trial and error, I was finally able to find a resolution for this issue, thanks to help from AWS support. It appears the problem lies with S3's eventual consistency. The AWS team suggested the property below, and it has worked like a charm; so far we have not run into the HFile corruption issue again. Hope this helps if anyone faces the same problem!

Property (hbase-site.xml):
hbase.bulkload.retries.retryOnIOException: true
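
In XML form (standard Hadoop-style configuration; only the property name and value come from the fix, the snippet itself is just a sketch), the entry added to hbase-site.xml looks like this:

  <!-- hbase-site.xml: retry bulk-load operations on IOException instead of
       failing outright, which tolerates transient S3 eventual-consistency reads -->
  <property>
    <name>hbase.bulkload.retries.retryOnIOException</name>
    <value>true</value>
  </property>

A restart of HBase (at minimum the region servers) is typically required for hbase-site.xml changes to take effect.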

The original question, hadoop - HBase HFile corruption on AWS S3, can be found on Stack Overflow: https://stackoverflow.com/questions/47998979/
