I am trying to fetch records from an HBase table through a Java Spark program exposed via a Jersey REST API, and I get the error below. However, when I access the HBase table by submitting the Spark jar directly, the code runs without errors.
I have 2 HBase worker nodes and 2 Spark worker nodes, maintained by the same master.
WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 172.31.16.140): java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Best Answer
OK, I think I know your problem, because I just ran into it myself.
The cause is most likely some missing HBase jars: when the job runs, Spark executors need the HBase jars on their classpath to read the data, and if they are not present, exceptions like this are thrown. So what to do? It is easy.
Before submitting the job, add the --jars parameter with the following jars:
--jars
/ROOT/server/hive/lib/hive-hbase-handler-1.2.1.jar,
/ROOT/server/hbase/lib/hbase-client-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/hbase-common-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/hbase-server-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/hbase-hadoop2-compat-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/guava-12.0.1.jar,
/ROOT/server/hbase/lib/hbase-protocol-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/htrace-core-2.04.jar
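Putting it together, a full spark-submit invocation might look like the sketch below. The master URL, main class (com.example.HBaseReader), and application jar name (your-app.jar) are hypothetical placeholders; the --jars value is the comma-separated list above (it must contain no whitespace, hence the backslash line continuations):

```shell
# Sketch of a spark-submit call with the HBase jars attached to the
# executor classpath. Master URL, class, and app jar are placeholders.
spark-submit \
  --master spark://your-master:7077 \
  --class com.example.HBaseReader \
  --jars /ROOT/server/hive/lib/hive-hbase-handler-1.2.1.jar,\
/ROOT/server/hbase/lib/hbase-client-0.98.12-hadoop2.jar,\
/ROOT/server/hbase/lib/hbase-common-0.98.12-hadoop2.jar,\
/ROOT/server/hbase/lib/hbase-server-0.98.12-hadoop2.jar,\
/ROOT/server/hbase/lib/hbase-hadoop2-compat-0.98.12-hadoop2.jar,\
/ROOT/server/hbase/lib/guava-12.0.1.jar,\
/ROOT/server/hbase/lib/hbase-protocol-0.98.12-hadoop2.jar,\
/ROOT/server/hbase/lib/htrace-core-2.04.jar \
  your-app.jar
```

With --jars, Spark ships the listed jars to every executor, so the HBase classes are available when tasks deserialize and run, which is what the unread block data error was complaining about.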
If that works for you, enjoy!
Regarding apache-spark - Spark-HBASE error java.lang.IllegalStateException: unread block data, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/34901331/