hadoop - 在Hive中运行删除或更新时遇到间歇性问题

标签 hadoop hive hiveql tez

我的群集有一个非常奇怪的行为,我在Hive cli中运行Delete / Update语句,该命令有时工作正常,但有时会失败,以下是Command。 表已启用ACID

hive> delete from temptable where name='Jose';

异常如下:

Status: Running (Executing on YARN cluster with App id application_15217_3223)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1                 FAILED     -1          0        0       -1       0       0
Reducer 2             KILLED      2          0        0        2       0       0
--------------------------------------------------------------------------------
VERTICES: 00/02  [>>--------------------------] 0%    ELAPSED TIME: 0.23 s
--------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_15217_3223_1_00, diagnostics=[Vertex vertex_15217_3223_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: offset initializer failed, vertex=vertex_1514368279217_3223_1_00 [Map 1], java.lang.RuntimeException: serious problem
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1273)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1300)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:307)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:409)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 1
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1268)
        ... 15 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSargColumnNames(OrcInputFormat.java:358)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.setSearchArgument(OrcInputFormat.java:392)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.callInternal(OrcInputFormat.java:1011)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.access$2000(OrcInputFormat.java:838)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:992)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:989)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:989)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:838)
        ... 4 more
]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_15217_3223_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_15217_3223_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_15217_3223_1_00, diagnostics=[Vertex vertex_15217_3223_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: offset initializer failed, vertex=vertex_15217_3223_1_00 [Map 1], java.lang.RuntimeException: serious problem
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1273)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1300)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:307)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:409)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 1
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1268)
        ... 15 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSargColumnNames(OrcInputFormat.java:358)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.setSearchArgument(OrcInputFormat.java:392)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.callInternal(OrcInputFormat.java:1011)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.access$2000(OrcInputFormat.java:838)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:992)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:989)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:989)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:838)
        ... 4 more
]Vertex killed, vertexName=Reducer 2, vertexId=vertex_15217_3223_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_15217_3223_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1


找不到任何解决方案。

是什么原因,我需要启用某些功能还是需要验证任何配置或cli中缺少某些功能?

最佳答案

您可以尝试增加用于运行查询的内存。在完成查询之前,可能是内存用完了。也尝试使用Tez,我发现它的运行速度通常更快...两者之间的解释可以在这里找到
https://community.hortonworks.com/questions/83394/difference-between-mr-and-tez.html

在运行查询之前,请尝试设置这些值(根据需要增加内存)

SET hive.tez.container.size=4096;
SET hive.tez.java.opts='-Xmx2000m';
SET hive.execution.engine=tez;

或者这两个(如果您打算使用mapreduce)
yarn.scheduler.minimum-allocation-mb 
yarn.scheduler.maximum-allocation-mb

关于hadoop - 在Hive中运行删除或更新时遇到间歇性问题,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/48162425/

相关文章:

sql - 从另一个表创建临时表,包括配置单元中的分区列

time - HIVE:如何计算时间的秒差格式:yyyyMMdd HH:mm:ss

java - 启动 Sqoop2 server (1.99.7) 时无法找到 hadoop 配置类

Hadoop、Hive、Pig、HBase、Cassandra——什么时候用什么?

hadoop - Apache Flume 1.5 未在 Hadoop 2/自动故障转移集群配置中给出预期结果

hadoop - hive SQL为字符串中的每个字母搜索模式

sql - Hive - 计算数据库中每天的记录数

ubuntu - hadoop Web UI 位于 http ://localhost:50070/doesnt work

amazon-web-services - 将数据从 Hive 写入 Amazon S3,同时保持表分区目录结构

hadoop - 导入hive包报错