apache-spark - Hive on Spark timeout

Tags: apache-spark hadoop hive cloudera

On a Cloudera distribution, I have configured Hive on Spark following the online documentation.

When I try to run a simple query as a test:

beeline -u "jdbc:hive2://<HOST_NAME>.<DOMAIN>:10000/default" -n mehditazi -p <PASSWORD>  -e "SET hive.execution.engine=spark;SET spark.dynamicAllocation.enabled=true;SET spark.executor.memory=4g;SET spark.executor.cores=4;SET hive.spark.client.connect.timeout=5000;select count(*) from default.sample_07";
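For reference, the same session-level settings can also be applied interactively inside beeline rather than through the -e one-liner; a minimal equivalent sketch, using only the values from the command above:

beeline -u "jdbc:hive2://<HOST_NAME>.<DOMAIN>:10000/default" -n mehditazi -p <PASSWORD>
-- inside the beeline session: switch the execution engine and apply the same per-session tuning
SET hive.execution.engine=spark;
SET spark.dynamicAllocation.enabled=true;
SET spark.executor.memory=4g;
SET spark.executor.cores=4;
SET hive.spark.client.connect.timeout=5000;
select count(*) from default.sample_07;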

I get the following error:

<== Returned immediately ==>

Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

<== Full log ==>
2018-05-31 18:29:51,625 WARN  [main] mapreduce.TableMapReduceUtil: The hbase-prefix-tree module jar containing PrefixTreeCodec is not present.  Continuing without it.
scan complete in 3ms
Connecting to jdbc:hive2://<HOST_NAME>.<DOMAIN>:10000/default
Connected to: Apache Hive (version 1.1.0-cdh5.8.0)
Driver: Hive JDBC (version 1.1.0-cdh5.8.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
No rows affected (0.101 seconds)
No rows affected (0.005 seconds)
No rows affected (0.005 seconds)
No rows affected (0.005 seconds)
No rows affected (0.005 seconds)
INFO  : Compiling command(queryId=hive_20180531182929_1e4bd43e-df8a-4b87-b898-dc73eebfbda3): select count(*) from default.sample_07
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20180531182929_1e4bd43e-df8a-4b87-b898-dc73eebfbda3); Time taken: 0.463 seconds
INFO  : Executing command(queryId=hive_20180531182929_1e4bd43e-df8a-4b87-b898-dc73eebfbda3): select count(*) from default.sample_07
INFO  : Query ID = hive_20180531182929_1e4bd43e-df8a-4b87-b898-dc73eebfbda3
INFO  : Total jobs = 1
INFO  : Launching Job 1 out of 1
INFO  : Starting task [Stage-1:MAPRED] in serial mode
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:64)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
        at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:125)
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:178)
        at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
        at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
        at com.google.common.base.Throwables.propagate(Throwables.java:156)
        at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:120)
        at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:99)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:95)
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:65)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
        ... 22 more
Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
        at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
        at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:104)
        ... 27 more
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
        at org.apache.hive.spark.client.rpc.RpcServer$2.run(RpcServer.java:141)
        at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        ... 1 more

ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:64)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
        at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:125)
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:178)
        at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
        at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
        at com.google.common.base.Throwables.propagate(Throwables.java:156)
        at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:120)
        at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:99)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:95)
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:65)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
        ... 22 more

Best answer

After investigation, this turned out to be a bug, which will be fixed in HIVE-10594 [1].
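For anyone hitting this before the fix lands, a commonly suggested mitigation is to give the remote Spark driver more time to connect back to HiveServer2 by raising the Hive-on-Spark client timeouts. A minimal sketch, assuming the same connection details as in the question; the timeout values below are illustrative and not taken from the original post:

# raise both the client-side and server-side RPC connect timeouts (ms), then re-run the test query
beeline -u "jdbc:hive2://<HOST_NAME>.<DOMAIN>:10000/default" -n mehditazi -p <PASSWORD> \
  -e "SET hive.execution.engine=spark;SET hive.spark.client.connect.timeout=30000;SET hive.spark.client.server.connect.timeout=300000;select count(*) from default.sample_07";

Whether this helps depends on the root cause; in the original case the timeout was only a symptom of the bug above.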

Regarding apache-spark - Hive on Spark timeout, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50629056/
