java - 提交flink作业时如何处理akka AskTimeoutException

标签 java akka apache-flink flink-streaming

Flink 1.5.3,当我将 flink 作业提交到 flink 集群(在 yarn 上)时,它总是抛出 AskTimeoutException。在flink的配置文件中,我配置了参数 "akka.ask.timeout=1000s",但是Exception还是下面这样。

这意味着我增加了超时参数,“akka.ask.timeout=1000s”,但它不起作用。

org.apache.flink.runtime.rest.handler.RestHandlerException: Job submission failed.
    at org.apache.flink.runtime.rest.handler.job.JobSubmitHandler.lambda$handleRequest$2(JobSubmitHandler.java:116)
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:770)
    at akka.dispatch.OnComplete.internal(Future.scala:258)
    at akka.dispatch.OnComplete.internal(Future.scala:256)
    at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
    at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
    at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:83)
    at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
    at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
    at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:603)
    at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
    at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
    at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
    at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
    at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
    at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
    at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
    at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/dispatcher#-1851759541]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
    at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
    at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
    at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
    at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
    ... 21 more

Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/dispatcher#-1851759541]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
    at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
    ... 9 more

那么有什么解决方案可以避免这个问题吗?

最佳答案

REST 处理程序和 Flink 集群之间的通信超时由 web.timeout 控制。超时以毫秒为单位指定,因此,如果您想等待 1000 秒,则需要在 flink-conf.yaml 中将其设置为 web.timeout: 1000000

此外,最好检查集群入口点日志,了解为什么作业提交需要这么长时间。通常不应超过 10 秒。

关于java - 提交flink作业时如何处理akka AskTimeoutException,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/53893172/

相关文章:

java - java中评估字符串上的xpath并返回结果字符串的简单方法是什么

java - 如何仅对某个 url 路径应用 HttpAuthenticationMechanism?

digital-ocean - 在DCOS中安装Flink出错

scala - flink中wordcount示例中与JobManager的通信失败错误

hadoop - Apache Flink 与 Hadoop 上的 Mapreduce 相比如何?

java - 为什么在 Eclipse IDE 中不产生编译时异常除以零?

Java Process.waitFor() 与 Process.exitValue()

scala - 如何打印akka系统中的所有 Actor ?

scala - 如何在不阻塞的情况下使用 Akka 询问模式

akka - 如何使用 Akka BoundedMailBox 来限制生产者