java - Heap space error: SparkListenerBus

Tags: java apache-spark pyspark apache-spark-2.0

I am trying to debug a PySpark program and, frankly, I am stumped.

I see the following error in the logs. I have verified the input parameters - everything appears to be in order.

The driver and executors look fine - roughly 3 MB of the available 7 GB is in use on each node. I did notice that the DAG plan that gets created is huge. Could that be the cause? (A small sketch for inspecting the plan follows below.)
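As an aside, one quick way to gauge how large the generated plan is from PySpark is to print it. This is only a minimal sketch; `df` stands in for whatever DataFrame the real job materializes:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.range(1000)   # illustrative stand-in for the job's real DataFrame

    # Prints the parsed, analyzed, optimized, and physical plans;
    # a very long output here hints at a deeply nested lineage.
    df.explain(extended=True)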

17/02/18 00:59:02 ERROR Utils: throw uncaught fatal error in thread SparkListenerBus
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.String.<init>(String.java:207)
    at java.lang.StringBuilder.toString(StringBuilder.java:407)
    at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:356)
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.getText(ReaderBasedJsonParser.java:235)
    at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:20)
    at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:42)
    at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:35)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3736)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2726)
    at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:20)
    at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:50)
    at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:103)
    at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:134)
    at org.apache.spark.scheduler.EventLoggingListener.onOtherEvent(EventLoggingListener.scala:202)
    at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:67)
    at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
    at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:63)
    at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:36)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:94)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)

Exception in thread "SparkListenerBus" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.String.<init>(String.java:207)
    at java.lang.StringBuilder.toString(StringBuilder.java:407)
    at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:356)
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.getText(ReaderBasedJsonParser.java:235)
    at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:20)
    at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:42)
    at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:35)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3736)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2726)
    at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:20)
    at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:50)
    at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:103)
    at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:134)
    at org.apache.spark.scheduler.EventLoggingListener.onOtherEvent(EventLoggingListener.scala:202)
    at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:67)
    at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
    at org.apache.spark.scheduler.LiveListenerBus.doPostEvent(LiveListenerBus.scala:36)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:63)
    at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:36)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:94)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245)
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77)

Best answer

The workaround for this error is to use the following setting:

spark.eventLog.enabled=false 

But this means you will not get any event logs.
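For reference, a minimal sketch of how that setting might be applied when the SparkSession is built inside the PySpark application (the application name is illustrative):

    from pyspark.sql import SparkSession

    # Disable event logging so the listener bus no longer serializes large
    # event JSON payloads on the driver; the trade-off is that the job leaves
    # no event logs for the history server.
    spark = (
        SparkSession.builder
        .appName("my-app")  # illustrative name
        .config("spark.eventLog.enabled", "false")
        .getOrCreate()
    )

The same setting can also be passed on the command line, e.g. spark-submit --conf spark.eventLog.enabled=false my_app.py.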

Regarding "java - Heap space error: SparkListenerBus", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48837010/
