java - Completed jobs not showing in the Spark Web UI

I am using ./bin/spark-submit to run my Spark job. It runs fine, but when I open the Spark Web UI, the job does not appear in the completed list.

./bin/spark-submit --name "myapp" --master local --conf "spark.master=spark://fahad:7077" --class com.apptest.App ~/app-0.0.1-SNAPSHOT.jar

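Note: Spark version 2.0.1, 1 worker running, master UI at localhost:8080. Both the worker and the master were started with the ./sbin/start-*.sh scripts.

Accepted answer

There are two different UIs: the regular Spark UI and the Spark History Server.

The History Server is the one that shows jobs after they have completed.

http://spark.apache.org/docs/latest/monitoring.html

The documentation explains that you need to start it by running: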

./sbin/start-history-server.sh

This creates a web interface at http://server-url:18080 by default, listing incomplete and completed applications and attempts.

When using the file-system provider class (see spark.history.provider below), the base logging directory must be supplied in the spark.history.fs.logDirectory configuration option, and should contain sub-directories that each represents an application’s event logs.

The spark jobs themselves must be configured to log events, and to log them to the same shared, writeable directory. For example, if the server was configured with a log directory of hdfs://namenode/shared/spark-logs, then the client-side options would be:

spark.eventLog.enabled true
spark.eventLog.dir hdfs://namenode/shared/spark-logs
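For a single-machine setup like the one in the question, the same two options can also be passed on the command line instead of through a configuration file. The following is a minimal sketch, not a definitive setup. It assumes a local event-log directory (/tmp/spark-events, which is Spark's documented default location) rather than HDFS, and it passes the standalone master URL from the question directly to --master, which is an assumption about the intended deployment. The application name, class, and jar path are taken from the question.

```shell
# Assumption: a local event-log directory; /tmp/spark-events is Spark's documented default.
EVENT_LOG_DIR=/tmp/spark-events
mkdir -p "$EVENT_LOG_DIR"   # the directory must exist before the application starts

# Submit with event logging enabled (name, class, and jar are from the question).
./bin/spark-submit \
  --name "myapp" \
  --master spark://fahad:7077 \
  --conf spark.eventLog.enabled=true \
  --conf "spark.eventLog.dir=file://$EVENT_LOG_DIR" \
  --class com.apptest.App \
  ~/app-0.0.1-SNAPSHOT.jar

# Point the history server at the same directory, then start it.
export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=file://$EVENT_LOG_DIR"
./sbin/start-history-server.sh
```

Once the application has finished, it should be listed at http://localhost:18080, provided the application and the history server both use the same directory.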

Original question on Stack Overflow: https://stackoverflow.com/questions/39936593/
