java - Network issue with Apache Spark deployment

Tags: java apache-spark ubuntu-server

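Until now, I have been running Spark "embedded" in my application. Now I would like to run it on a dedicated server.

This is how far I have gotten:

  • Fresh Ubuntu 16 install; the server is named micha, IP 10.0.100.120; Scala 2.10 installed; Spark 1.6.2 installed and recompiled
  • The Pi test succeeds
  • The UI on port 8080 works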

The log says:

Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /opt/apache-spark-1.6.2/conf/:/opt/apache-spark-1.6.2/assembly/target/scala-2.10/spark-assembly-1.6.2-hadoop2.2.0.jar:/opt/apache-spark-1.6.2/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/opt/apache-spark-1.6.2/lib_managed/jars/datanucleus-core-3.2.10.jar:/opt/apache-spark-1.6.2/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar -Xms1g -Xmx1g org.apache.spark.deploy.master.Master --ip micha --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/07/10 13:03:55 INFO Master: Registered signal handlers for [TERM, HUP, INT]
16/07/10 13:03:55 WARN Utils: Your hostname, micha resolves to a loopback address: 127.0.1.1; using 10.0.100.120 instead (on interface eno1)
16/07/10 13:03:55 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/07/10 13:03:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/07/10 13:03:55 INFO SecurityManager: Changing view acls to: root
16/07/10 13:03:55 INFO SecurityManager: Changing modify acls to: root
16/07/10 13:03:55 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/07/10 13:03:56 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
16/07/10 13:03:56 INFO Master: Starting Spark master at spark://micha:7077
16/07/10 13:03:56 INFO Master: Running Spark version 1.6.2
16/07/10 13:03:56 INFO Server: jetty-8.y.z-SNAPSHOT
16/07/10 13:03:56 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:8080
16/07/10 13:03:56 INFO Utils: Successfully started service 'MasterUI' on port 8080.
16/07/10 13:03:56 INFO MasterWebUI: Started MasterWebUI at http://10.0.100.120:8080
16/07/10 13:03:56 INFO Server: jetty-8.y.z-SNAPSHOT
16/07/10 13:03:56 INFO AbstractConnector: Started SelectChannelConnector@micha:6066
16/07/10 13:03:56 INFO Utils: Successfully started service on port 6066.
16/07/10 13:03:56 INFO StandaloneRestServer: Started REST server for submitting applications on port 6066
16/07/10 13:03:56 INFO Master: I have been elected leader! New state: ALIVE

In my application, I changed the configuration to:

SparkConf conf = new SparkConf().setAppName("myapp").setMaster("spark://10.0.100.120:6066");

(I also tried port 7077.)

On the client side:

16-07-10 13:22:58:300 INFO org.spark-project.jetty.server.AbstractConnector - Started SelectChannelConnector@0.0.0.0:4040
16-07-10 13:22:58:300 DEBUG org.spark-project.jetty.util.component.AbstractLifeCycle - STARTED SelectChannelConnector@0.0.0.0:4040
16-07-10 13:22:58:300 DEBUG org.spark-project.jetty.util.component.AbstractLifeCycle - STARTED org.spark-project.jetty.server.Server@3eb292cd
16-07-10 13:22:58:301 INFO org.apache.spark.util.Utils - Successfully started service 'SparkUI' on port 4040.
16-07-10 13:22:58:306 INFO org.apache.spark.ui.SparkUI - Started SparkUI at http://10.0.100.100:4040
16-07-10 13:22:58:621 INFO org.apache.spark.deploy.client.AppClient$ClientEndpoint - Connecting to master spark://10.0.100.120:6066...
16-07-10 13:22:58:648 DEBUG org.apache.spark.network.client.TransportClientFactory - Creating new connection to /10.0.100.120:6066
16-07-10 13:22:58:689 DEBUG io.netty.util.ResourceLeakDetector - -Dio.netty.leakDetectionLevel: simple
16-07-10 13:22:58:714 WARN org.apache.spark.deploy.client.AppClient$ClientEndpoint - Failed to connect to master 10.0.100.120:6066
java.io.IOException: Failed to connect to /10.0.100.120:6066
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)

If I try telnet:

$ telnet 10.0.100.120 6066
Trying 10.0.100.120...
telnet: connect to address 10.0.100.120: Connection refused
telnet: Unable to connect to remote host

$ telnet 10.0.100.120 7077
Trying 10.0.100.120...
telnet: connect to address 10.0.100.120: Connection refused
telnet: Unable to connect to remote host
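The same reachability check can be run from the JVM on the client machine, which rules out differences between telnet and the application's own network stack. The following is a minimal sketch, not part of the original question: the class name PortCheck and the method isReachable are hypothetical, and the 2000 ms timeout is an arbitrary choice.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "10.0.100.120"; // master address taken from the question
        // 7077 = master, 6066 = REST server, 8080 = web UI (ports from the master log)
        for (int port : new int[] {7077, 6066, 8080}) {
            System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
        }
    }
}
```

If 8080 is reachable while 7077 and 6066 are not, the machine itself is reachable and the problem is likely the address those two services are bound to, rather than a firewall blocking the whole host.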

On the server, I checked with netstat:

jgp@micha:/opt/apache-spark$ netstat -a | grep 6066
tcp6       0      0 micha.nc.rr.com:6066    [::]:*                  LISTEN     
jgp@micha:/opt/apache-spark$ netstat -a | grep 7077
tcp6       0      0 micha.nc.rr.com:7077    [::]:*                  LISTEN 

If I am reading this correctly, it looks like it is listening on IPv6 rather than IPv4...

Update #1:

I set:

_JAVA_OPTIONS=-Djava.net.preferIPv4Stack=true
SPARK_LOCAL_IP=10.0.100.120

I still get the warnings in the log:

16/07/10 14:10:13 WARN Utils: Your hostname, micha resolves to a loopback address: 127.0.1.1; using 10.0.100.120 instead (on interface eno1)
16/07/10 14:10:13 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address

The connection is still refused...
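The warning says that the hostname resolves to a loopback address. A quick way to see what the JVM on the server makes of a given name is to ask java.net.InetAddress directly. This is only a sketch under assumptions: the class name HostnameCheck and the method resolvesToLoopback are hypothetical, and treating an unresolvable name as "not loopback" is a simplification.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class HostnameCheck {
    /**
     * Returns true if the name (or IP literal) resolves to a loopback address,
     * i.e. 127.0.0.0/8 or ::1. Unresolvable names are reported as false.
     */
    static boolean resolvesToLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // Shows what the local hostname resolves to on this machine.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + " -> " + local.getHostAddress()
                + " (loopback: " + local.isLoopbackAddress() + ")");
        // Run on the server, "micha" is the hostname from the question.
        System.out.println("micha resolves to loopback: " + resolvesToLoopback("micha"));
    }
}
```

A service bound to a loopback address accepts connections only from the same machine, which would be consistent with the "Connection refused" seen from the client.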

Update #2:

The system's /etc/hosts contains this odd entry:

127.0.0.1      localhost
127.0.1.1      micha.nc.rr.com micha

I commented it out, and now I get the following in Spark's log file:

Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /opt/apache-spark-1.6.2/conf/:/opt/apache-spark-1.6.2/assembly/target/scala-2.10/spark-assembly-1.6.2-hadoop2.2.0.jar:/opt/apache-spark-1.6.2/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/opt/apache-spark-1.6.2/lib_managed/jars/datanucleus-core-3.2.10.jar:/opt/apache-spark-1.6.2/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar -Xms1g -Xmx1g org.apache.spark.deploy.master.Master --ip micha --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/07/10 22:11:54 INFO Master: Registered signal handlers for [TERM, HUP, INT]
16/07/10 22:11:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/07/10 22:11:54 INFO SecurityManager: Changing view acls to: root
16/07/10 22:11:54 INFO SecurityManager: Changing modify acls to: root
16/07/10 22:11:54 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7077. Attempting port 7078.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7078. Attempting port 7079.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7079. Attempting port 7080.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7080. Attempting port 7081.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7081. Attempting port 7082.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7082. Attempting port 7083.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7083. Attempting port 7084.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7084. Attempting port 7085.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7085. Attempting port 7086.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7086. Attempting port 7087.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7087. Attempting port 7088.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7088. Attempting port 7089.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7089. Attempting port 7090.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7090. Attempting port 7091.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7091. Attempting port 7092.
16/07/10 22:11:55 WARN Utils: Service 'sparkMaster' could not bind on port 7092. Attempting port 7093.
Exception in thread "main" java.net.BindException: Cannot assign requested address: Service 'sparkMaster' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'sparkMaster' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)

Best answer

You have to configure the spark-env.sh file on your Spark server. Add SPARK_MASTER_IP to spark-env.sh:

export SPARK_MASTER_IP=10.0.100.120

To connect to the master from your remote application, use port 7077. Port 6066 is for the REST API.

SparkConf conf = new SparkConf().setAppName("myapp").setMaster("spark://10.0.100.120:7077");
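Since mixing up the two ports is easy, the application could build the master URL through a small guard that rejects the REST port. This is a hedged sketch only: MasterUrl and its of method are hypothetical helpers, not part of Spark, and the port numbers are the ones that appear in the master log above (a deployment may configure them differently).

```java
class MasterUrl {
    static final int DEFAULT_MASTER_PORT = 7077; // 'sparkMaster' service port in the log
    static final int REST_PORT = 6066;           // REST submission server port in the log

    /** Builds a spark:// URL and fails fast if the REST port is used by mistake. */
    static String of(String host, int port) {
        if (port == REST_PORT) {
            throw new IllegalArgumentException(
                    "Port 6066 is the REST submission port; use the master port (7077 here) in spark:// URLs");
        }
        return "spark://" + host + ":" + port;
    }

    public static void main(String[] args) {
        // prints "spark://10.0.100.120:7077"
        System.out.println(of("10.0.100.120", DEFAULT_MASTER_PORT));
        // With Spark on the classpath, the result would be passed on like this:
        // new SparkConf().setAppName("myapp").setMaster(MasterUrl.of("10.0.100.120", 7077));
    }
}
```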

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38294641/
