我通过 Homebrew 安装了 Hadoop 和 Spark
$ brew list --versions | grep spark
apache-spark 2.2.0
$ brew list --versions | grep hadoop
hadoop 2.8.1 2.8.2 hdfs
我使用的是 Hadoop 2.8.2。
我关注了this post配置 Hadoop。另外,关注this post将 spark.yarn.archive
配置为:
spark.yarn.archive hdfs://localhost:9000/user/panc25/spark-jars.zip
以下是我在 .bash_profile
中的 Hadoop/Spark 相关环境设置:
# ---------------------
# Hadoop
# ---------------------
export HADOOP_HOME=/usr/local/Cellar/hadoop/2.8.2
export YARN_CONF_DIR=$HADOOP_HOME/libexec/etc/hadoop/
alias hadoop-start="$HADOOP_HOME/sbin/start-dfs.sh;$HADOOP_HOME/sbin/start-yarn.sh"
alias hadoop-stop="$HADOOP_HOME/sbin/stop-yarn.sh;$HADOOP_HOME/sbin/stop-dfs.sh"
# ---------------------
# Apache Spark
# ---------------------
export SPARK_HOME=/usr/local/Cellar/apache-spark/2.2.0/libexec
export PATH=$SPARK_HOME/../bin:$SPARK_HOME/sbin:$PATH
我可以成功启动hadoop(hdfa + yarn):
$ hadoop-start
17/11/12 17:08:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-namenode-mbp13mid2017.local.out
localhost: starting datanode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-datanode-mbp13mid2017.local.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/hadoop-panc25-secondarynamenode-mbp13mid2017.local.out
17/11/12 17:08:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/yarn-panc25-resourcemanager-mbp13mid2017.local.out
localhost: starting nodemanager, logging to /usr/local/Cellar/hadoop/2.8.2/libexec/logs/yarn-panc25-nodemanager-mbp13mid2017.local.out
$ jps
92723 NameNode
93188 Jps
93051 ResourceManager
93149 NodeManager
92814 DataNode
92926 SecondaryNameNode
但是,当我启动 spark-shell --master yarn
时,它似乎卡住了,我不知道发生了什么:
怎么了?
顺便说一句,我可以访问 SparkUI http://localhost:4040/
,但是所有页面都是空白的。
最佳答案
我遇到了类似的问题,这是由于我忘记将/conf 附加到 HADOOP_CONF_DIR 环境变量 (/etc/hadoop/conf)。
关于hadoop - spark-shell --master yarn 卡住,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/47254616/