我现在无法在我的 Windows 计算机上启动 spark-shell。我使用的 Spark 版本是为 Hadoop 2.4 或更高版本预构建的 1.5.2。我认为 spark-shell.cmd 可以在没有任何配置的情况下直接运行,因为它是预构建的,我无法弄清楚是什么问题阻止我正确启动 Spark。
除了打印出的错误消息外,我仍然可以在命令行上执行一些基本的 scala 命令,但显然这里出了点问题。
这是来自 cmd 的错误日志:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.li
b.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more in
fo.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.propertie
s
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.5.2
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_25)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/18 17:51:32 WARN MetricsSystem: Using default name DAGScheduler for source
because spark.app.id is not set.
Spark context available as sc.
15/11/18 17:51:39 WARN General: Plugin (Bundle) "org.datanucleus" is already reg
istered. Ensure you dont have multiple JAR versions of the same plugin in the cl
asspath. The URL "file:/C:/spark-1.5.2-bin-hadoop2.4/lib/datanucleus-core-3.2.10
.jar" is already registered, and you are trying to register an identical plugin
located at URL "file:/C:/spark-1.5.2-bin-hadoop2.4/bin/../lib/datanucleus-core-3
.2.10.jar."
15/11/18 17:51:39 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is
already registered. Ensure you dont have multiple JAR versions of the same plug
in in the classpath. The URL "file:/C:/spark-1.5.2-bin-hadoop2.4/lib/datanucleus
-rdbms-3.2.9.jar" is already registered, and you are trying to register an ident
ical plugin located at URL "file:/C:/spark-1.5.2-bin-hadoop2.4/bin/../lib/datanu
cleus-rdbms-3.2.9.jar."
15/11/18 17:51:39 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is alr
eady registered. Ensure you dont have multiple JAR versions of the same plugin i
n the classpath. The URL "file:/C:/spark-1.5.2-bin-hadoop2.4/bin/../lib/datanucl
eus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an
identical plugin located at URL "file:/C:/spark-1.5.2-bin-hadoop2.4/lib/datanucl
eus-api-jdo-3.2.6.jar."
15/11/18 17:51:39 WARN Connection: BoneCP specified but not present in CLASSPATH
(or one of dependencies)
15/11/18 17:51:40 WARN Connection: BoneCP specified but not present in CLASSPATH
(or one of dependencies)
15/11/18 17:51:46 WARN ObjectStore: Version information not found in metastore.
hive.metastore.schema.verification is not enabled so recording the schema versio
n 1.2.0
15/11/18 17:51:46 WARN ObjectStore: Failed to get database default, returning No
SuchObjectException
15/11/18 17:51:47 WARN : Your hostname, Lenovo-PC resolves to a loopback/non-reachab
le address: fe80:0:0:0:297a:e76d:828:59dc%wlan2, but we couldn't find any extern
al IP address!
java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.jav
a:522)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.s
cala:171)
at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveCo
ntext.scala:162)
at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala
:160)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:167)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstruct
orAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingC
onstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:10
28)
at $iwC$$iwC.<init>(<console>:9)
at $iwC.<init>(<console>:18)
at <init>(<console>:20)
at .<init>(<console>:24)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:
1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:
1340)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840
)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:8
57)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.sca
la:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply
(SparkILoopInit.scala:132)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply
(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoop
Init.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$Spark
ILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.s
cala:159)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkIL
oopInit.scala:108)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:
64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$Spark
ILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$Spark
ILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$Spark
ILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClass
Loader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$pr
ocess(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSub
mit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:18
0)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:
650)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1097)
at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.
loadPermissionInfo(RawLocalFileSystem.java:559)
at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.
getPermission(RawLocalFileSystem.java:534)
at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(Sess
ionState.java:599)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(Sess
ionState.java:554)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.jav
a:508)
... 56 more
<console>:10: error: not found: value sqlContext
import sqlContext.implicits._
^
<console>:10: error: not found: value sqlContext
import sqlContext.sql
^
最佳答案
有几个问题。您使用的是 Windows,与其他兼容 POSIX 的操作系统相比,此操作系统的情况有所不同。
从阅读开始Problems running Hadoop on Windows文件并查看“缺少 WINUTILS.EXE”是否是问题所在。确保您在控制台中以管理员权限运行 spark-shell
。
您可能还想阅读类似问题的答案 Why does starting spark-shell fail with NullPointerException on Windows?
此外,您可能已经在 bin
子目录中启动了 spark-shell
,因此出现如下错误:
15/11/18 17:51:39 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/spark-1.5.2-bin-hadoop2.4/bin/../lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/spark-1.5.2-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar."
最后一期:
15/11/18 17:51:47 WARN : Your hostname, Lenovo-PC resolves to a loopback/non-reachable address: fe80:0:0:0:297a:e76d:828:59dc%wlan2, but we couldn't find any external IP address!
一个 解决方法 是将 SPARK_LOCAL_HOSTNAME
设置为某个可解析的主机名并完成它。
SPARK_LOCAL_HOSTNAME
是自定义主机名,它会在创建驱动程序、主服务器、工作程序和执行程序时覆盖主机名的任何其他候选者。
在您的情况下,使用 spark-shell
,只需执行以下命令:
SPARK_LOCAL_HOSTNAME=localhost ./bin/spark-shell
您还可以使用:
./bin/spark-shell -c spark.driver.host=localhost
另请参阅 Environment Variables在 Spark 的官方文档中。
关于apache-spark - 为什么启动 spark-shell 失败并显示 "we couldn' t find any external IP address!"在 Windows 上?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/33792197/