scala - Why does sbt assembly in a Spark project fail with "Please add any Spark dependencies by supplying the sparkVersion and sparkComponents"?

Tags: scala apache-spark sbt sbt-assembly

I am working on an sbt-managed Spark project that has a spark-cloudant dependency. The code is available on GitHub (on the spark-cloudant-compile-issue branch).

I have added the following line to build.sbt:

"cloudant-labs" % "spark-cloudant" % "1.6.4-s_2.10" % "provided"

So build.sbt now looks as follows:

name := "Movie Rating"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies ++= {
  val sparkVersion =  "1.6.0"
  Seq(
     "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
     "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
     "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
     "org.apache.spark" %% "spark-streaming-kafka" % sparkVersion % "provided",
     "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided",
     "org.apache.kafka" % "kafka-log4j-appender" % "0.9.0.0",
     "org.apache.kafka" % "kafka-clients" % "0.9.0.0",
     "org.apache.kafka" %% "kafka" % "0.9.0.0",
     "cloudant-labs" % "spark-cloudant" % "1.6.4-s_2.10" % "provided"
    )
}

assemblyMergeStrategy in assembly := {
  case PathList("org", "apache", "spark", xs @ _*) => MergeStrategy.first
  case PathList("scala", xs @ _*) => MergeStrategy.discard
  case PathList("META-INF", "maven", "org.slf4j", xs @ _* ) => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

unmanagedBase <<= baseDirectory { base => base / "lib" }

assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)

When I execute sbt assembly, I get the following error:

java.lang.RuntimeException: Please add any Spark dependencies by 
   supplying the sparkVersion and sparkComponents. Please remove: 
   org.apache.spark:spark-core:1.6.0:provided

Best Answer

Possibly related: https://github.com/databricks/spark-csv/issues/150

Can you try adding spIgnoreProvided := true to your build.sbt?
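
For reference, a minimal sketch of where that setting would go, assuming the error is raised by the sbt-spark-package plugin (which is what defines spIgnoreProvided):

// build.sbt (sketch; assumes the sbt-spark-package plugin is enabled in project/plugins.sbt)
// Ask the plugin to tolerate the "provided"-scoped Spark artifacts that are
// already listed in libraryDependencies instead of rejecting them.
spIgnoreProvided := true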

(This may not be the answer; I would have left it as a comment, but I don't have enough reputation.)
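
Alternatively, the error message itself asks for the Spark dependencies to be declared through the plugin's own settings. A hedged sketch of that route, assuming the sbt-spark-package plugin and its sparkVersion / sparkComponents keys; the spark-* entries would then be removed from libraryDependencies, while spark-cloudant, the Kafka artifacts and the other dependencies stay where they are:

// build.sbt (sketch; the component names are assumptions based on the plugin's
// convention of adding "org.apache.spark" %% s"spark-$component" as "provided",
// with spark-core included by default)
sparkVersion := "1.6.0"
sparkComponents ++= Seq("sql", "streaming", "streaming-kafka", "mllib")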

Regarding scala - Why does sbt assembly in a Spark project fail with "Please add any Spark dependencies by supplying the sparkVersion and sparkComponents"?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40951106/
