I want to use Spark to process a Hive table, but when I run my program I get this error:
Exception in thread "main" java.lang.IllegalArgumentException: Unable to instantiate SparkSession with Hive support because Hive classes are not found.
My application code:
import org.apache.spark.sql.SparkSession

object spark_on_hive_table extends App {
  val spark = SparkSession
    .builder()
    .appName("Spark Hive Example")
    .config("spark.sql.warehouse.dir", "hdfs://localhost:54310/user/hive/warehouse")
    .enableHiveSupport()
    .getOrCreate()

  import spark.implicits._

  spark.sql("select * from pbSales").show()
}
build.sbt
version := "0.1"
scalaVersion := "2.11.12"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.2",
  "org.apache.spark" %% "spark-sql" % "2.3.2",
  "org.apache.spark" %% "spark-streaming" % "2.3.2",
  "org.apache.spark" %% "spark-hive" % "2.3.2" % "provided"
)
Best answer
You should remove the provided scope from the spark-hive dependency. A provided dependency is expected to be supplied by the runtime environment (e.g. a Spark cluster via spark-submit), so when you launch the application from the IDE or with sbt run, the Hive classes are not on the classpath. Change:
"org.apache.spark" %% "spark-hive" % "2.3.2" % "provided"
to
"org.apache.spark" %% "spark-hive" % "2.3.2"
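
Alternatively, if you still want provided scope so the Spark jars are excluded from the assembly you deploy with spark-submit, sbt can be told to put provided dependencies back on the classpath for sbt run. This is a sketch of a common sbt pattern (sbt 1.x slash syntax; adjust for older sbt versions):

```scala
// build.sbt — keep "provided" for packaging, but include
// provided-scoped jars when running locally with `sbt run`.
Compile / run := Defaults.runTask(
  Compile / fullClasspath,      // fullClasspath includes provided-scoped jars
  Compile / run / mainClass,
  Compile / run / runner
).evaluated
```

With this in place, sbt run sees spark-hive locally, while the packaged artifact still omits it for cluster deployment.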
Regarding the apache-spark "Unable to instantiate SparkSession with Hive support" error when trying to process Hive tables with Spark, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62456086/