We recently upgraded our ETL project from Spark 2.4.2 to 2.4.5.
After deploying the change and running the job, I see the following error:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:65)
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
at com.advisory.pic.etl.utils.OracleDialect$.<init>(OracleDialect.scala:12)
at com.advisory.pic.etl.utils.OracleDialect$.<clinit>(OracleDialect.scala)
at com.advisory.pic.etl.drivers.BaseDriver.$init$(BaseDriver.scala:19)
at com.advisory.pic.etl.drivers.PASLoadDriver$.<init>(PASLoadDriver.scala:19)
at com.advisory.pic.etl.drivers.PASLoadDriver$.<clinit>(PASLoadDriver.scala)
at com.advisory.pic.etl.drivers.PASLoadDriver.main(PASLoadDriver.scala)
... 6 more
I've read online that this can be caused by a library version mismatch, but I can't find any such offender in build.gradle — other than one testImplementation entry, which I already bumped to the correct version, and I doubt that is the root cause.
Here is the dependencies snippet from the build.gradle file:
dependencies {
    def hadoopClientVersion = '2.7.1'
    def hadoopCommonsVersion = '2.7.1'
    def sparkVersion = '2.4.5'
    def sparkTestingVersion = '2.4.5'
    provided group: 'org.apache.hadoop', name: 'hadoop-client', version: hadoopClientVersion
    provided group: 'org.apache.hadoop', name: 'hadoop-common', version: hadoopCommonsVersion
    implementation("org.apache.spark:spark-sql_2.12:$sparkVersion") {
        // excluding is causing issues when running through IDE - as spark libraries are not available at run-time
        // We can comment while going for deployment if we face jar conflict issues
        //exclude module: 'spark-core_2.10'
        //exclude module: 'spark-catalyst_2.10'
    }
    implementation group: 'org.apache.spark', name: 'spark-core_2.12', version: sparkVersion
    testImplementation group: 'org.apache.spark', name: 'spark-core_2.12', version: sparkVersion
    // spark-sql with avro
    implementation("com.databricks:spark-avro_2.11:4.0.0")
    // joda-time
    implementation 'com.github.nscala-time:nscala-time_2.12:2.22.0'
    // configuration object
    implementation group: 'com.typesafe', name: 'config', version: '1.2.1'
    implementation "ch.qos.logback:logback-classic:1.1.3"
    implementation "org.slf4j:log4j-over-slf4j:1.7.13"
    // Libraries needed for scala api
    implementation 'org.scala-lang:scala-library:2.12.0'
    implementation 'org.scala-lang:scala-compiler:2.12.0'
    testImplementation 'org.scalatest:scalatest_2.12:3.0.5'
    implementation 'com.oracle:ojdbc7:12.1.0.1'
    testImplementation group: 'com.h2database', name: 'h2', version: '1.4.196'
    testImplementation 'com.holdenkarau:spark-testing-base_2.12:' + sparkTestingVersion + '_0.12.0'
    itestCompile 'org.scala-lang:scala-library:2.12.0'
    itestCompile 'org.scalatest:scalatest_2.12:3.0.5'
    testImplementation 'org.scalamock:scalamock_2.12:4.3.0'
}
Any suggestions on why this is happening, and how I can verify a version mismatch?
Accepted answer
I believe this is caused by a mismatch between the Scala version your code was compiled against and the Scala version available at runtime.
Spark 2.4.2 was pre-built with Scala 2.12, but Spark 2.4.5 is pre-built with Scala 2.11, as noted on https://spark.apache.org/downloads.html.
The problem should go away if you switch to the Spark libraries compiled for Scala 2.11:
compile group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.4.5'
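To verify which Scala version is actually on the classpath at runtime (regardless of what build.gradle declares), you can print it from the JVM itself. A minimal sketch — the object name is illustrative; scala.util.Properties.versionString reports the version of the scala-library jar that was actually loaded:

```scala
object ScalaVersionCheck {
  def main(args: Array[String]): Unit = {
    // Prints the version of the scala-library jar loaded at runtime,
    // e.g. "version 2.11.12" on a stock Spark 2.4.5 distribution.
    println(scala.util.Properties.versionString)
  }
}
```

The same expression can be evaluated directly in spark-shell on the cluster: if it reports 2.11.x while your application jars carry _2.12 suffixes, you have confirmed the mismatch behind the scala.Product.$init$ NoSuchMethodError.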
Original question (scala - Upgraded spark version, getting java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V in spark job) on Stack Overflow: https://stackoverflow.com/questions/64270307/