azure - Jackson conflict in Apache Spark when used with the Azure Java SDK

Tags: azure scala apache-spark azure-synapse

Azure Synapse publishes the jars available in its runtime here. I am currently using the Apache Spark 3.1 runtime.

My project also depends on version 1.4.0 of azure-eventgrid (which pulls in azure-core).
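
For context, a minimal build.sbt sketch of this kind of setup; the Spark coordinates, the Provided scoping, and the Event Grid group id below are assumptions about a typical Synapse project, not taken from the question:

// Hypothetical build.sbt excerpt; coordinates are illustrative placeholders
libraryDependencies ++= Seq(
  // Spark 3.1 is supplied by the Synapse runtime, so it is not packaged into the job jar
  "org.apache.spark" %% "spark-core" % "3.1.2" % Provided,
  "org.apache.spark" %% "spark-sql"  % "3.1.2" % Provided,
  // Event Grid client; transitively pulls in com.azure:azure-core and its Jackson dependencies
  "com.azure" % "azure-eventgrid" % "1.4.0"
)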

The job runs fine locally, but when it is deployed on Synapse it fails with the error below.

21/11/29 17:38:00 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.LinkageError: Package versions: jackson-annotations=2.10.0, jackson-core=2.10.0, jackson-databind=2.10.0, jackson-dataformat-xml=2.12.5, jackson-datatype-jsr310=2.12.5, azure-core=1.19.0, Troubleshooting version conflicts: https://aka.ms/azsdk/java/dependency/troubleshoot
    at com.azure.core.implementation.jackson.ObjectMapperShim.createXmlMapper(ObjectMapperShim.java:73)
    at com.azure.core.util.serializer.JacksonAdapter.<init>(JacksonAdapter.java:81)
    at com.azure.core.util.serializer.JacksonAdapter.<init>(JacksonAdapter.java:58)
    at com.azure.core.util.serializer.JacksonAdapter$SerializerAdapterHolder.<clinit>(JacksonAdapter.java:113)
    at com.azure.core.util.serializer.JacksonAdapter.createDefaultSerializerAdapter(JacksonAdapter.java:122)
    at com.azure.identity.implementation.IdentityClient.<init>(IdentityClient.java:100)
    at com.azure.identity.implementation.IdentityClientBuilder.build(IdentityClientBuilder.java:139)
    at com.azure.identity.ManagedIdentityCredential.<init>(ManagedIdentityCredential.java:70)
    at com.azure.identity.DefaultAzureCredentialBuilder.getCredentialsChain(DefaultAzureCredentialBuilder.java:129)
    at com.azure.identity.DefaultAzureCredentialBuilder.build(DefaultAzureCredentialBuilder.java:123)
    at com.xxxxxxxxxxx.$anonfun$sendEvents$1$adapted(xxxxxxxGridSender.scala:25)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at xxxxxx.xxxxxx.xxxxxx.xxxxxx.xxxxxx.xxxxxx.sendEvents(xxxxxx.scala:25)
    at scala.collection.Iterator.foreach(Iterator.scala:941)
    at scala.collection.Iterator.foreach$(Iterator.scala:941)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
    at scala.collection.IterableLike.foreach(IterableLike.scala:74)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    at xxxxxx.xxxxxx.xxxxxx.xxxxxx.runner.xxxxxx.xxxxxx(xxxxxx.scala:82)
    at xxxxxx.xxxxxx.xxxxxx.xxxxxx.xxxxxx.xxxxxx.xxxxxx(xxxxxx.scala:61)
    at xxxxxx.xxxxxx.xxxxxx.xxxxxx.xxxxxx.xxxxxx.$anonfun$start$2(xxxxxx.scala:39)
    at scala.collection.TraversableLike$WithFilter.$anonfun$map$2(TraversableLike.scala:827)
    at scala.collection.Iterator.foreach(Iterator.scala:941)
    at scala.collection.Iterator.foreach$(Iterator.scala:941)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
    at scala.collection.IterableLike.foreach(IterableLike.scala:74)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:826)
    at xxxxxx.xxxxxx.xxxxxx.xxxxxx.x.xxxxx.xxxxxx.start(xxxxxx.scala:36)
    at xxxxxx.xxxxxx.xxxxxx.xxxxxx.xxxxxx$.main(xxxxxx.scala:29)
    at xxxxxx.xxxxxx.aiops.xxxxxx.xxxxxx.main(xxxxxx.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:732)
Caused by: java.lang.NoSuchMethodError: com.fasterxml.jackson.dataformat.xml.XmlMapper.coercionConfigDefaults()Lcom/fasterxml/jackson/databind/cfg/MutableCoercionConfig;
    at com.fasterxml.jackson.dataformat.xml.XmlMapper.<init>(XmlMapper.java:176)
    at com.fasterxml.jackson.dataformat.xml.XmlMapper.<init>(XmlMapper.java:145)
    at com.fasterxml.jackson.dataformat.xml.XmlMapper.<init>(XmlMapper.java:127)
    at com.fasterxml.jackson.dataformat.xml.XmlMapper.builder(XmlMapper.java:218)
    at com.azure.core.implementation.jackson.ObjectMapperFactory.createXmlMapper(ObjectMapperFactory.java:84)
    at com.azure.core.implementation.jackson.ObjectMapperShim.createXmlMapper(ObjectMapperShim.java:70)
    ... 45 more

Best Answer

Synapse ships its own jars as part of its runtime. Project dependencies need to be compatible with the jars available on that runtime.

There are two parts to this:

  1. azure-core pulls in the Jackson 2.12 series, while Apache Spark 3.1 is still on the 2.10 series (a quick way to check the resolved versions is sketched right after this list).
  2. azure-core is already on the Synapse classpath (version 1.16.0). Any Azure library you bring in (and the azure-core it drags along as a dependency) therefore needs to be compatible with azure-core 1.16.0.
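
Before overriding anything, it helps to confirm which versions actually resolve in the project. A minimal sketch using sbt's built-in tasks (evicted is always available; dependencyTree ships with sbt 1.4+, older versions need the sbt-dependency-graph plugin):

sbt evicted          # report dependencies whose versions were bumped by conflict resolution
sbt dependencyTree   # print the resolved tree; look for com.fasterxml.jackson.* and com.azure:azure-core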

To fix (1), I added the following:

object DependencyOverrides {

  /**
   * We do not have any direct dependency on Jackson. Spark relies on the 2.10 series, while the Azure Core SDK depends on 2.12.
   * In order to resolve the conflict, we explicitly pin the Jackson dependencies here to 2.10.0.
   */
  val jackson = Seq(
    "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.10.0",
    "com.fasterxml.jackson.core" % "jackson-core" % "2.10.0",
    "com.fasterxml.jackson.core" % "jackson-annotations" % "2.10.0",
    "com.fasterxml.jackson.core" % "jackson-databind" % "2.10.0",
    "com.fasterxml.jackson.dataformat" % "jackson-dataformat-xml" % "2.10.0",
    "com.fasterxml.jackson.datatype" % "jackson-datatype-jsr310" % "2.10.0",
  )

  val others = Seq(
    "com.google.guava" % "guava" % "27.0-jre"
  )

  val all = jackson ++ others
}

And override the above dependencies in SBT (dependencyOverrides pins the versions of transitive dependencies without adding them as direct dependencies):

dependencyOverrides ++= DependencyOverrides.all

To fix (2), additionally add the relevant jars to the others sequence above:

  val others = Seq(
    "com.azure" % "azure-core" % "1.16.0",
    "com.azure" % "azure-core-http-netty" % "1.6.2",
    "com.google.guava" % "guava" % "27.0-jre"
  )

In my case, adding azure-core alone was not enough; azure-core-http-netty and guava also had to be pinned.
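
When a LinkageError like the one above persists, it can also help to confirm at runtime which jar actually wins on the Synapse classpath. A small sketch (the helper below is illustrative, not part of the original answer) that can be called from the Spark job:

import com.fasterxml.jackson.core.json.PackageVersion
import com.fasterxml.jackson.dataformat.xml.XmlMapper

// Illustrative helper: print the Jackson version on the classpath and the jars
// that XmlMapper and azure-core's JacksonAdapter were loaded from, to verify
// that the dependency overrides actually took effect on the cluster.
object ClasspathCheck {
  private def location(cls: Class[_]): String =
    Option(cls.getProtectionDomain.getCodeSource)
      .map(_.getLocation.toString)
      .getOrElse("unknown (bootstrap or synthetic class loader)")

  def report(): Unit = {
    println(s"jackson-core version      : ${PackageVersion.VERSION}")
    println(s"XmlMapper loaded from     : ${location(classOf[XmlMapper])}")
    println(s"JacksonAdapter loaded from: ${location(Class.forName("com.azure.core.util.serializer.JacksonAdapter"))}")
  }
}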

Regarding "azure - Jackson conflict in Apache Spark when used with the Azure Java SDK", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/70738081/
