scala - Spark JDBC to DashDB (DB2) with CLOB error

Tags: scala jdbc apache-spark dashdb

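I am trying to connect my Spark application to DashDB. At the moment, I can load my data without any problems.

However, I cannot save a DataFrame back to DashDB.

Any insight would be helpful. This is how I load the data: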

  var jdbcSets = sqlContext.read.format("jdbc").options(Map("url" -> url, "driver" -> driver, "dbtable" -> "setsrankval")).load()
  jdbcSets.registerTempTable("setsOpponentRanked")
  jdbcSets = jdbcSets.coalesce(10)
  sqlContext.cacheTable("setsOpponentRanked")

However, when I try to save a large DataFrame, I get this error:

DB2 SQL Error: SQLCODE=-1666, SQLSTATE=42613, SQLERRMC=CLOB, DRIVER=4.19.26

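The code I use to save the data is as follows:
  import java.util.Properties
  import org.apache.spark.sql.SaveMode

  // Connection properties for the JDBC write (credentials redacted)
  val writeproperties = new Properties()
  writeproperties.setProperty("user", "dashXXXX")
  writeproperties.setProperty("password", "XXXXXX")
  writeproperties.setProperty("rowId", "false")
  writeproperties.setProperty("driver", "com.ibm.db2.jcc.DB2Driver")
  // Overwrite mode drops and recreates the target table, so Spark issues the CREATE TABLE
  results.write.mode(SaveMode.Overwrite).jdbc(writeurl, "players_stat_temp", writeproperties)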

A sample row from the test data set looks like this:
println("Test set: "+results.first()) 
Test set: ['Damir DZUMHUR','test','test','test','test','test','test','test','test','test','test','test','test','test','test','test','test','test','test','test','test','test',null,null,null,null,null,null,null]

The DataFrame schema is as follows:
root
 |-- PLAYER: string (nullable = true)
 |-- set01: string (nullable = true)
 |-- set02: string (nullable = true)
 |-- set12: string (nullable = true)
 |-- set01weakseed: string (nullable = true)
 |-- set01medseed: string (nullable = true)
 |-- set01strongseed: string (nullable = true)
 |-- set02weakseed: string (nullable = true)
 |-- set02medseed: string (nullable = true)
 |-- set02strongseed: string (nullable = true)
 |-- set12weakseed: string (nullable = true)
 |-- set12medseed: string (nullable = true)
 |-- set12strongseed: string (nullable = true)
 |-- set01weakrank: string (nullable = true)
 |-- set01medrank: string (nullable = true)
 |-- set01strongrank: string (nullable = true)
 |-- set02weakrank: string (nullable = true)
 |-- set02medrank: string (nullable = true)
 |-- set02strongrank: string (nullable = true)
 |-- set12weakrank: string (nullable = true)
 |-- set12medrank: string (nullable = true)
 |-- set12strongrank: string (nullable = true)
 |-- minibreak: string (nullable = true)
 |-- minibreakweakseed: string (nullable = true)
 |-- minibreakmedseed: string (nullable = true)
 |-- minibreakstrongseed: string (nullable = true)
 |-- minibreakweakrank: string (nullable = true)
 |-- minibreakmedrank: string (nullable = true)
 |-- minibreakstrongrank: string (nullable = true)

I looked at Spark's JDBC DB2Dialect and saw that StringType is mapped to CLOB. I wonder whether the following would help:
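import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcType}
import org.apache.spark.sql.types.{BooleanType, DataType, StringType}

private object DB2CustomDialect extends JdbcDialect {
    // Apply this dialect to every DB2 (and therefore DashDB) JDBC URL
    override def canHandle(url: String): Boolean = url.startsWith("jdbc:db2")
    override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
        // Create string columns as VARCHAR instead of CLOB
        case StringType => Option(JdbcType("VARCHAR(10000)", java.sql.Types.VARCHAR))
        case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
        // Leave every other type to the default mapping
        case _ => None
    }
}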
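
One way to confirm the StringType-to-CLOB mapping described above, without connecting to the database, is to ask Spark which column type its registered dialect picks for a DB2 URL. The following is a minimal sketch; the object name `InspectDb2Mapping` and the example URL are placeholders that are not part of the original question.

```scala
import org.apache.spark.sql.jdbc.JdbcDialects
import org.apache.spark.sql.types.StringType

object InspectDb2Mapping {
  // Placeholder URL: only the "jdbc:db2" prefix matters for dialect lookup,
  // no connection is opened.
  val exampleUrl = "jdbc:db2://example-host:50000/BLUDB"

  // Returns the SQL type Spark would put in CREATE TABLE for a string column,
  // or None if the dialect defers to Spark's common defaults.
  def stringColumnType(jdbcUrl: String): Option[String] =
    JdbcDialects.get(jdbcUrl).getJDBCType(StringType).map(_.databaseTypeDefinition)

  def main(args: Array[String]): Unit =
    println(s"StringType is created as: ${stringColumnType(exampleUrl)}")
}
```

Running the same check again after a custom dialect has been registered should show whether the override is actually in effect.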

Best answer

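Adding the custom dialect works well. Register it once before writing the DataFrame; since DB2CustomDialect is declared as an object, pass it directly without `new`:

JdbcDialects.registerDialect(DB2CustomDialect)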
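
Putting the pieces together, the dialect, its registration, and the write from the question could be arranged as in the sketch below. It assumes the same table name and connection properties as the question; the `SaveToDashDB` object and its `save` method are hypothetical names introduced only for illustration, and the `VARCHAR(10000)` width is simply the value the question proposed.

```scala
import java.util.Properties

import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
import org.apache.spark.sql.types.{BooleanType, DataType, StringType}

// Same mapping as in the question: strings are created as VARCHAR, not CLOB.
object DB2CustomDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:db2")

  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case StringType  => Some(JdbcType("VARCHAR(10000)", java.sql.Types.VARCHAR))
    case BooleanType => Some(JdbcType("CHAR(1)", java.sql.Types.CHAR))
    case _           => None // fall back to the default mapping
  }
}

// Hypothetical wrapper, not part of the original post.
object SaveToDashDB {
  def save(results: DataFrame, writeUrl: String, props: Properties): Unit = {
    // The dialect must be registered before the write, because the
    // CREATE TABLE statement is generated when the write runs.
    JdbcDialects.registerDialect(DB2CustomDialect)
    results.write.mode(SaveMode.Overwrite).jdbc(writeUrl, "players_stat_temp", props)
  }
}
```

The registration is global for the JVM, so every later JDBC write to a `jdbc:db2` URL in the same application picks up the VARCHAR mapping. On Spark 2.2 or later, the `createTableColumnTypes` write option is another way to override the type per column without a custom dialect.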

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/40771799/
