mysql - Spark "Did not find registered driver with class com.mysql.jdbc.Driver"

Tags: mysql jdbc apache-spark pyspark cloudera-cdh

I am using CDH 5.7.0 with PySpark. When I run an action such as RDD.count(), it fails with the error: Did not find registered driver with class com.mysql.jdbc.Driver

Steps to reproduce:

pyspark --driver-class-path /usr/share/java/mysql-connector-java.jar (/usr/share/java/mysql-connector-java.jar exists on every node)
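For reference, `--driver-class-path` only puts the jar on the driver JVM's classpath; executors do not see it. One common way to also ship the jar to executors at launch time is `--jars`. A sketch, assuming the same jar path as above:

```
pyspark --driver-class-path /usr/share/java/mysql-connector-java.jar \
        --jars /usr/share/java/mysql-connector-java.jar
```

With `--jars`, Spark copies the listed jars to the worker nodes and adds them to the executor classpath.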

>>> url = "jdbc:mysql://host/spark?user=root&password=test"
>>> stock_data = sqlContext.read.format("jdbc").option("url", url).option("dbtable", "StockPrices").load()
>>> stock_data.printSchema()
root
 |-- date: string (nullable = true)
 |-- open: double (nullable = true)
 |-- high: double (nullable = true)
 |-- low: double (nullable = true)
 |-- close: double (nullable = true)
 |-- volume: long (nullable = true)
 |-- adjclose: double (nullable = true)
>>> stock_data.count()
......
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Did not find registered driver with class com.mysql.jdbc.Driver
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2$$anonfun$3.apply(JdbcUtils.scala:58)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2$$anonfun$3.apply(JdbcUtils.scala:58)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:57)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:52)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.<init>(JDBCRDD.scala:347)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute(JDBCRDD.scala:339)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

Best Answer

Finally found it... Under Spark's conf directory there is a spark-defaults.conf file. Add a spark.executor.extraClassPath entry pointing at the MySQL connector jar so that the executors can find the driver.
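The fix above can be sketched as lines in conf/spark-defaults.conf (the jar path is the one used in the question; adjust it to wherever the connector lives on your nodes):

```
# conf/spark-defaults.conf
# Make the MySQL connector visible to executor JVMs:
spark.executor.extraClassPath  /usr/share/java/mysql-connector-java.jar
# Optional: set the driver side here too, instead of passing
# --driver-class-path on every pyspark invocation:
spark.driver.extraClassPath    /usr/share/java/mysql-connector-java.jar
```

The schema-printing step succeeded earlier because it only needs the driver JVM; count() fails because the actual JDBC reads run inside the executors, which had no connector on their classpath.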

Regarding mysql - Spark "Did not find registered driver with class com.mysql.jdbc.Driver", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37313564/
