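I want to develop a Hadoop application that runs on HDInsight. In my application's driver method, I need to fetch some information from an Azure SQL database. Is it possible to query an Azure SQL database from the driver method of my Hadoop job?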
Best answer
You can access an Azure SQL database with the java.sql classes, but you may need to add the head node's IP address to the database's firewall rules.
package org.microsoft.andrewmoll.SqlExample;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
/**
 * Example driver that reads a value from an Azure SQL database
 * and uses it as the name of the Hadoop job.
 */
public class SQLExample
{
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        //You should put some awesome map logic here
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        //You should put some awesome reducer logic here
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String jobName = getData();
        System.out.println(jobName);
        Job job = Job.getInstance(conf, jobName);
        job.setJarByClass(SQLExample.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
    public static String getData()
    {
        String driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
        /* The JDBC URL needs "//" before the host name; replace the placeholders with your own values */
        String url = "jdbc:sqlserver://<servername>.database.windows.net;DatabaseName=<dbname>";
        String username = "DarthMoll";
        String password = "Luke,Iamnotyourfather";
        try {
            /* Load database driver */
            Class.forName(driver);
            /* Establish the connection and run the query; try-with-resources
               closes the result set, statement and connection automatically */
            try (Connection con = DriverManager.getConnection(url, username, password);
                 PreparedStatement stmt = con.prepareStatement("select top 1 * from dbo.SithWarriors");
                 ResultSet resultset = stmt.executeQuery()) {
                /* Move the cursor to the first row before reading from it */
                if (resultset.next()) {
                    /* Get the user's first name */
                    return resultset.getString("FirstName");
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return "Implement Some Throwable Here";
    }
}
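A SQL Server JDBC URL has the form jdbc:sqlserver://host:port;property=value, and leaving out the "//" before the host name is an easy mistake to make. The following is a minimal sketch of assembling such a URL for an Azure SQL database. The class name AzureSqlUrl, the method build, and the sample names "myserver" and "mydb" are hypothetical and not part of the original answer; port 1433 and the encrypt, trustServerCertificate and loginTimeout properties are assumed to suit a typical Azure SQL setup and may need adjusting.

```java
// Hypothetical helper, not part of the original answer: builds a JDBC URL
// for an Azure SQL database from a server name and a database name.
final class AzureSqlUrl {

    private AzureSqlUrl() {
        // Utility class, not meant to be instantiated.
    }

    // serverName is the short name, i.e. the part before ".database.windows.net".
    static String build(String serverName, String databaseName) {
        if (serverName == null || serverName.isEmpty()) {
            throw new IllegalArgumentException("serverName must not be empty");
        }
        if (databaseName == null || databaseName.isEmpty()) {
            throw new IllegalArgumentException("databaseName must not be empty");
        }
        // Assumed defaults: port 1433 and an encrypted connection.
        return "jdbc:sqlserver://" + serverName + ".database.windows.net:1433"
                + ";databaseName=" + databaseName
                + ";encrypt=true"
                + ";trustServerCertificate=false"
                + ";loginTimeout=30";
    }

    public static void main(String[] args) {
        // Sample values only; substitute your own server and database names.
        System.out.println(build("myserver", "mydb"));
    }
}
```

The returned string can be passed to DriverManager.getConnection(url, username, password) in place of a hand-written URL.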
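If possible, I would suggest storing the data in a blob and accessing it with the Java SDK. That way you do not have to worry about the head node's IP address.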
Regarding azure - "Is it possible to access an Azure SQL database in the driver method of a Hadoop job running in HDInsight?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34234670/