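hadoop - Setting environment variables at runtime. Google BigQuery

I am writing a Google BigQuery connector for Spark, and it uses the Google Hadoop connector underneath.

Currently, the Google Hadoop connector requires a Google environment variable that points to the credentials JSON file.

Setting this can be annoying when you launch clusters outside the Dataproc world.

Is it bad practice to set it at runtime in code? Or is there a workaround that tells the Hadoop connector to ignore the environment variable, since the keyfile is already set in the "fs.gs.auth.service.account.json.keyfile" Hadoop configuration?

Dennis, since you are a contributor to the project, perhaps you can help this time as well?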
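
For reference, the configuration route mentioned in the question could look roughly like the sketch below. It only builds the properties: the `spark.hadoop.` prefix is Spark's convention for forwarding a setting into the Hadoop configuration, while the object name `GcsKeyfileConf`, the helper `gcsKeyfileConf`, and the keyfile path are made up for illustration. Whether the connector then still insists on the environment variable is exactly what is being asked here.

```scala
// Hypothetical helper (not part of any library): builds the Spark properties
// that forward the keyfile location into the Hadoop configuration.
object GcsKeyfileConf {
  // Hadoop configuration key quoted in the question.
  val KeyfileKey = "fs.gs.auth.service.account.json.keyfile"

  // Spark copies every "spark.hadoop.*" property into the Hadoop configuration.
  def gcsKeyfileConf(keyfilePath: String): Map[String, String] =
    Map(s"spark.hadoop.$KeyfileKey" -> keyfilePath)
}

// Usage with Spark on the classpath (assumed, not executed here):
//   val conf = new org.apache.spark.SparkConf()
//     .setAll(GcsKeyfileConf.gcsKeyfileConf("/path/to/creds.json"))
```

Keeping the properties in a plain `Map` means the same values can also be passed on the command line as `--conf` entries.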

Best answer

For those interested, I simply set them at runtime using the following gist, in Scala:

https://gist.github.com/jaytaylor/770bc416f0dd5954cf0f

But here is the code, in case the gist goes offline:

import java.util.{Collections, Map => JavaMap}
// On Scala 2.13+ scala.jdk.CollectionConverters._ is the non-deprecated equivalent.
import scala.collection.JavaConverters._

trait EnvHacker {
    /**
     * Portable method for setting env vars on both *nix and Windows.
     * @see http://stackoverflow.com/a/7201825/293064
     */
    def setEnv(newEnv: Map[String, String]): Unit = {
        try {
            // First attempt: patch the two static maps inside ProcessEnvironment.
            val processEnvironmentClass = Class.forName("java.lang.ProcessEnvironment")
            val theEnvironmentField = processEnvironmentClass.getDeclaredField("theEnvironment")
            theEnvironmentField.setAccessible(true)
            val env = theEnvironmentField.get(null).asInstanceOf[JavaMap[String, String]]
            env.putAll(newEnv.asJava)
            val theCaseInsensitiveEnvironmentField = processEnvironmentClass.getDeclaredField("theCaseInsensitiveEnvironment")
            theCaseInsensitiveEnvironmentField.setAccessible(true)
            val cienv = theCaseInsensitiveEnvironmentField.get(null).asInstanceOf[JavaMap[String, String]]
            cienv.putAll(newEnv.asJava)
        } catch {
            case e: NoSuchFieldException =>
                // Fallback when one of those fields is missing: reach into the map
                // that backs the unmodifiable view returned by System.getenv().
                try {
                    val classes = classOf[Collections].getDeclaredClasses()
                    val env = System.getenv()
                    for (cl <- classes) {
                        if (cl.getName() == "java.util.Collections$UnmodifiableMap") {
                            val field = cl.getDeclaredField("m")
                            field.setAccessible(true)
                            val obj = field.get(env)
                            val map = obj.asInstanceOf[JavaMap[String, String]]
                            // clear() drops the existing entries, so after this
                            // branch only the variables in newEnv remain visible.
                            map.clear()
                            map.putAll(newEnv.asJava)
                        }
                    }
                } catch {
                    case e2: Exception => e2.printStackTrace()
                }

            case e1: Exception => e1.printStackTrace()
        }
    }
}
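
Since the fallback branch of `setEnv` calls `clear()` before `putAll`, passing only the new variable can drop everything else from the visible environment. A minimal sketch of one way around that is to merge the current environment with the extra variables first. The names `EnvMerge` and `mergedEnv` are hypothetical, and `GOOGLE_APPLICATION_CREDENTIALS` is assumed to be the Google variable the question refers to.

```scala
import scala.collection.JavaConverters._

// Hypothetical helper (not part of the gist).
object EnvMerge {
  // Returns the current environment plus the extra variables;
  // on a name clash the extra value wins.
  def mergedEnv(extra: Map[String, String]): Map[String, String] =
    System.getenv().asScala.toMap ++ extra
}

// Usage together with the EnvHacker trait (not executed here):
//   object Job extends EnvHacker {
//     setEnv(EnvMerge.mergedEnv(
//       Map("GOOGLE_APPLICATION_CREDENTIALS" -> "/path/to/creds.json")))
//   }
```

On JDK 16 and later, the reflective access inside `setEnv` is typically blocked unless the JVM is started with `--add-opens java.base/java.lang=ALL-UNNAMED` and `--add-opens java.base/java.util=ALL-UNNAMED`. In that case the call only prints a stack trace and the environment stays unchanged.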

Original question on Stack Overflow: https://stackoverflow.com/questions/42344121/
