python - Connecting to a Kerberized Hadoop cluster with the Python module impyla

Tags: python python-2.7 hadoop kerberos impyla

I am using the impyla module to connect to a Kerberized Hadoop cluster. I want to access
HiveServer2/Hive, but I get the following error:
test_conn.py

from impala.dbapi import connect
import os
connection_string = 'hdp296m1.XXX.XXX.com'
conn = connect(host=connection_string, port=21050,auth_mechanism="GSSAPI",kerberos_service_name='testuser@Myrealm.COM',password='testuser')
cursor = conn.cursor()
cursor.execute('select count(*) form t_all_types_simple_t')
print cursor.description
results = cursor.fetchall()
Stack trace:
[vagrant@localhost vagrant]$ python test_conn.py
Traceback (most recent call last):
  File "test_conn.py", line 4, in <module>
    conn = connect(host=connection_string, port=21050, auth_mechanism="GSSAPI",kerberos_service_name='testuser@Myrealm.COM',password='testuser')
  File "/usr/lib/python2.7/site-packages/impala/dbapi.py", line 147, in connect
    auth_mechanism=auth_mechanism)
  File "/usr/lib/python2.7/site-packages/impala/hiveserver2.py", line 758, in connect
    transport.open()
  File "/usr/lib/python2.7/site-packages/thrift_sasl/__init__.py", line 61, in open
    self._trans.open()
  File "/usr/lib64/python2.7/site-packages/thrift/transport/TSocket.py", line 101, in open
    message=message)
thrift.transport.TTransport.TTransportException: Could not connect to hdp296m1.XXX.XXX.com:21050
testuser is the Kerberos principal that I will use to run kinit.
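
One way to read the traceback above: the exception is raised from TSocket.open, that is, while the TCP socket is being opened and before any SASL/Kerberos exchange starts. A plain reachability check can therefore help narrow the problem down. The sketch below is only an illustration; the helper name can_connect is hypothetical and not part of impyla or thrift.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts and name resolution failures
        return False
    sock.close()
    return True
```

Calling can_connect('hdp296m1.XXX.XXX.com', 21050) and then the same with port 10000 (assumed here to be the HiveServer2 port, which is its usual default but may differ on a given cluster) would show whether anything is listening on either port.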

Best answer

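Your connection parameters look wrong. Port 21050 is the Impala daemon's default port, while HiveServer2 listens on 10000 by default. Also, kerberos_service_name takes the name of the service (for example hive), not your user principal, and no password is needed with GSSAPI once you have run kinit. Try: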

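from impala.dbapi import connect
import os

# Connection parameters; override them through environment variables if needed
host = os.environ.get("CDH_HIVE", 'x.x.x.x')
# HiveServer2 port, converted to int because environment values are strings
port = int(os.environ.get("CDH_HIVE_port", '10000'))
auth_mechanism = os.environ.get("CDH_auth", 'GSSAPI')
user = 'hive'
db = 'mydb'
# No password: run kinit first to obtain a Kerberos ticket
password = ''
# Kerberos service name: the primary of the HiveServer2 service principal
kbservice = 'hive'


class Hive:

    def __init__(self, db):
        self.database = db
        # Open the connection to HiveServer2 using the ticket from kinit
        self.__conn = connect(host=host,
                              port=port,
                              auth_mechanism=auth_mechanism,
                              user=user,
                              password=password,
                              database=db,
                              kerberos_service_name=kbservice
                              )
        self.__cursor = self.__conn.cursor()


h = Hive(db)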
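
The class above only opens a connection and a cursor. As a rough sketch of how the same settings might be used to actually run a query, the snippet below gathers the parameters in one place and converts the port to an integer. The helper names hive_connect_kwargs and run_query are hypothetical; the environment variable names are the ones used in the answer. It assumes that a valid ticket was already obtained with kinit and that the HiveServer2 service name is hive.

```python
import os


def hive_connect_kwargs(database, env=None):
    """Build keyword arguments for impala.dbapi.connect from the environment."""
    env = os.environ if env is None else env
    return {
        "host": env.get("CDH_HIVE", "x.x.x.x"),
        # Environment values are strings, so convert the port explicitly
        "port": int(env.get("CDH_HIVE_port", "10000")),
        "auth_mechanism": env.get("CDH_auth", "GSSAPI"),
        # Assumption: the HiveServer2 service principal starts with "hive"
        "kerberos_service_name": "hive",
        "database": database,
    }


def run_query(sql, database="mydb"):
    """Run one statement and return all rows (needs impyla and a kinit ticket)."""
    # Imported here so that hive_connect_kwargs works without impyla installed
    from impala.dbapi import connect
    conn = connect(**hive_connect_kwargs(database))
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()
```

For example, run_query('select count(*) from t_all_types_simple_t') would execute the question's query, with `from` spelled correctly.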

For more on python - Connecting to a Kerberized Hadoop cluster with the Python module impyla, see the similar question on Stack Overflow: https://stackoverflow.com/questions/41738526/
