python - How to connect to Impala using impyla, or to Hive using PyHive?

Tags: python hive impala pyhive impyla

I am trying to connect to Impala using impyla with the following code:

from impala.dbapi import connect
conn = connect(host='host_name.com', port=21050, user='usr', password='pass', use_ssl=True, auth_mechanism='LDAP')
cursor = conn.cursor()
cursor.execute('SHOW DATABASES')
cursor.fetchall()

According to the documentation, the library requires thrift_sasl version 0.2.1, but I cannot install it because pip fails with this error:

Collecting thrift_sasl==0.2.1
  Using cached https://files.pythonhosted.org/packages/80/36/16dfe92d32df63cc2b7b7be8d0e4a736617b7e52daaa7f83ae386a89d179/thrift_sasl-0.2.1.tar.gz
Collecting sasl>=0.2.1 (from thrift_sasl==0.2.1)
  Using cached https://files.pythonhosted.org/packages/8e/2c/45dae93d666aea8492678499e0999269b4e55f1829b1e4de5b8204706ad9/sasl-0.2.1.tar.gz
Collecting thriftpy (from thrift_sasl==0.2.1)
  Using cached https://files.pythonhosted.org/packages/f4/19/cca118cf7d2087310dbc8bd70dc7df0c1320f2652873a93d06d7ba356d4a/thriftpy-0.3.9.tar.gz
Requirement already satisfied: six in c:\users\psowa\appdata\local\programs\python\python36\lib\site-packages (from sasl>=0.2.1->thrift_sasl==0.2.1) (1.12.0)
Requirement already satisfied: ply<4.0,>=3.4 in c:\users\psowa\appdata\local\programs\python\python36\lib\site-packages (from thriftpy->thrift_sasl==0.2.1) (3.11)
Installing collected packages: sasl, thriftpy, thrift-sasl
  Running setup.py install for sasl ... error
    ERROR: Command errored out with exit status 1:
     command: 'c:\users\psowa\appdata\local\programs\python\python36\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-y42wej4x\\sasl\\setup.py'"'"'; __file__='"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-y42wej4x\\sasl\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\psowa\AppData\Local\Temp\pip-record-rjwkcc76\install-record.txt' --single-version-externally-managed --compile
         cwd: C:\Users\psowa\AppData\Local\Temp\pip-install-y42wej4x\sasl\
    Complete output (27 lines):
    running install
    running build
    running build_py
    creating build
    creating build\lib.win-amd64-3.6
    creating build\lib.win-amd64-3.6\sasl
    copying sasl\__init__.py -> build\lib.win-amd64-3.6\sasl
    running egg_info
    writing sasl.egg-info\PKG-INFO
    writing dependency_links to sasl.egg-info\dependency_links.txt
    writing requirements to sasl.egg-info\requires.txt
    writing top-level names to sasl.egg-info\top_level.txt
    reading manifest file 'sasl.egg-info\SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    writing manifest file 'sasl.egg-info\SOURCES.txt'
    copying sasl\saslwrapper.cpp -> build\lib.win-amd64-3.6\sasl
    copying sasl\saslwrapper.h -> build\lib.win-amd64-3.6\sasl
    copying sasl\saslwrapper.pyx -> build\lib.win-amd64-3.6\sasl
    running build_ext
    building 'sasl.saslwrapper' extension
    creating build\temp.win-amd64-3.6
    creating build\temp.win-amd64-3.6\Release
    creating build\temp.win-amd64-3.6\Release\sasl
    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -Isasl -Ic:\users\psowa\appdata\local\programs\python\python36\include -Ic:\users\psowa\appdata\local\programs\python\python36\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\cppwinrt" /EHsc /Tpsasl/saslwrapper.cpp /Fobuild\temp.win-amd64-3.6\Release\sasl/saslwrapper.obj
    saslwrapper.cpp
    c:\users\psowa\appdata\local\temp\pip-install-y42wej4x\sasl\sasl\saslwrapper.h(22): fatal error C1083: Cannot open include file: 'sasl/sasl.h': No such file or directory
    error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2
    ----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\psowa\appdata\local\programs\python\python36\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-y42wej4x\\sasl\\setup.py'"'"'; __file__='"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-y42wej4x\\sasl\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\psowa\AppData\Local\Temp\pip-record-rjwkcc76\install-record.txt' --single-version-externally-managed --compile Check the logs for full command output.

When I install the latest version of thrift_sasl instead, I get this error in Jupyter:

AttributeError: 'TSSLSocket' object has no attribute 'isOpen'

I also tried connecting through PyHive with the following code:

from pyhive import hive

host_name = "host_name.com"
port = 10000
user = "usr"
password = "pass"

def hiveconnection(host_name, port, user, password):
    conn = hive.Connection(host=host_name, port=port, username=user, password=password, auth='LDAP')
    cur = conn.cursor()
    cur.execute('SHOW DATABASES')
    result = cur.fetchall()

    return result

output = hiveconnection(host_name, port, user, password)
print(output)

It asks me to install sasl, but when I try to do that, it shows:

Collecting sasl
  Using cached https://files.pythonhosted.org/packages/8e/2c/45dae93d666aea8492678499e0999269b4e55f1829b1e4de5b8204706ad9/sasl-0.2.1.tar.gz
Requirement already satisfied: six in c:\users\psowa\appdata\local\programs\python\python36\lib\site-packages (from sasl) (1.12.0)
Installing collected packages: sasl
  Running setup.py install for sasl ... error
    ERROR: Command errored out with exit status 1:
     command: 'c:\users\psowa\appdata\local\programs\python\python36\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-9rn_a9g0\\sasl\\setup.py'"'"'; __file__='"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-9rn_a9g0\\sasl\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\psowa\AppData\Local\Temp\pip-record-sedxucba\install-record.txt' --single-version-externally-managed --compile
         cwd: C:\Users\psowa\AppData\Local\Temp\pip-install-9rn_a9g0\sasl\
    Complete output (27 lines):
    running install
    running build
    running build_py
    creating build
    creating build\lib.win-amd64-3.6
    creating build\lib.win-amd64-3.6\sasl
    copying sasl\__init__.py -> build\lib.win-amd64-3.6\sasl
    running egg_info
    writing sasl.egg-info\PKG-INFO
    writing dependency_links to sasl.egg-info\dependency_links.txt
    writing requirements to sasl.egg-info\requires.txt
    writing top-level names to sasl.egg-info\top_level.txt
    reading manifest file 'sasl.egg-info\SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    writing manifest file 'sasl.egg-info\SOURCES.txt'
    copying sasl\saslwrapper.cpp -> build\lib.win-amd64-3.6\sasl
    copying sasl\saslwrapper.h -> build\lib.win-amd64-3.6\sasl
    copying sasl\saslwrapper.pyx -> build\lib.win-amd64-3.6\sasl
    running build_ext
    building 'sasl.saslwrapper' extension
    creating build\temp.win-amd64-3.6
    creating build\temp.win-amd64-3.6\Release
    creating build\temp.win-amd64-3.6\Release\sasl
    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -Isasl -Ic:\users\psowa\appdata\local\programs\python\python36\include -Ic:\users\psowa\appdata\local\programs\python\python36\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17763.0\cppwinrt" /EHsc /Tpsasl/saslwrapper.cpp /Fobuild\temp.win-amd64-3.6\Release\sasl/saslwrapper.obj
    saslwrapper.cpp
    c:\users\psowa\appdata\local\temp\pip-install-9rn_a9g0\sasl\sasl\saslwrapper.h(22): fatal error C1083: Cannot open include file: 'sasl/sasl.h': No such file or directory
    error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2
    ----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\psowa\appdata\local\programs\python\python36\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-9rn_a9g0\\sasl\\setup.py'"'"'; __file__='"'"'C:\\Users\\psowa\\AppData\\Local\\Temp\\pip-install-9rn_a9g0\\sasl\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\psowa\AppData\Local\Temp\pip-record-sedxucba\install-record.txt' --single-version-externally-managed --compile Check the logs for full command output.
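
Both build logs stop at the same line: MSVC reports fatal error C1083 because it cannot open 'sasl/sasl.h', a header from the Cyrus SASL C library that the sasl package compiles against. The rest of the output is ordinary build noise. As a minimal sketch for pulling that line out of a long pip log (the function name find_missing_headers is hypothetical, and the regular expression assumes only the MSVC message format seen above):

```python
import re

# Matches the MSVC "cannot open include file" message, for example:
# fatal error C1083: Cannot open include file: 'sasl/sasl.h': No such file or directory
_C1083 = re.compile(r"fatal error C1083: Cannot open include file: '([^']+)'")


def find_missing_headers(pip_output):
    """Return the headers MSVC failed to find, in order, without duplicates."""
    seen = []
    for header in _C1083.findall(pip_output):
        if header not in seen:
            seen.append(header)
    return seen


if __name__ == "__main__":
    # Two lines shortened from the log above; paste the full pip output instead.
    sample = (
        "    saslwrapper.cpp\n"
        "    saslwrapper.h(22): fatal error C1083: Cannot open include file: "
        "'sasl/sasl.h': No such file or directory\n"
    )
    print(find_missing_headers(sample))
```

A non-empty result means the C extension could not be compiled because a native header is absent, which is a different failure from a Python-level incompatibility.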

Any ideas?

Best answer

Using Python 2.7 fixed this problem. I think there is a compatibility issue.
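
Because the answer attributes the failure to a compatibility problem, it can help to record exactly which interpreter and package versions are in play before and after switching. The following is a minimal sketch, not part of the original answer: the function name report_versions and the package list are illustrative assumptions, and it falls back to setuptools' pkg_resources on interpreters that lack importlib.metadata (such as 3.6 or 2.7).

```python
import sys

try:
    from importlib import metadata as _md  # available on Python 3.8+

    def _version(name):
        try:
            return _md.version(name)
        except _md.PackageNotFoundError:
            return None
except ImportError:  # older interpreters such as 3.6 or 2.7
    import pkg_resources

    def _version(name):
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None


# Distributions mentioned in the question; adjust the list as needed.
PACKAGES = ("impyla", "PyHive", "thrift", "thrift_sasl", "sasl", "thriftpy")


def report_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    report = {"python": "%d.%d.%d" % sys.version_info[:3]}
    for name in packages:
        report[name] = _version(name)
    return report


if __name__ == "__main__":
    for name, version in report_versions().items():
        print("%-12s %s" % (name, version or "not installed"))
```

Running it in both environments and comparing the two reports shows which package versions actually differ between the failing and the working setup.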

Source: the original question on Stack Overflow, "python - How to connect to Impala using impyla, or to Hive using PyHive?": https://stackoverflow.com/questions/57958290/
