Connecting to Hive with PyHive on CentOS 7 (Kerberos authentication)
1. Install the packages
yum install -y gcc-c++
yum install -y cyrus-sasl-lib
yum install -y cyrus-sasl-devel
yum install -y libgsasl-devel
yum install -y saslwrapper-devel
yum install -y cyrus-sasl-gssapi
yum install -y cyrus-sasl-md5
yum install -y cyrus-sasl-plain
yum install -y krb5-workstation krb5-libs
pip install thrift
pip install thrift_sasl
pip install sasl
pip install pyhive
2. Configure krb5.conf and verify it
Configure /etc/krb5.conf on each client host. Its contents only need to match the file on the KDC server. For example:
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
default_realm = TDH
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
allow_weak_crypto = true
default_tkt_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
default_tgs_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
permitted_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
udp_preference_limit = 1000000
[realms]
TDH = {
kdc = namenode1tdh
kdc = namenode2tdh
admin_server = namenode1tdh
admin_server = namenode2tdh
}
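Before running kinit, it can help to confirm that the realm named in default_realm actually has KDC entries under [realms]. The following is a minimal sketch in Python, not a full krb5.conf parser: the function name summarize_krb5 and the SAMPLE text are illustrative assumptions, and it only understands the simple layout shown above.

```python
import re

# Trimmed copy of the example configuration above (illustrative only).
SAMPLE = """
[libdefaults]
default_realm = TDH
dns_lookup_kdc = false
[realms]
TDH = {
kdc = namenode1tdh
kdc = namenode2tdh
admin_server = namenode1tdh
}
"""


def summarize_krb5(text):
    """Return (default_realm, {realm: [kdc hosts]}) from krb5.conf text."""
    section = None        # current [section] name
    realm = None          # realm block currently open inside [realms]
    default_realm = None
    kdcs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            section, realm = header.group(1), None
            continue
        if section == "libdefaults" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "default_realm":
                default_realm = value
        elif section == "realms":
            opened = re.fullmatch(r"(\S+)\s*=\s*\{", line)
            if opened:
                realm = opened.group(1)
                kdcs[realm] = []
            elif line == "}":
                realm = None
            elif realm and "=" in line:
                key, value = (part.strip() for part in line.split("=", 1))
                if key == "kdc":
                    kdcs[realm].append(value)
    return default_realm, kdcs


if __name__ == "__main__":
    realm, kdcs = summarize_krb5(SAMPLE)
    print(realm, kdcs.get(realm, []))
```

To check a real host, pass the contents of /etc/krb5.conf instead of SAMPLE. An empty KDC list for the default realm usually means the [realms] block is missing or misnamed.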
Verification command:
[root@localhost ~]# kinit etl
Password for etl@TDH: 123456
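A script can also check for a usable ticket before it tries to connect. This is a minimal sketch, assuming the MIT Kerberos klist from krb5-workstation is on the PATH. It relies on klist -s, which prints nothing and exits with status 0 only when the credentials cache is readable and not expired. The function name has_valid_ticket is a hypothetical helper, not part of PyHive.

```python
import subprocess


def has_valid_ticket(klist_cmd="klist"):
    """Return True if `klist -s` reports a readable, unexpired ticket cache."""
    try:
        result = subprocess.run([klist_cmd, "-s"])
    except FileNotFoundError:
        # klist is not installed (or not on PATH): treat as "no ticket".
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if has_valid_ticket():
        print("Kerberos ticket found")
    else:
        print("No valid ticket; run kinit first")
```

Calling this at the top of a job gives a clearer failure than the SASL/GSSAPI error that otherwise appears when the ticket is missing or expired.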
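3. Write the script
from pyhive import hive

# Needs a valid Kerberos ticket (run kinit first).
cursor = hive.Connection(host='10.123.185.31', port=10000, kerberos_service_name='hive', auth='KERBEROS', database='yang').cursor()
# `async` is a reserved word since Python 3.7 (PyHive now spells it async_); a blocking execute is enough here.
cursor.execute('SELECT * FROM tb_user_jifen_m LIMIT 3')
for result in cursor.fetchall():
    print(result)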
The main values to change are:
host: the HiveServer2 hostname or IP address
port: the port number
kerberos_service_name: the Kerberos service name
database: the database name
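Since these four values differ per environment, one option is to read them from environment variables instead of editing the script. The sketch below assumes hypothetical variable names (HIVE_HOST, HIVE_PORT, HIVE_KRB_SERVICE, HIVE_DATABASE) and hypothetical helpers hive_kwargs and fetch_rows. The defaults are the values used in this post. Only hive.Connection, cursor(), execute(), fetchall() and close() come from PyHive.

```python
import os


def hive_kwargs(env=os.environ):
    """Build keyword arguments for hive.Connection from environment variables."""
    return {
        "host": env.get("HIVE_HOST", "10.123.185.31"),
        "port": int(env.get("HIVE_PORT", "10000")),
        "auth": "KERBEROS",
        "kerberos_service_name": env.get("HIVE_KRB_SERVICE", "hive"),
        "database": env.get("HIVE_DATABASE", "yang"),
    }


def fetch_rows(sql, env=os.environ):
    """Run one query and return all rows, closing the connection afterwards."""
    # Imported lazily so hive_kwargs() can be used without PyHive installed.
    from pyhive import hive

    conn = hive.Connection(**hive_kwargs(env))
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(hive_kwargs())
```

With a valid ticket in place, fetch_rows('SELECT * FROM tb_user_jifen_m LIMIT 3') would run the same query as the script above, and setting HIVE_HOST or HIVE_DATABASE switches targets without code changes.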