hadoop - KrbException connecting to Hadoop cluster with Zookeeper client - UNKNOWN_SERVER

Tags: hadoop apache-zookeeper kerberos gssapi

My Zookeeper client is unable to connect to the Hadoop cluster.

This works fine from a Linux VM, but I am using a Mac.

I set the -Dsun.security.krb5.debug=true flag on the JVM and got the following output:

Found ticket for solr@DDA.MYCO.COM to go to krbtgt/DDA.MYCO.COM@DDA.MYCO.COM expiring on Sat Apr 29 03:15:04 BST 2017
Entered Krb5Context.initSecContext with state=STATE_NEW
Found ticket for solr@DDA.MYCO.COM to go to krbtgt/DDA.MYCO.COM@DDA.MYCO.COM expiring on Sat Apr 29 03:15:04 BST 2017
Service ticket not found in the subject
>>> Credentials acquireServiceCreds: same realm
Using builtin default etypes for default_tgs_enctypes
default etypes for default_tgs_enctypes: 17 16 23.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.Aes128CtsHmacSha1EType
>>> KrbKdcReq send: kdc=oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com UDP:88, timeout=30000, number of retries =3, #bytes=682
>>> KDCCommunication: kdc=oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com UDP:88, timeout=30000,Attempt =1, #bytes=682
>>> KrbKdcReq send: #bytes read=217
>>> KdcAccessibility: remove oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com
>>> KDCRep: init() encoding tag is 126 req type is 13
>>>KRBError:
     cTime is Thu Dec 24 11:18:15 GMT 2015 1450955895000
     sTime is Fri Apr 28 15:15:06 BST 2017 1493388906000
     suSec is 925863
     error code is 7
     error Message is Server not found in Kerberos database
     cname is solr@DDA.MYCO.COM
     sname is zookeeper/oc-10-252-132-160.nat-ucfc2z3b.usdv1.mycloud.com@DDA.MYCO.COM
     msgType is 30
KrbException: Server not found in Kerberos database (7) - UNKNOWN_SERVER
    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
    at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
    at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
    at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
    at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
    at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
    at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:693)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    at org.apache.zookeeper.client.ZooKeeperSaslClient$2.run(ZooKeeperSaslClient.java:366)
    at org.apache.zookeeper.client.ZooKeeperSaslClient$2.run(ZooKeeperSaslClient.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:362)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:348)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.sendSaslPacket(ZooKeeperSaslClient.java:420)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.initialize(ZooKeeperSaslClient.java:458)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1057)
Caused by: KrbException: Identifier doesn't match expected value (906)
    at sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
    at sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
    at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
    ... 18 more
ERROR   2017-04-28 15:15:07,046 5539    org.apache.zookeeper.client.ZooKeeperSaslClient [main-SendThread(oc-10-252-132-160.nat-ucfc2z3b.usdv1.mycloud.com:2181)]    
An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed 
[Caused by GSSException: No valid credentials provided 
(Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]) 
occurred when evaluating Zookeeper Quorum Member's  received SASL token. 
This may be caused by Java's being unable to resolve the Zookeeper Quorum Member's hostname correctly. 
You may want to try to adding '-Dsun.net.spi.nameservice.provider.1=dns,sun' to your client's JVMFLAGS environment. 
Zookeeper Client will go to AUTH_FAILED state.
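
For reference, the debug flag used above can also be switched on from code rather than on the command line. This is only a minimal sketch: the class name KrbDebugSetup is hypothetical, and it assumes the property is set before any Kerberos or JGSS class is loaded, since the JDK appears to read the flag once at class initialisation.

```java
// Hypothetical helper: enables the JDK's Kerberos debug output programmatically.
class KrbDebugSetup {
    /** Has the same effect as passing -Dsun.security.krb5.debug=true to the JVM. */
    static void enableKerberosDebug() {
        System.setProperty("sun.security.krb5.debug", "true");
    }

    public static void main(String[] args) {
        // Assumption: this runs before the ZooKeeper client (and therefore
        // the JDK Kerberos classes) is touched for the first time.
        enableKerberosDebug();
        System.out.println("sun.security.krb5.debug="
                + System.getProperty("sun.security.krb5.debug"));
    }
}
```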

I have tested the Kerberos configuration as follows:

>kinit -kt /etc/security/keytabs/solr.headless.keytab solr
>klist
Credentials cache: API:3451691D-7D5E-49FD-A27C-135816F33E4D
        Principal: solr@DDA.MYCO.COM

  Issued                Expires               Principal
Apr 28 16:58:02 2017  Apr 29 04:58:02 2017  krbtgt/DDA.MYCO.COM@DDA.MYCO.COM

Following the instructions from Hortonworks, I managed to store the Kerberos ticket in a file:

 >klist -c FILE:/tmp/krb5cc_501
Credentials cache: FILE:/tmp/krb5cc_501
    Principal: solr@DDA.MYCO.COM

Issued                Expires               Principal
Apr 28 17:10:25 2017  Apr 29 05:10:25 2017  krbtgt/DDA.MYCO.COM@DDA.MYCO.COM

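I also tried the JVM option suggested in the stack trace (-Dsun.net.spi.nameservice.provider.1=dns,sun), but this led to a different error along the lines of "Client session timed out", which suggests that this JVM parameter prevents the client from connecting properly in the first place.

==EDIT==

It seems the Mac's version of Kerberos is not the latest: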

> krb5-config --version
Kerberos 5 release 1.7-prerelease

I just tried brew install krb5 to install a newer version, then adjusted the path to point to the new version.

> krb5-config --version
Kerberos 5 release 1.15.1

This made no difference to the outcome.

Note that this works fine on a Linux VM using exactly the same jaas.conf, keytab files and krb5.conf as on my Mac.

krb5.conf:

[libdefaults]
  renew_lifetime = 7d
  forwardable = true
  default_realm = DDA.MYCO.COM
  ticket_lifetime = 24h
  dns_lookup_realm = false
  dns_lookup_kdc = false


[realms]
  DDA.MYCO.COM = {
    admin_server = oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com
    kdc = oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com
  }

Reverse DNS: I checked that the FQDN hostname I am connecting to can be found with a reverse DNS lookup:

> host 10.252.132.160
160.132.252.10.in-addr.arpa domain name pointer oc-10-252-132-160.nat-ucfc2z3b.usdv1.mycloud.com.

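This matches exactly the Linux VM's response to the same command.

===WIRESHARK ANALYSIS===

Using Wireshark configured to use the system keytabs gives more detail in the analysis.

Here I found that the failing exchange looks like this: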
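
The host command and the JVM do not necessarily use the same resolver, so it can help to ask the JVM directly which name it derives for an address. The following is a rough sketch rather than a definitive diagnostic: the class name JvmReverseLookup is hypothetical, and the default address is simply one of the node IPs shown above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic: prints the name the JVM itself resolves for an address.
class JvmReverseLookup {
    /** Returns the canonical host name the JVM resolves for an IP address or host name. */
    static String canonicalName(String ipOrHost) {
        try {
            InetAddress address = InetAddress.getByName(ipOrHost);
            // Returns the textual IP address if no reverse mapping can be found.
            return address.getCanonicalHostName();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + ipOrHost, e);
        }
    }

    public static void main(String[] args) {
        // Default: one of the ZooKeeper node addresses from the question.
        String target = args.length > 0 ? args[0] : "10.252.132.160";
        System.out.println(target + " -> " + canonicalName(target));
    }
}
```

Running this on both the Mac and the Linux VM and comparing the printed names should show whether the two JVMs see different reverse mappings, even when the host command agrees on both machines.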

client -> host  AS-REQ
host -> client  AS-REP 
client -> host  AS-REQ
host -> client  AS-REP 
client -> host TGS-REQ  <-- this call is detailed below
host -> client KRB error KRB5KDC_ERR_S_PRINCIPAL_UNKNOWN

The failing TGS-REQ call looks like this:

Kerberos
tgs-req
    pvno: 5
    msg-type: krb-tgs-req (12)
    padata: 1 item
    req-body
        Padding: 0
        kdc-options: 40000000 (forwardable)
        realm: DDA.MYCO.COM
        sname
            name-type: kRB5-NT-UNKNOWN (0)
            sname-string: 2 items
                SNameString: zookeeper
                SNameString: oc-10-252-134-51.nat-ucfc2z3b.usdv1.mycloud.com
        till: 1970-01-01 00:00:00 (UTC)
        nonce: 797021964
        etype: 3 items
            ENCTYPE: eTYPE-AES128-CTS-HMAC-SHA1-96 (17)
            ENCTYPE: eTYPE-DES3-CBC-SHA1 (16)
            ENCTYPE: eTYPE-ARCFOUR-HMAC-MD5 (23)

Here is the corresponding successful call from the Linux box, which is followed by several more exchanges.

Kerberos
    tgs-req
        pvno: 5
        msg-type: krb-tgs-req (12)
        padata: 1 item
        req-body
            Padding: 0
            kdc-options: 40000000 (forwardable)
            realm: DDA.MYCO.COM
            sname
                name-type: kRB5-NT-UNKNOWN (0)
                sname-string: 2 items
                    SNameString: zookeeper
                    SNameString: d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com
            till: 1970-01-01 00:00:00 (UTC)
            nonce: 681936272
            etype: 3 items
                ENCTYPE: eTYPE-AES128-CTS-HMAC-SHA1-96 (17)
                ENCTYPE: eTYPE-DES3-CBC-SHA1 (16)
                ENCTYPE: eTYPE-ARCFOUR-HMAC-MD5 (23)

So it looks like the client is sending

oc-10-252-134-51.nat-ucfc2z3b.usdv1.mycloud.com

as the server host, when it should be sending:

d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com

So the question is, how do I fix this? Bear in mind that this is a piece of Java code.

My /etc/hosts has the following:
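
To make the mismatch concrete: the KDC looks up the full service principal, so a different host part means a different principal. The sketch below only illustrates that comparison using the two names from the captures; the class ServicePrincipalCheck and its method are hypothetical and are not part of the ZooKeeper client.

```java
// Hypothetical helper: compares the requested and the expected service principal.
class ServicePrincipalCheck {
    /** Builds a Kerberos service principal of the form service/host@REALM. */
    static String servicePrincipal(String service, String host, String realm) {
        return service + "/" + host + "@" + realm;
    }

    public static void main(String[] args) {
        String realm = "DDA.MYCO.COM";
        // Host name seen in the failing TGS-REQ from the Mac.
        String requested = servicePrincipal("zookeeper",
                "oc-10-252-134-51.nat-ucfc2z3b.usdv1.mycloud.com", realm);
        // Host name seen in the successful TGS-REQ from the Linux box.
        String expected = servicePrincipal("zookeeper",
                "d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com", realm);
        System.out.println("requested: " + requested);
        System.out.println("expected:  " + expected);
        System.out.println("same principal: " + requested.equals(expected));
    }
}
```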

10.252.132.160 b3e073.ddapoc.ucfc2z3b.usdv1.mycloud.com
10.252.134.51  d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com
10.252.132.139 d7cc18.ddapoc.ucfc2z3b.usdv1.mycloud.com

My krb5.conf file has:

kdc = d7cc18.ddapoc.ucfc2z3b.usdv1.mycloud.com
kdc = b3e073.ddapoc.ucfc2z3b.usdv1.mycloud.com
kdc = d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com

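I tried adding -Dsun.net.spi.nameservice.provider.1=file,dns as a JVM parameter but got the same result.

Best answer

I fixed this by setting up a local dnsmasq instance to serve the forward and reverse DNS lookups.

Now from the command line, host d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com returns 10.252.134.51

See also here and here.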
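
To confirm from the Java side that forward and reverse lookups now agree, something like the following could be run on the client machine. It is a sketch under assumptions: the class ForwardReverseCheck and its methods are hypothetical, and it only checks that a host name resolves to an address whose canonical name is the same host name.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical check: does a host name survive a forward plus reverse lookup?
class ForwardReverseCheck {
    /** Lower-cases a host name and drops a trailing dot so that names compare cleanly. */
    static String normalize(String host) {
        String trimmed = host.endsWith(".") ? host.substring(0, host.length() - 1) : host;
        return trimmed.toLowerCase(Locale.ROOT);
    }

    /** True if host -> address -> canonical name leads back to the same host name. */
    static boolean roundTrips(String host) {
        try {
            InetAddress forward = InetAddress.getByName(host); // forward lookup
            String reverse = forward.getCanonicalHostName();    // reverse lookup
            return normalize(host).equals(normalize(reverse));
        } catch (UnknownHostException e) {
            return false; // the forward lookup itself failed
        }
    }

    public static void main(String[] args) {
        // Default: the host name the Linux box used successfully.
        String host = args.length > 0 ? args[0] : "d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com";
        System.out.println(host + " round-trips: " + roundTrips(host));
    }
}
```

If this prints false for a ZooKeeper node, the JVM is still seeing a different reverse name than the one the service principal was registered under.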

Regarding hadoop - KrbException connecting to Hadoop cluster with Zookeeper client - UNKNOWN_SERVER, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43685086/
