hadoop - How do I left outer join on specific column values in HiveQL?

Tags: hadoop hive hiveql

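When I run the following HiveQL code, I get this error:
Execution error: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:2112 Invalid table alias or column reference 'T3'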

    SELECT *        
    FROM CC_CLAIM_EXT T1
    INNER JOIN CC_EXPOSURE_EXT T2 ON (T1.ID = T2.CLAIMID)
    LEFT OUTER JOIN CC_POLICY_EXT T3 ON (T1.POLICYID = T3.ID)
    LEFT OUTER JOIN CC_COVERAGE_EXT T4 ON (T2.COVERAGEID = T4.ID)
    LEFT OUTER JOIN CC_TRANSACTION_EXT T5 ON (T2.ID = T5.EXPOSUREID)
    LEFT OUTER JOIN CC_TRANSACTIONSET_EXT T6 ON (T5.TRANSACTIONSETID = T6.ID)
    LEFT OUTER JOIN CC_TRANSACTIONLINEITEM_EXT T7 ON (T5.ID = T7.TRANSACTIONID)
    LEFT OUTER JOIN CC_RISKUNIT_EXT T12 ON (T4.RISKUNITID = T12.ID)
    LEFT OUTER JOIN CC_CLASSCODE_EXT T13 ON (T12.CLASSCODEID = T13.ID)

    LEFT OUTER JOIN (SELECT TT12.CLAIMID
                            ,CASE WHEN COUNT(TT13.PRIMARYBODYPART) > 1 THEN 10010 ELSE MAX(TT13.PRIMARYBODYPART) END AS PRIMARYBODYPART
                            ,CASE WHEN COUNT(TT13.DETAILEDBODYPART) > 1 THEN 10010 ELSE MAX(TT13.DETAILEDBODYPART) END AS DETAILEDBODYPART
                    FROM CC_INCIDENT_EXT TT12
                    LEFT OUTER JOIN CC_BODYPART_EXT TT13 ON (TT12.ID = TT13.INCIDENTID)
                    GROUP BY TT12.CLAIMID) T14 
    ON (T1.ID = T14.CLAIMID AND T3.POLICYTYPE IN(10022,10023))

    WHERE T1.STATE IN(2,3)
        AND T2.STATE IN(2,3)
        AND T6.APPROVALSTATUS = 1
        AND T7.RETIRED = 0

    ORDER BY CLAIMNUMBER
        ,EXPOSUREID
        ,TRANSACTIONID

I have narrowed it down to this line:
    ON (T1.ID = T14.CLAIMID AND T3.POLICYTYPE IN(10022,10023))

If I remove:
    AND T3.POLICYTYPE IN(10022,10023)

the code runs fine. Is there a better way to restrict this join in HiveQL?

Best answer

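The error occurs because the ON clause of the left join between T1 and T14 references T3, which is not one of the two tables being joined there. To restrict the query to the specified T3.POLICYTYPE values, put that condition on the left join between T1 and T3 instead. See line 4 below: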

    SELECT *
    FROM CC_CLAIM_EXT T1
    INNER JOIN CC_EXPOSURE_EXT T2 ON (T1.ID = T2.CLAIMID)
    LEFT OUTER JOIN CC_POLICY_EXT T3 ON (T1.POLICYID = T3.ID AND T3.POLICYTYPE IN(10022,10023))
    LEFT OUTER JOIN CC_COVERAGE_EXT T4 ON (T2.COVERAGEID = T4.ID)
    LEFT OUTER JOIN CC_TRANSACTION_EXT T5 ON (T2.ID = T5.EXPOSUREID)
    LEFT OUTER JOIN CC_TRANSACTIONSET_EXT T6 ON (T5.TRANSACTIONSETID = T6.ID)
    LEFT OUTER JOIN CC_TRANSACTIONLINEITEM_EXT T7 ON (T5.ID = T7.TRANSACTIONID)
    LEFT OUTER JOIN CC_RISKUNIT_EXT T12 ON (T4.RISKUNITID = T12.ID)
    LEFT OUTER JOIN CC_CLASSCODE_EXT T13 ON (T12.CLASSCODEID = T13.ID)
    LEFT OUTER JOIN (SELECT TT12.CLAIMID
                            ,CASE WHEN COUNT(TT13.PRIMARYBODYPART) > 1 THEN 10010 ELSE MAX(TT13.PRIMARYBODYPART) END AS PRIMARYBODYPART
                            ,CASE WHEN COUNT(TT13.DETAILEDBODYPART) > 1 THEN 10010 ELSE MAX(TT13.DETAILEDBODYPART) END AS DETAILEDBODYPART
                    FROM CC_INCIDENT_EXT TT12
                    LEFT OUTER JOIN CC_BODYPART_EXT TT13 ON (TT12.ID = TT13.INCIDENTID)
                    GROUP BY TT12.CLAIMID) T14 
    ON T1.ID = T14.CLAIMID
    WHERE T1.STATE IN(2,3)
        AND T2.STATE IN(2,3)
        AND T6.APPROVALSTATUS = 1
        AND T7.RETIRED = 0
    ORDER BY CLAIMNUMBER
        ,EXPOSUREID
        ,TRANSACTIONID
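
One thing to keep in mind: moving the POLICYTYPE condition onto the T3 join is not quite the same as what the original ON clause asked for. In the query above, T14 is joined for every claim, and it is the T3 columns that come back NULL for other policy types. If the goal was to show the T14 body-part values only for policy types 10022 and 10023, one possible alternative is to keep both ON clauses simple and apply the condition in the SELECT list. The following is a minimal sketch, trimmed to the three tables involved and not run against the asker's data; the choice of output columns is an assumption made for illustration.

```sql
-- Sketch only: T2, T4-T7, T12, T13 and the WHERE / ORDER BY clauses are omitted for brevity.
SELECT T1.ID
      ,T3.POLICYTYPE
      -- Expose the T14 values only for the two policy types of interest.
      -- When T3 has no match, POLICYTYPE is NULL and the CASE yields NULL as well.
      ,CASE WHEN T3.POLICYTYPE IN (10022, 10023) THEN T14.PRIMARYBODYPART END AS PRIMARYBODYPART
      ,CASE WHEN T3.POLICYTYPE IN (10022, 10023) THEN T14.DETAILEDBODYPART END AS DETAILEDBODYPART
FROM CC_CLAIM_EXT T1
LEFT OUTER JOIN CC_POLICY_EXT T3 ON (T1.POLICYID = T3.ID)
LEFT OUTER JOIN (SELECT TT12.CLAIMID
                        ,CASE WHEN COUNT(TT13.PRIMARYBODYPART) > 1 THEN 10010 ELSE MAX(TT13.PRIMARYBODYPART) END AS PRIMARYBODYPART
                        ,CASE WHEN COUNT(TT13.DETAILEDBODYPART) > 1 THEN 10010 ELSE MAX(TT13.DETAILEDBODYPART) END AS DETAILEDBODYPART
                 FROM CC_INCIDENT_EXT TT12
                 LEFT OUTER JOIN CC_BODYPART_EXT TT13 ON (TT12.ID = TT13.INCIDENTID)
                 GROUP BY TT12.CLAIMID) T14
ON (T1.ID = T14.CLAIMID)
```

Because T14 is grouped by CLAIMID, it contributes at most one row per claim, so this form should return the same number of rows as a conditional join would; only the T14 columns are blanked out for other policy types.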

Regarding "hadoop - How do I left outer join on specific column values in HiveQL?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49540626/
