I am trying to create a Hive query from an Oracle SQL query. Essentially, I want to select the first record, sorted in descending order by UPDATED_TM, DATETIME, and ID_NUM.
SELECT
    tbl1.NUM AS ID,
    tbl1.UNIT AS UNIT,
    tbl2.VALUE AS VALUE,
    tbl1.CONTACT AS CONTACT_NAME,
    'FILE' AS SOURCE,
    CURDATE() AS DATE
FROM
    DB1.TBL1 tbl1
    LEFT JOIN DB1.TBL2 tbl2 ON tbl1.USR_ID = tbl2.USR_ID
WHERE
    tbl1.UNIT IS NOT NULL
    AND tbl1.TYPE = 'Generic'
QUALIFY
    ROW_NUMBER() OVER (PARTITION BY tbl1.ROW_ID ORDER BY tbl1.UPDATED_TM DESC, tbl1.DATETIME DESC, tbl1.ID_NUM DESC) = 1
And here is my attempt at an equivalent Hive query (which should also be compatible with SQL):
SELECT
    tbl1.NUM AS ID,
    tbl1.UNIT AS UNIT,
    tbl2.VALUE AS VALUE,
    tbl1.CONTACT AS CONTACT_NAME,
    'FILE' AS SOURCE,
    CURDATE() AS DATE
FROM (
    SELECT
        USR_ID, TYPE, NUM, UNIT,
        ROW_NUMBER() OVER (PARTITION BY tbl1.ROW_ID ORDER BY tbl1.UPDATED_TM DESC, tbl1.DATETIME DESC, tbl1.ID_NUM DESC) AS RNUM
    FROM
        DB1.TBL1
) tbl1
LEFT JOIN DB1.TBL2 tbl2 ON tbl1.USR_ID = tbl2.USR_ID
WHERE
    tbl1.RNUM = 1
    AND tbl1.UNIT IS NOT NULL
    AND tbl1.TYPE = 'Generic'
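Does this look correct? Is there any way to optimize the query? The tables I am working with are very large, and I would like to make it as efficient as possible.
Thanks.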
Best answer
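-- Filter TBL1 first so the window function only ranks rows that can appear in the result.
SELECT
    tbl1.NUM AS ID,
    tbl1.UNIT AS UNIT,
    tbl2.VALUE AS VALUE,
    tbl1.CONTACT AS CONTACT_NAME,
    'FILE' AS SOURCE,
    CURDATE() AS DATE
FROM
(
    -- Number the rows within each ROW_ID, newest first.
    SELECT
        USR_ID, TYPE, NUM, UNIT, CONTACT,
        ROW_NUMBER() OVER (PARTITION BY tbl.ROW_ID ORDER BY tbl.UPDATED_TM DESC, tbl.DATETIME DESC, tbl.ID_NUM DESC) AS RNUM
    FROM
    (
        -- CONTACT is selected here because the outer query references tbl1.CONTACT.
        SELECT
            USR_ID, TYPE, NUM, UNIT, CONTACT, ROW_ID, UPDATED_TM, DATETIME, ID_NUM
        FROM DB1.TBL1
        WHERE UNIT IS NOT NULL
          AND TYPE = 'Generic'
    ) tbl
) tbl1
LEFT OUTER JOIN
    DB1.TBL2 tbl2
    ON tbl1.USR_ID = tbl2.USR_ID
-- Keep only the top-ranked row per ROW_ID.
WHERE tbl1.RNUM = 1;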
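One thing to watch for with the query above: QUALIFY is evaluated after the join and the WHERE clause, so the original query returns at most one row per ROW_ID. Ranking before the join can return several rows per ROW_ID if TBL2 holds more than one row for a given USR_ID. If that can happen in your data, a sketch along these lines ranks after the join instead. It assumes Hive 1.2.0 or later (CURRENT_DATE is used in place of CURDATE()); the alias LOAD_DATE and the subquery name ranked are hypothetical names that do not appear in the question.

```sql
-- Sketch only: ranks after the join, mirroring where QUALIFY runs in the original query.
-- Table and column names come from the question; LOAD_DATE and "ranked" are
-- hypothetical names, and CURRENT_DATE assumes Hive 1.2.0 or later.
SELECT
    ranked.ID,
    ranked.UNIT,
    ranked.VALUE,
    ranked.CONTACT_NAME,
    ranked.SOURCE,
    ranked.LOAD_DATE
FROM (
    SELECT
        tbl1.NUM     AS ID,
        tbl1.UNIT    AS UNIT,
        tbl2.VALUE   AS VALUE,
        tbl1.CONTACT AS CONTACT_NAME,
        'FILE'       AS SOURCE,
        CURRENT_DATE AS LOAD_DATE,
        -- The WHERE clause below is applied before this ranking is computed.
        ROW_NUMBER() OVER (
            PARTITION BY tbl1.ROW_ID
            ORDER BY tbl1.UPDATED_TM DESC, tbl1.DATETIME DESC, tbl1.ID_NUM DESC
        ) AS RNUM
    FROM DB1.TBL1 tbl1
    LEFT JOIN DB1.TBL2 tbl2
        ON tbl1.USR_ID = tbl2.USR_ID
    WHERE tbl1.UNIT IS NOT NULL
      AND tbl1.TYPE = 'Generic'
) ranked
-- Keep only the top-ranked joined row per ROW_ID.
WHERE ranked.RNUM = 1;
```

If USR_ID is unique in TBL2, both forms should return the same rows, and ranking before the join is likely the cheaper of the two because the join then only sees one row per ROW_ID.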
Regarding a Hive equivalent of the SQL QUALIFY clause, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29912918/