mysql - Join error when using CTEs in Hive

Tags: mysql sql hive

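I have two queries that do almost the same thing. The first uses a single CTE that is filtered in subqueries; the second uses a separate, pre-aggregated CTE for each message class. I can't figure out why the second query returns no results at all, while the first one does.

For the past two hours I have been trying various joins to solve this, but the same joins that work in query 1 do not work in query 2. I hope someone can point me in the right direction.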

First query (returns results):

WITH MessageCTE AS 
    (
    SELECT dt
    , id
    , ts
    , family
    , message_type
    , to_user
    , message_id
    , class
    FROM dhruv.MessageLatencyInformation_20171210_20171125_to_20171130_02 as latencydata
    INNER JOIN dhruv.UsersOn503AndAbove_20171201_200k as required_users
    ON latencydata.to_user = required_users.user_id
    )
SELECT COUNT(DISTINCT to_user) AS Users
, AVG(latency) AS AvgLatency
, AVG(CASE WHEN latency > 0 THEN latency ELSE NULL END) AS AvgLatency_Positive
, PERCENTILE(latency, 0.5) AS 50Percentile
, PERCENTILE(latency, 0.75) AS 75Percentile
, PERCENTILE(latency, 0.8) AS 80Percentile
, PERCENTILE(latency, 0.9) AS 90Percentile
, PERCENTILE(latency, 0.95) AS 95Percentile
, PERCENTILE(latency, 0.99) AS 99Percentile
FROM
    (
    SELECT latency_pb.dt, latency_pb.to_user, (latency_dl.ts - latency_pb.ts) as latency
    FROM 
        (
        SELECT dt
        , id, ts
        , family
        , message_type
        , to_user
        , message_id
        , class
        FROM MessageCTE
        WHERE class = 'pb'
        ) as latency_pb
    INNER JOIN 
        (SELECT dt
        , id
        , ts
        , family
        , message_type
        , to_user
        , message_id
        , class
        FROM MessageCTE
        WHERE class = 'rdl'
        AND family = 'stm'
        ) as latency_rdl
    ON latency_pb.dt = latency_rdl.dt and latency_pb.to_user = latency_rdl.to_user and latency_pb.id = latency_rdl.id
    INNER JOIN
        (
        SELECT dt
        , id
        , ts
        , family
        , message_type
        , to_user
        , message_id
        , class
        FROM MessageCTE
        WHERE class = 'dl'
        ) as latency_dl
    ON latency_rdl.dt = latency_dl.dt and latency_rdl.to_user = latency_dl.to_user and latency_rdl.id = latency_dl.id) AS UserLatency;

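First query output: [image: First Query Output]

The second query is a slight modification with all the same conditions, but for some reason it returns no matches. I have tried several join variations and still can't see why nothing matches.

Second query: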

WITH MessageCTE_pb AS 
    (
    SELECT dt, id, ts, to_user
    FROM 
        (
        SELECT dt, id, min(ts) as ts, to_user
        FROM dhruv.MessageLatencyInformation_20171210_20171125_to_20171130_02
        WHERE class = 'pb'
        GROUP BY dt, to_user, id
        ) as latencydata
    INNER JOIN dhruv.UsersOn503AndAbove_20171201_200k as required_users
    ON latencydata.to_user = required_users.user_id
    )
, MessageCTE_dl AS 
    (
    SELECT dt, id, ts, to_user
    FROM
        (
        SELECT dt, id, max(ts) as ts, to_user 
        FROM dhruv.MessageLatencyInformation_20171210_20171125_to_20171130_02 
        WHERE class = 'dl' 
        GROUP BY dt, to_user, id
        ) as latencydata
    INNER JOIN dhruv.UsersOn503AndAbove_20171201_200k as required_users
    ON latencydata.to_user = required_users.user_id
    )
, MessageCTE_rdl AS 
    (
    SELECT dt, id, to_user
    FROM
        (
        SELECT DISTINCT dt, id, to_user 
        FROM dhruv.MessageLatencyInformation_20171210_20171125_to_20171130_02
        WHERE class = 'rdl' 
        AND family = 'stm'
        ) as latencydata 
    INNER JOIN dhruv.UsersOn503AndAbove_20171201_200k as required_users
    ON latencydata.to_user = required_users.user_id
    )
SELECT COUNT(DISTINCT to_user) AS Users 
, AVG(latency) AS AvgLatency 
, AVG(CASE WHEN latency > 0 THEN latency ELSE NULL END) AS AvgLatency_Positive 
, PERCENTILE(latency, 0.5) AS 50Percentile 
, PERCENTILE(latency, 0.75) AS 75Percentile 
, PERCENTILE(latency, 0.8) AS 80Percentile 
, PERCENTILE(latency, 0.9) AS 90Percentile 
, PERCENTILE(latency, 0.95) AS 95Percentile 
, PERCENTILE(latency, 0.99) AS 99Percentile
FROM
    (
    SELECT latency_pb.dt, latency_pb.to_user, (latency_dl.ts - latency_pb.ts) as latency
    FROM MessageCTE_pb as latency_pb
    INNER JOIN MessageCTE_rdl as latency_rdl
    ON latency_pb.dt = latency_rdl.dt and latency_pb.to_user = latency_rdl.to_user and latency_pb.id = latency_rdl.id
    INNER JOIN MessageCTE_dl as latency_dl
    ON latency_rdl.dt = latency_dl.dt and latency_rdl.to_user = latency_dl.to_user and latency_rdl.id = latency_dl.id) AS UserLatency;

Thanks!

Second query result: [image: Second Query Result]

Best answer

Another comment posted as an answer, so that I can include a bunch of SQL...

What does this return?

WITH
    UserLatency AS 
(
    SELECT
        latencydata.dt,
        latencydata.to_user,
        latencydata.id,
        MAX(CASE WHEN latencydata.class = 'dl' THEN latencydata.ts END)
        -
        MIN(CASE WHEN latencydata.class = 'pb' THEN latencydata.ts END)
            AS latency
    FROM
        dhruv.MessageLatencyInformation_20171210_20171125_to_20171130_02   AS latencydata
    INNER JOIN
        dhruv.UsersOn503AndAbove_20171201_200k                             AS required_users
            ON latencydata.to_user = required_users.user_id
    GROUP BY
        latencydata.dt,
        latencydata.to_user,
        latencydata.id
    HAVING
        0 < SUM(CASE WHEN latencydata.class  = 'rdl'
                      AND latencydata.family = 'stm' THEN 1 END)
)
SELECT
      COUNT(DISTINCT to_user)                       AS Users
    , AVG(latency)                                  AS AvgLatency 
    , AVG(CASE WHEN latency > 0 THEN latency END)   AS AvgLatency_Positive 
    , PERCENTILE(latency, 0.50)                     AS 50Percentile 
    , PERCENTILE(latency, 0.75)                     AS 75Percentile 
    , PERCENTILE(latency, 0.80)                     AS 80Percentile 
    , PERCENTILE(latency, 0.90)                     AS 90Percentile 
    , PERCENTILE(latency, 0.95)                     AS 95Percentile 
    , PERCENTILE(latency, 0.99)                     AS 99Percentile
FROM
    UserLatency
;
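
To see what the inner `UserLatency` step of the query above computes, here is a minimal sketch on made-up data. Everything in it is an assumption for illustration: SQLite (through Python's `sqlite3`) stands in for Hive, the tables `messages` and `required_users` are hypothetical stand-ins for the two `dhruv.*` tables, and the rows are invented. It only shows the conditional-aggregation pattern (one pass, `GROUP BY dt, to_user, id`, with `HAVING` requiring at least one `rdl`/`stm` row), not the percentile step.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (dt TEXT, id INTEGER, ts INTEGER,
                       family TEXT, class TEXT, to_user INTEGER);
CREATE TABLE required_users (user_id INTEGER);

INSERT INTO required_users VALUES (1), (2);

INSERT INTO messages VALUES
  ('2017-11-25', 10, 100, 'stm', 'pb',  1),
  ('2017-11-25', 10, 105, 'stm', 'pb',  1),  -- second pb row; MIN keeps ts = 100
  ('2017-11-25', 10, 120, 'stm', 'rdl', 1),
  ('2017-11-25', 10, 150, 'stm', 'dl',  1),
  ('2017-11-25', 11, 200, 'stm', 'pb',  2),  -- group has no rdl row
  ('2017-11-25', 11, 260, 'stm', 'dl',  2),
  ('2017-11-25', 12, 300, 'stm', 'pb',  3),  -- user 3 is not in required_users
  ('2017-11-25', 12, 310, 'stm', 'rdl', 3),
  ('2017-11-25', 12, 330, 'stm', 'dl',  3);
""")

# Same shape as the UserLatency CTE: latest 'dl' minus earliest 'pb' per group,
# keeping only groups that contain at least one rdl/stm row.
rows = conn.execute("""
SELECT m.dt, m.to_user, m.id,
       MAX(CASE WHEN m.class = 'dl' THEN m.ts END)
     - MIN(CASE WHEN m.class = 'pb' THEN m.ts END) AS latency
FROM messages AS m
INNER JOIN required_users AS u
        ON m.to_user = u.user_id
GROUP BY m.dt, m.to_user, m.id
HAVING 0 < SUM(CASE WHEN m.class = 'rdl' AND m.family = 'stm' THEN 1 END)
""").fetchall()

print(rows)  # [('2017-11-25', 1, 10, 50)]
```

In this toy data, the group for user 2 is dropped because its `SUM` is NULL (no `rdl` row), and user 3 is dropped by the inner join. Because the three message classes are combined inside one `GROUP BY` rather than through joins between separately built result sets, there are no join keys between CTEs that could fail to match.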

Regarding mysql - Join error when using CTEs in Hive, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47915689/
