sql - Getting multiple percentiles at once in Teradata (percentile_cont equivalent)

Tags: sql teradata analytic-functions

As I understand it, percentile_cont can be rewritten in Teradata as:

SELECT
  part_col
  ,data_col
   + ((MIN(data_col) OVER (PARTITION BY part_col ORDER BY data_col ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) - data_col)
       * (((COUNT(*) OVER (PARTITION BY part_col) - 1) * x) MOD 1)) AS percentile_cont
FROM tab
QUALIFY ROW_NUMBER() OVER (PARTITION BY part_col ORDER BY data_col)
    = CAST((COUNT(*) OVER (PARTITION BY part_col) - 1) * x AS INT) + 1;

See this very helpful discussion for more information.

Given that replacing x with 0.90 returns the 90th percentile, is there an elegant way to extend this and return multiple percentiles at once?

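For example, suppose I want to extend this example and return the 25th, 50th, and 75th percentiles at once. Is that possible? It seems I would need multiple QUALIFY clauses? Likewise, if I need the equivalent of a GROUP BY on several columns, is that just a matter of passing more columns to PARTITION BY?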

-- SQL:2008 Equivalent pseudo-code
SELECT
  part_col_a
 ,part_col_b
 ,PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY order_col) AS p25
 ,PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY order_col) AS p50
 ,PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY order_col) AS p75
FROM tab
GROUP BY
  part_col_a
  ,part_col_b
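
To check the arithmetic behind the single-percentile rewrite by hand, the following is a minimal Python sketch of the same calculation for one partition. It is an illustration only, not Teradata code: the function name `percentile_cont` and the sample values are assumptions made for this example. One deliberate deviation: when the position lands exactly on a row, the sketch returns that row's value, whereas the SQL expression yields NULL on the last row of a partition because there is no following row.

```python
def percentile_cont(values, x):
    """Linearly interpolated percentile for one partition (0 <= x <= 1)."""
    ordered = sorted(values)      # ORDER BY data_col
    n = len(ordered) - 1          # COUNT(*) OVER (...) - 1
    pos = n * x                   # fractional, zero-based row position
    rowno = int(pos)              # CAST(... AS INT) truncates toward zero
    frac = pos - rowno            # (... MOD 1), the fractional part
    if frac == 0:
        # Exact hit on a row: no interpolation needed (assumed behaviour,
        # the SQL version returns NULL here for the last row).
        return float(ordered[rowno])
    # Difference to the next row, as in ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING
    diff = ordered[rowno + 1] - ordered[rowno]
    return ordered[rowno] + frac * diff


median = percentile_cont([1, 2, 3, 4], 0.50)          # position 1.5, between 2 and 3
p90 = percentile_cont([10, 20, 30, 40, 50], 0.90)     # position 3.6, between 40 and 50
```

With four rows, N is 3, so x = 0.50 gives position 1.5: the row at index 1 is selected and half of the gap to the next row is added.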

Best answer

You should have read my blog post all the way to the end; the final query there is exactly what you want :-)

SELECT part_col
  ,MIN(pc25) OVER (PARTITION BY part_col) AS quartile_1
  ,MIN(pc50) OVER (PARTITION BY part_col) AS quartile_2
  ,MIN(pc75) OVER (PARTITION BY part_col) AS quartile_3
FROM
 (
  SELECT
    part_col
    ,COUNT(*)      OVER (PARTITION BY part_col) - 1 AS N
    ,ROW_NUMBER()  OVER (PARTITION BY part_col ORDER BY data_col) - 1 AS rowno
    ,MIN(data_col) OVER (PARTITION BY part_col ORDER BY data_col ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) - data_col AS diff

    ,CASE
       WHEN rowno = CAST(N * 0.25 AS INT)
       THEN data_col +(((N * 0.25) MOD 1) * diff)
     END AS pc25

    ,CASE
       WHEN rowno = CAST(N * 0.50 AS INT)
       THEN data_col +(((N * 0.50) MOD 1) * diff)
     END AS pc50

    ,CASE
       WHEN rowno = CAST(N * 0.75 AS INT)
       THEN data_col +(((N * 0.75) MOD 1) * diff)
     END AS pc75
  FROM tab
  QUALIFY   rowno = CAST(N * 0.25 AS INT)
       OR   rowno = CAST(N * 0.50 AS INT)
       OR   rowno = CAST(N * 0.75 AS INT)
 ) AS dt
QUALIFY ROW_NUMBER() OVER (PARTITION BY part_col ORDER BY part_col) = 1
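
The query above computes all three quartiles in one pass by evaluating one CASE per percentile and then collapsing each partition to a single row. As a rough cross-check of that logic, here is a Python sketch that does the same per group. It is not the answer author's code: `grouped_percentiles`, the sample rows, and the two-part group keys are hypothetical. The tuple key stands in for grouping on several columns, which in the SQL version would mean listing those columns in every PARTITION BY.

```python
from collections import defaultdict


def grouped_percentiles(rows, fractions=(0.25, 0.50, 0.75)):
    """rows: iterable of (group_key, value). Returns {group_key: {fraction: value}}."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)          # PARTITION BY the (possibly composite) key

    result = {}
    for key, values in groups.items():
        ordered = sorted(values)           # ORDER BY data_col within the partition
        n = len(ordered) - 1               # N = COUNT(*) - 1
        per_group = {}
        for x in fractions:
            rowno = int(n * x)             # CAST(N * x AS INT)
            frac = n * x - rowno           # (N * x) MOD 1
            value = float(ordered[rowno])
            if frac > 0:                   # a following row exists whenever frac > 0
                value += frac * (ordered[rowno + 1] - ordered[rowno])
            per_group[x] = value
        result[key] = per_group
    return result


# Hypothetical sample data keyed by two grouping columns.
rows = [
    (("north", "a"), 10), (("north", "a"), 20), (("north", "a"), 30),
    (("north", "a"), 40), (("north", "a"), 50),
    (("south", "b"), 1), (("south", "b"), 2),
    (("south", "b"), 3), (("south", "b"), 4),
]
quartiles = grouped_percentiles(rows)
```

Comparing this output with the query's result on a small table is a quick way to confirm that the row selection and interpolation agree.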

Regarding sql - Getting multiple percentiles at once in Teradata (percentile_cont equivalent), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35733463/
