Right now, if I want to get the deciles of a value, I do this:
-- APPROX_QUANTILES(value, 100) returns 101 elements (min, p1, ..., p99, max),
-- so the zero-based SAFE_OFFSET(k) is the k-th percentile; SAFE_ORDINAL(k) would be off by one.
SELECT
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(10)] as p10,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(20)] as p20,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(30)] as p30,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(40)] as p40,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(50)] as p50,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(60)] as p60,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(70)] as p70,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(80)] as p80,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(90)] as p90,
APPROX_QUANTILES(value, 100)[SAFE_OFFSET(100)] as p100
FROM table
I want to make sure this doesn't make BigQuery do 10x the work. Is there also a more compact way to write this?
Best answer
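If you run the query and then inspect the execution plan, you will see that BigQuery computes the quantiles only once and then extracts the individual array elements in a second step. You don't need to worry about deduplicating the APPROX_QUANTILES aggregation yourself.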
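
On the second part of the question (a more compact form), one possible sketch is to compute the array once in a CTE and index into it by name. The names `my_dataset.my_table`, `q`, and `deciles` are placeholders, not from the original post. This version asks for 10 buckets instead of 100, so the array has 11 elements (minimum, nine deciles, maximum) and the zero-based offset k corresponds to the k*10th percentile. Because the function is approximate, the values may differ slightly from those obtained with 100 buckets.

```sql
-- Sketch only: table and column names are hypothetical placeholders.
WITH q AS (
  -- One aggregation; returns an 11-element array: [min, p10, ..., p90, max].
  SELECT APPROX_QUANTILES(value, 10) AS deciles
  FROM my_dataset.my_table
)
SELECT
  deciles[SAFE_OFFSET(1)]  AS p10,
  deciles[SAFE_OFFSET(2)]  AS p20,
  deciles[SAFE_OFFSET(3)]  AS p30,
  deciles[SAFE_OFFSET(4)]  AS p40,
  deciles[SAFE_OFFSET(5)]  AS p50,
  deciles[SAFE_OFFSET(6)]  AS p60,
  deciles[SAFE_OFFSET(7)]  AS p70,
  deciles[SAFE_OFFSET(8)]  AS p80,
  deciles[SAFE_OFFSET(9)]  AS p90,
  deciles[SAFE_OFFSET(10)] AS p100  -- last element is the maximum
FROM q;
```

This mainly saves typing and makes the single aggregation explicit. Per the answer above, the repeated form should already cost about the same to run.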
Source: a similar question on Stack Overflow, "quantile - Efficient use of BigQuery's APPROX_QUANTILES": https://stackoverflow.com/questions/50959540/