I have the query below, which takes about 15-20 seconds to run.
with cte0 as (
    SELECT label, date,
           CASE WHEN Lead(label || date || "number")
                     OVER (PARTITION BY label || date || "number"
                           ORDER BY "label", "date", "number", "time") IS NULL
                THEN '1'::numeric
                ELSE '0'::numeric
           END As "unique"
    FROM table_data
    LEFT JOIN table_mapper ON table_mapper."type" = table_data."type"
    WHERE Date BETWEEN date_trunc('month', current_date - 1) and current_date - 1
)
SELECT 'MTD' as "label", round(sum("unique") / count("unique") *100,1) as "value"
FROM cte0
WHERE "date" BETWEEN date_trunc('month', current_date - 1) AND current_date -1
UNION ALL
SELECT 'Week' as "label", round(sum("unique") / count("unique") *100,1) as "value"
FROM cte0
WHERE "date" BETWEEN date_trunc('week', current_date - 1) AND current_date -1
UNION ALL
SELECT 'FTD' as "label", round(sum("unique") / count("unique") *100,1) as "value"
FROM cte0
WHERE "date" = current_date -1
On table table_data I have an index on the date column.
CREATE INDEX ix_cli_date ON table_data USING btree (date);
Table definition (\d table_data)
              Table "public.table_data"
      Column      |          Type          | Modifiers
------------------+------------------------+-----------
 date             | date                   | not null
 number           | bigint                 | not null
 time             | time without time zone | not null
 end time         | time without time zone | not null
 duration         | integer                | not null
 time1            | integer                | not null
 time2            | integer                | not null
 time3            | integer                | not null
 time4            | integer                | not null
 time5            | integer                | not null
 time6            | integer                | not null
 time7            | integer                | not null
 type             | text                   | not null
 name             | text                   | not null
 id1              | integer                | not null
 id2              | integer                | not null
 key              | integer                | not null
 status           | text                   | not null
Indexes:
    "ix_cli_date" btree (date)
Table definition (\d table_mapper)
  Table "public.table_mapper"
   Column   | Type | Modifiers
------------+------+-----------
 type       | text | not null
 label      | text | not null
 label2     | text | not null
 label3     | text | not null
 label4     | text | not null
 label5     | text | not null
EXPLAIN ANALYZE output for the query
Result  (cost=184342.66..230332.86 rows=3 width=64) (actual time=23377.923..25695.478 rows=3 loops=1)
  CTE cte0
    ->  WindowAgg  (cost=121516.06..156751.65 rows=612793 width=23) (actual time=14578.000..18985.958 rows=696157 loops=1)
          ->  Sort  (cost=121516.06..123048.04 rows=612793 width=23) (actual time=14577.975..17084.405 rows=696157 loops=1)
                Sort Key: (((table_mapper.label || (table_data.date)::text) || (table_data."number")::text)), table_mapper.label, table_data.date, table_data."number", table_data."time"
                Sort Method: external merge  Disk: 39480kB
                ->  Hash Left Join  (cost=11.96..37474.21 rows=612793 width=23) (actual time=1.449..3308.718 rows=696157 loops=1)
                      Hash Cond: (table_data."type" = table_mapper."type")
                      ->  Index Scan using ix_cli_date on table_data  (cost=0.02..29036.36 rows=612793 width=38) (actual time=0.141..946.648 rows=696157 loops=1)
                            Index Cond: ((date >= date_trunc('month'::text, ((('now'::text)::date - 1))::timestamp with time zone)) AND (date [...]
                      ->  Hash  (cost=7.53..7.53 rows=353 width=25) (actual time=1.275..1.275 rows=336 loops=1)
                            Buckets: 1024  Batches: 1  Memory Usage: 15kB
                            ->  Seq Scan on table_mapper  (cost=0.00..7.53 rows=353 width=25) (actual time=0.020..0.589 rows=336 loops=1)
  ->  Append  (cost=27591.00..73581.21 rows=3 width=64) (actual time=23377.920..25695.467 rows=3 loops=1)
        ->  Aggregate  (cost=27591.00..27591.02 rows=1 width=32) (actual time=23377.917..23377.918 rows=1 loops=1)
              ->  CTE Scan on cte0  (cost=0.00..27575.68 rows=3064 width=32) (actual time=14578.052..22335.236 rows=696157 loops=1)
                    Filter: ((date = date_trunc('month'::text, ((('now'::text)::date - 1))::timestamp with time zone)))
        ->  Aggregate  (cost=27591.00..27591.02 rows=1 width=32) (actual time=1741.509..1741.510 rows=1 loops=1)
              ->  CTE Scan on cte0  (cost=0.00..27575.68 rows=3064 width=32) (actual time=20.009..1522.352 rows=168261 loops=1)
                    Filter: ((date = date_trunc('week'::text, ((('now'::text)::date - 1))::timestamp with time zone)))
        ->  Aggregate  (cost=18399.11..18399.13 rows=1 width=32) (actual time=576.029..576.030 rows=1 loops=1)
              ->  CTE Scan on cte0  (cost=0.00..18383.79 rows=3064 width=32) (actual time=9.308..546.735 rows=23486 loops=1)
                    Filter: (date = (('now'::text)::date - 1))
Total runtime: 25710.506 ms
Description:
I am taking the unique count and the repeated count from table_data. That is what LEAD helps me with: I assign the value 0 to the last repeated value of a column.
Suppose I have 3 x values in a column. I assign the value 1 to the first 2 x, and the third x gets the value 0.
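In effect, the cte takes all rows from table_data within the defined date range and runs a calculation using lead over the concatenated strings, so that each row gets a 1 or a 0 according to the criteria.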
If lead is null the row counts as 1; if it is not null the row counts as 0.
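To make the rule above concrete, here is a minimal sketch on made-up data. The names k, t, and next_k are hypothetical stand-ins (k for the concatenated label, date, and number; t for the time column), and the CASE expression mirrors the one in the query rather than reproducing it exactly.

```sql
-- Hypothetical sample rows: three with key 'x', one with key 'y'.
SELECT k, t,
       lead(k) OVER (PARTITION BY k ORDER BY t) AS next_k,
       CASE WHEN lead(k) OVER (PARTITION BY k ORDER BY t) IS NULL
            THEN 1 ELSE 0 END AS "unique"
FROM  (VALUES ('x', time '10:00')
            , ('x', time '10:05')
            , ('x', time '10:10')
            , ('y', time '11:00')) AS s(k, t)
ORDER  BY k, t;
-- The first two 'x' rows have a following row in their partition, so next_k is 'x' and "unique" is 0.
-- The last 'x' row and the single 'y' row have no following row, so next_k is NULL and "unique" is 1.
```

With the CASE written this way, it is the last row of each group that receives 1, and the earlier duplicates receive 0.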
Then I return 3 rows, MTD, Current Week and FTD, and for each I compute the sum() of the values obtained from lead and the count(*) of all rows.
For MTD I have the sum and count for the current month.
For Week it is the current week, and FTD is yesterday.
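The three date ranges can be inspected directly. This is only a sketch for checking the boundaries; the column aliases mtd_start, week_start, and ftd are made up for illustration.

```sql
-- Boundaries used by the three result rows, all relative to yesterday.
SELECT date_trunc('month', current_date - 1)::date AS mtd_start   -- first day of yesterday's month
     , date_trunc('week',  current_date - 1)::date AS week_start  -- Monday of yesterday's week
     , current_date - 1                            AS ftd;        -- yesterday
```

Note that date_trunc('week', ...) truncates to Monday. Early in a month that Monday can fall in the previous month, while the CTE only contains rows from the start of the month onward, so the Week row then covers fewer than the full week's days.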
Best answer
WITH cte AS (
   SELECT d.thedate
        , lead(m.label) OVER (PARTITION BY m.label, d.thedate, d.number
                              ORDER BY d.thetime) AS leader
   FROM   table_data d
   LEFT   JOIN table_mapper m USING (type)
   WHERE  thedate BETWEEN date_trunc('month', current_date - 1)
                      AND current_date - 1
   )
SELECT 'MTD' AS label, round(count(leader)::numeric / count(*) * 100, 1) AS val
FROM   cte
UNION ALL
SELECT 'Week', round(count(leader)::numeric / count(*) * 100, 1)
FROM   cte
WHERE  thedate BETWEEN date_trunc('week', current_date - 1) AND current_date - 1
UNION ALL
SELECT 'FTD', round(count(leader)::numeric / count(*) * 100, 1)
FROM   cte
WHERE  thedate = current_date - 1;
The CTE makes sense for big tables, so that you only have to scan the table once. For smaller tables it may be faster without ...
Using thedate instead of date, which is a reserved word in standard SQL.
Likewise thetime and uni instead of time and unique, and so on.
Simplified the lead() call. You get either a value from the leading row or NULL. That seems to be the only relevant information.
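Repeating the columns of the PARTITION BY clause in the ORDER BY clause of the window function is a pointless waste: within a partition those columns are constant, so they cannot affect the order.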
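A small sketch on made-up rows illustrates the point; k stands in for the partition columns and t for the time column, and both names are hypothetical.

```sql
-- Both window definitions return the same value for every row.
SELECT k, t
     , lead(t) OVER (PARTITION BY k ORDER BY k, t) AS next_verbose  -- k repeated in ORDER BY
     , lead(t) OVER (PARTITION BY k ORDER BY t)    AS next_lean     -- k omitted
FROM  (VALUES ('x', 1), ('x', 2), ('y', 1)) AS s(k, t)
ORDER  BY k, t;
-- ('x', 1) yields 2 in both columns; ('x', 2) and ('y', 1) yield NULL in both.
```

The shorter form also gives the sort a smaller key, which may matter for a sort that spills to disk as in the plan above.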
Building on this, count(leader)/count(*) is a bit faster than sum(uni)/count(uni). count(column) only counts non-null values, while count(*) counts all rows.
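The difference between the two counts can be seen on three made-up values, one of them NULL. The subquery alias s and the output aliases are hypothetical.

```sql
-- count(leader) skips the NULL; count(*) counts every row.
SELECT count(leader)                                     AS non_null  -- 2
     , count(*)                                          AS all_rows  -- 3
     , round(count(leader)::numeric / count(*) * 100, 1) AS pct       -- 66.7
FROM  (VALUES ('x'), ('x'), (NULL)) AS s(leader);
```

The cast to numeric is needed because count() returns bigint, and bigint division would truncate 2 / 3 to 0.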
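The condition on the first leg of the UNION query is redundant, because the CTE is already restricted to the same date range.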
More advice and links concerning the data definition are in the comments on the question.
Table design / indexes
You should have primary keys. I suggest a serial or IDENTITY column as surrogate PK for table_data:
ALTER TABLE table_data ADD COLUMN table_data_id serial PRIMARY KEY;
Make type the primary key of table_mapper (also needed for the FK constraint that follows):
ALTER TABLE table_mapper ADD CONSTRAINT table_mapper_pkey PRIMARY KEY (type);
Add a foreign key constraint on type to enforce referential integrity. Something like:
ALTER TABLE table_data ADD CONSTRAINT table_data_type_fkey
FOREIGN KEY (type) REFERENCES table_mapper (type)
ON UPDATE CASCADE ON DELETE NO ACTION;
For ultimate read performance (at some cost to writes), add a multicolumn index, which may allow index-only scans for the query above:
CREATE INDEX table_data_foo_idx ON table_data (thedate, number, thetime);
About sql - PostgreSQL - query optimization: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23339269/