sql - How can I optimize a SQL query with subqueries (perhaps using a lateral join)?

Tags: sql postgresql optimization postgis lateral-join

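I'm trying to optimize a complex SQL query that will be executed every time the map's bounding box changes. I expected INNER JOIN LATERAL to be the fastest option, but it isn't. Does anyone know how to speed up this query, and how to make better use of LATERAL JOIN?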

The fastest query I've come up with so far:

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
INNER JOIN (SELECT DISTINCT unnest(s0."rels") AS "rel" 
            FROM "hiking"."segments" AS s0 
            WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s2 ON TRUE 
WHERE (s2."rel" = h1."child");

Planning time: ~0.605 ms Execution time: ~37.232 ms

Essentially the same as above, but with a LATERAL JOIN. Is it really slower?

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
INNER JOIN LATERAL (SELECT DISTINCT unnest(s0."rels") AS "rel" 
                    FROM "hiking"."segments" AS s0 
                    WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s2 ON TRUE 
WHERE (s2."rel" = h1."child");

Planning time: ~1.353 ms Execution time: ~38.518 ms

The slowest query, with a subquery inside a subquery (this was my first attempt, which I have since improved on):

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN (SELECT DISTINCT h0."parent" AS "parent" 
            FROM "hiking"."hierarchy" AS h0 
            INNER JOIN (SELECT DISTINCT unnest(s0."rels") AS "rel" 
                        FROM "hiking"."segments" AS s0 
                        WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s1 ON TRUE 
            WHERE (h0."child" = s1."rel")) AS s1 ON TRUE 
WHERE (r0."top" AND (r0."id" = s1."parent"));

Planning time: ~1.017 ms Execution time: ~41.288 ms

Best answer

It's hard to reproduce the logic of the query without knowing the database, but I'll try, so bear with me:

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
WHERE 
  EXISTS (
    SELECT 1
    FROM "hiking"."segments" AS s0 
    WHERE (
      ST_Intersects(
        s0."geom",
        ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)),
        3857)))
      AND array[h1."child"] <@ s0."rels");
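
Since the question asks about LATERAL specifically: in the question's version the subquery never references r0 or h1, so LATERAL has nothing to correlate, which would explain why its timing is practically the same as the plain join. The sketch below shows a LATERAL subquery that does reference the outer row and is intended to behave like the EXISTS query above. It makes two assumptions: that ST_MakeEnvelope with the coordinates sorted into min/max order describes the same box as the ST_SetSrid(ST_MakeBox2D(...)) expression, and that the element type of "rels" matches the type of "child". The alias "found" is only there for illustration.

```sql
-- Sketch only: a LATERAL subquery that actually references h1.
SELECT r0."id", r0."name"
FROM "hiking"."routes" AS r0
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent"
INNER JOIN LATERAL (
    SELECT 1 AS found
    FROM "hiking"."segments" AS s0
    WHERE ST_Intersects(
              s0."geom",
              -- Assumed equivalent of the ST_SetSrid(ST_MakeBox2D(...)) box:
              -- ST_MakeEnvelope(xmin, ymin, xmax, ymax, srid)
              ST_MakeEnvelope(1285982.015631, 6454022.524275,
                              2371999.313507, 7217169.814674, 3857))
      AND array[h1."child"] <@ s0."rels"   -- correlation with the outer row
    LIMIT 1                                 -- one match is enough, as with EXISTS
) AS s2 ON TRUE;
```

Whether this beats the EXISTS form depends on the data and the available indexes, so it is worth comparing both with EXPLAIN ANALYZE.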

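Two points:

  1. Filtering data with EXISTS / NOT EXISTS is sometimes faster than filtering it through a join.
  2. Instead of unnesting the array column to compare its elements against a value, you can use the array comparison operators. With a suitable GIN index this will be faster (see the PostgreSQL documentation).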
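
Applied to the tables from the question, the indexes might look roughly like the following. This is a sketch under assumptions: the index names are invented, "rels" is assumed to be an array column and "geom" a PostGIS geometry column, and the question does not say which indexes already exist.

```sql
-- Hypothetical index names; adjust to your schema.

-- GIN index so that array containment (<@, @>) on "rels" can use an index.
CREATE INDEX IF NOT EXISTS segments_rels_gin_idx
    ON "hiking"."segments" USING gin ("rels");

-- GiST index so that ST_Intersects can prefilter by bounding box.
CREATE INDEX IF NOT EXISTS segments_geom_gist_idx
    ON "hiking"."segments" USING gist ("geom");

-- Refresh planner statistics after creating the indexes.
ANALYZE "hiking"."segments";
```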

Here is a simple example of how to use an index on an array column, and how fast it is:

create table foo(bar int[]);
insert into foo(bar) select array[1,2,3,x] from generate_series(1,1000000) as x;
create index idx on foo using gin (bar); -- Note this
select * from foo where 666 in (select unnest(bar)); -- 6936.345 ms on my HW
select * from foo where array[666] <@ bar; -- 45.524 ms
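
To check whether the GIN index is actually being used, one can inspect the plan. The statements below are a sketch against the same foo table; the exact plan and timings will depend on the PostgreSQL version, settings, and data.

```sql
-- bar @> array[666] is the same containment test as array[666] <@ bar,
-- and the GIN index on bar supports it as well.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM foo WHERE bar @> array[666];

-- By contrast, = ANY over the array column is not one of the operators
-- the default GIN array operator class handles, so expect a sequential scan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM foo WHERE 666 = ANY (bar);
```

If the first plan shows a bitmap scan on idx and the second a sequential scan, the difference in execution time comes from the index, not from the way the query is phrased.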

Regarding "sql - How can I optimize a SQL query with subqueries (perhaps using a lateral join)?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48362243/
