sql - Query with a small LIMIT and few results is 1000x slower than with LIMIT 100 or more

Tags: sql postgresql postgresql-performance

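I am trying to debug a query that runs faster the more records it returns, but whose performance degrades severely (>10x slower) when it returns a small result set (i.e. <10 rows) with a small LIMIT (i.e. 10).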

Example:

Fast query, 5 results out of 1 million rows - no LIMIT

SELECT *
FROM transaction_internal_by_addresses
WHERE address = 'foo'
ORDER BY block_number desc;

EXPLAIN output:

Sort  (cost=7733.14..7749.31 rows=6468 width=126) (actual time=0.030..0.031 rows=5 loops=1)
  Output: address, block_number, log_index, transaction_hash
  Sort Key: transaction_internal_by_addresses.block_number
  Sort Method: quicksort  Memory: 26kB
  Buffers: shared hit=10
  ->  Index Scan using transaction_internal_by_addresses_pkey on public.transaction_internal_by_addresses  (cost=0.69..7323.75 rows=6468 width=126) (actual time=0.018..0.021 rows=5 loops=1)
        Output: address, block_number, log_index, transaction_hash
        Index Cond: (transaction_internal_by_addresses.address = 'foo'::text)
        Buffers: shared hit=10
Query Identifier: -8912211611755432198
Planning Time: 0.051 ms
Execution Time: 0.041 ms

Fast query, 5 results out of 1 million rows - high LIMIT

SELECT *
FROM transaction_internal_by_addresses
WHERE address = 'foo'
ORDER BY block_number desc
LIMIT 100;

EXPLAIN output:

Limit  (cost=7570.95..7571.20 rows=100 width=126) (actual time=0.024..0.025 rows=5 loops=1)
  Output: address, block_number, log_index, transaction_hash
  Buffers: shared hit=10
  ->  Sort  (cost=7570.95..7587.12 rows=6468 width=126) (actual time=0.023..0.024 rows=5 loops=1)
        Output: address, block_number, log_index, transaction_hash
        Sort Key: transaction_internal_by_addresses.block_number DESC
        Sort Method: quicksort  Memory: 26kB
        Buffers: shared hit=10
        ->  Index Scan using transaction_internal_by_addresses_pkey on public.transaction_internal_by_addresses  (cost=0.69..7323.75 rows=6468 width=126) (actual time=0.016..0.020 rows=5 loops=1)
              Output: address, block_number, log_index, transaction_hash
              Index Cond: (transaction_internal_by_addresses.address = 'foo'::text)
              Buffers: shared hit=10
Query Identifier: 3421253327669991203
Planning Time: 0.042 ms
Execution Time: 0.034 ms

Slow query - low LIMIT

SELECT *
FROM transaction_internal_by_addresses
WHERE address = 'foo'
ORDER BY block_number desc
LIMIT 10;

EXPLAIN output:

Limit  (cost=1000.63..6133.94 rows=10 width=126) (actual time=10277.845..11861.269 rows=0 loops=1)
  Output: address, block_number, log_index, transaction_hash
  Buffers: shared hit=56313576
  ->  Gather Merge  (cost=1000.63..3333036.90 rows=6491 width=126) (actual time=10277.844..11861.266 rows=0 loops=1)
        Output: address, block_number, log_index, transaction_hash
        Workers Planned: 4
        Workers Launched: 4
        Buffers: shared hit=56313576
        ->  Parallel Index Scan Backward using transaction_internal_by_address_idx_block_number on public.transaction_internal_by_addresses  (cost=0.57..3331263.70 rows=1623 width=126) (actual time=10256.995..10256.995 rows=0 loops=5)
              Output: address, block_number, log_index, transaction_hash
              Filter: (transaction_internal_by_addresses.address = 'foo'::text)
              Rows Removed by Filter: 18485480
              Buffers: shared hit=56313576
              Worker 0:  actual time=10251.822..10251.823 rows=0 loops=1
                Buffers: shared hit=11387166
              Worker 1:  actual time=10250.971..10250.972 rows=0 loops=1
                Buffers: shared hit=10215941
              Worker 2:  actual time=10252.269..10252.269 rows=0 loops=1
                Buffers: shared hit=10191990
              Worker 3:  actual time=10252.513..10252.514 rows=0 loops=1
                Buffers: shared hit=10238279
Query Identifier: 2050754902087402293
Planning Time: 0.081 ms
Execution Time: 11861.297 ms

DDL

create table transaction_internal_by_addresses
(
    address          text   not null,
    block_number     bigint,
    log_index        bigint not null,
    transaction_hash text   not null,
    primary key (address, log_index, transaction_hash)
);

alter table transaction_internal_by_addresses
    owner to "icon-worker";

create index transaction_internal_by_address_idx_block_number
    on transaction_internal_by_addresses (block_number);

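So my questions:

  • Should I be looking at ways to force the query planner to apply the WHERE on address (the leading primary key column) first?
  • As you can see in the EXPLAIN output, the slow query scans the block_number index, but I am not sure why. Can anyone explain?
  • Is this normal? I would expect a query to get harder the more data it has to return, not the other way around as in this case.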

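Update

  • Apologies for (A) the delayed reply and (B) some inconsistencies in the question.
  • I have updated the EXPLAIN output, which now clearly shows the 1000x performance degradation.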

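Best answer

A multicolumn BTREE index on (address, block_number DESC) is exactly what the query planner needs to produce the result set you describe. It random-accesses the index to the first eligible row, then reads the rows in order until it reaches the LIMIT. You can also omit the DESC with no ill effects, because PostgreSQL can scan a BTREE index backward.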

create index address_block_number
on transaction_internal_by_addresses
 (address, block_number DESC);
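
To check whether the planner actually picks up the new index, the slow query from the question can be re-run under EXPLAIN. This is a minimal sketch, assuming the address_block_number index above has already been built; no expected plan is shown because the outcome depends on the table's data and statistics.

```sql
-- Refresh planner statistics so the new index is costed with current data.
ANALYZE transaction_internal_by_addresses;

-- Re-run the slow LIMIT 10 query from the question and inspect the plan.
-- The hope (not a guarantee) is an index scan on address_block_number
-- with an Index Cond on address instead of a Filter on address.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM transaction_internal_by_addresses
WHERE address = 'foo'
ORDER BY block_number DESC
LIMIT 10;
```

On a busy table, CREATE INDEX CONCURRENTLY builds the same index without blocking writes, at the cost of a slower build.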

As for asking "why" about query planner results, that is often an enduring mystery.
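
One possible reading of the plans in the question, offered as a guess rather than a confirmed diagnosis: the planner estimates rows=6468 for address = 'foo', while the scans that use the primary key return only 5 rows. With an overestimate like that, walking the block_number index backward and stopping after 10 matches looks cheap on paper, yet the slow plan reports more than 18 million rows removed by the filter. The sketch below shows how the column statistics behind that estimate could be inspected and refined; the statistics target of 1000 is an illustrative value, not a recommendation from the answer.

```sql
-- Look at what the planner believes about the address column.
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename = 'transaction_internal_by_addresses'
  AND attname = 'address';

-- Collect a larger sample for this column (1000 is an assumed value),
-- then refresh the statistics so later plans use the new sample.
ALTER TABLE transaction_internal_by_addresses
    ALTER COLUMN address SET STATISTICS 1000;
ANALYZE transaction_internal_by_addresses;
```

Better statistics may or may not change the plan on their own; the multicolumn index above removes the need for the planner to choose between the two single-purpose indexes.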

This question about a small-LIMIT query running 1000x slower than the same query with LIMIT 100 or more comes from Stack Overflow: https://stackoverflow.com/questions/74480227/