sql - PostgreSQL 10 - Inexplicable performance behaviour of IN and ANY

Tags: sql postgresql

I am selecting from a large table where the id is in an array/list. I checked several variants and the results surprised me.

1. Using ANY and ARRAY

EXPLAIN (ANALYZE,BUFFERS)
SELECT * FROM cca_data_hours
    WHERE
    datetime = '2018-01-07 19:00:00'::timestamp without time zone AND
    id_web_page = ANY (ARRAY[1, 2, 8, 3 /* ~50k ids */])

Result

"Index Scan using cca_data_hours_pri on cca_data_hours  (cost=0.28..576.79 rows=15 width=188) (actual time=0.035..0.998 rows=6 loops=1)"
"  Index Cond: (datetime = '2018-01-07 19:00:00'::timestamp without time zone)"
"  Filter: (id_web_page = ANY ('{1,2,8,3, (...)"
" Rows Removed by Filter: 5"
"  Buffers: shared hit=3"
"Planning time: 57.625 ms"
"Execution time: 1.065 ms"

2. Using IN and VALUES

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM cca_data_hours
    WHERE
    datetime = '2018-01-07 19:00:00'::timestamp without time zone AND
    id_web_page IN (VALUES (1),(2),(8),(3) /* ~50k ids */)

Result

"Hash Join  (cost=439.77..472.66 rows=8 width=188) (actual time=90.806..90.858 rows=6 loops=1)"
"  Hash Cond: (cca_data_hours.id_web_page = "*VALUES*".column1)"
"  Buffers: shared hit=3"
"  ->  Index Scan using cca_data_hours_pri on cca_data_hours  (cost=0.28..33.06 rows=15 width=188) (actual time=0.035..0.060 rows=11 loops=1)"
"        Index Cond: (datetime = '2018-01-07 19:00:00'::timestamp without time zone)"
"        Buffers: shared hit=3"
"  ->  Hash  (cost=436.99..436.99 rows=200 width=4) (actual time=90.742..90.742 rows=4 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"        ->  HashAggregate  (cost=434.99..436.99 rows=200 width=4) (actual time=90.709..90.717 rows=4 loops=1)"
"              Group Key: "*VALUES*".column1"
"              ->  Values Scan on "*VALUES*"  (cost=0.00..362.49 rows=28999 width=4) (actual time=0.008..47.056 rows=28999 loops=1)"
"Planning time: 53.607 ms"
"Execution time: 91.681 ms"

I expected case #2 to be faster, but it is not. Why is IN with VALUES slow?

Best answer

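Comparing the EXPLAIN ANALYZE results, it looks like the old version was not using the available index on the key in the given examples. The reason ANY (ARRAY[]) became faster is a change made in version 9.2: https://www.postgresql.org/docs/current/static/release-9-2.html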

Allow indexed_col op ANY(ARRAY[...]) conditions to be used in plain index scans and index-only scans (Tom Lane)

The site you took that advice from was written about version 9.0.
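
To see the behaviour described in that release note on your own server, a minimal sketch along the following lines may help. The table demo_pages and its columns are hypothetical and exist only for this illustration; they are not the asker's schema. Whether the planner actually puts the ANY condition into the Index Cond depends on the server version, the statistics and the available indexes, so treat the plan you get as something to inspect rather than a guaranteed outcome.

```sql
-- Check which server version is running; the release note quoted above
-- concerns 9.2 and later.
SHOW server_version;

-- Hypothetical demo table (not the table from the question).
CREATE TEMP TABLE demo_pages (
    id_web_page integer PRIMARY KEY,
    payload     text
);

-- Fill it with enough rows that an index scan is worth considering.
INSERT INTO demo_pages (id_web_page, payload)
SELECT g, 'row ' || g
FROM generate_series(1, 100000) AS g;

-- Refresh planner statistics for the new table.
ANALYZE demo_pages;

-- indexed_col = ANY (ARRAY[...]) on the indexed column.
EXPLAIN
SELECT *
FROM demo_pages
WHERE id_web_page = ANY (ARRAY[1, 2, 8, 3]);
```

Running the same EXPLAIN against cca_data_hours, with the real id list, shows how the planner treats the condition for the actual data.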

Source: this question on Stack Overflow: https://stackoverflow.com/questions/48317609/
