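I have a table with 1M records, of which 100k are NULL in colA. The remaining records have highly distinct values. Is there any difference between creating a regular index on this column and creating a partial index with where colA is not null?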

Since regular Postgres indexes do not store NULL values, wouldn't that be the same as creating a partial index with where colA is not null?



Here is a test on 13.5 using a full index.

# create index idx_test_num on test(num);

# explain select count(*) from test where num is null;
                                     QUERY PLAN                                      
 Aggregate  (cost=5135.00..5135.01 rows=1 width=8)
   ->  Bitmap Heap Scan on test  (cost=63.05..5121.25 rows=5500 width=0)
         Recheck Cond: (num IS NULL)
         ->  Bitmap Index Scan on idx_test_num  (cost=0.00..61.68 rows=5500 width=0)
               Index Cond: (num IS NULL)
(5 rows)

And here is the same query after replacing the full index with a partial index.


# create index idx_test_num on test(num) where num is not null;

# explain select count(*) from test where num is null;
                                      QUERY PLAN                                      
 Finalize Aggregate  (cost=10458.12..10458.13 rows=1 width=8)
   ->  Gather  (cost=10457.90..10458.11 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=9457.90..9457.91 rows=1 width=8)
               ->  Parallel Seq Scan on test  (cost=0.00..9352.33 rows=42228 width=0)
                     Filter: (num IS NULL)
(6 rows)
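
To reproduce the comparison above, something like the following sketch should work. The table definition and the generate_series population are assumptions (the original setup is not shown); only the index definitions and the explain query come from the test above. The plans you get will depend on your data, statistics, and settings, so they may not match the ones shown.

```sql
-- Hypothetical setup: 1,000,000 rows, roughly 10% of them NULL in num.
CREATE TABLE test (
    id  bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    num integer
);

INSERT INTO test (num)
SELECT CASE WHEN g % 10 = 0 THEN NULL ELSE g END
FROM generate_series(1, 1000000) AS g;

-- Refresh planner statistics so the row estimates are meaningful.
ANALYZE test;

-- Full index: NULLs are indexed as well, so IS NULL can use it.
CREATE INDEX idx_test_num ON test (num);
EXPLAIN SELECT count(*) FROM test WHERE num IS NULL;

-- Swap in a partial index that excludes NULLs.
-- The old index must be dropped first because the name is reused.
DROP INDEX idx_test_num;
CREATE INDEX idx_test_num ON test (num) WHERE num IS NOT NULL;
EXPLAIN SELECT count(*) FROM test WHERE num IS NULL;
```

The second EXPLAIN cannot use the partial index, because its predicate rules out exactly the rows the query asks for.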

Since regular postgres indexes do not store NULL values...

This has not been true since version 8.2, [checks notes] 16 years ago. The 8.2 docs say...

Indexes are not used for IS NULL clauses by default. The best way to use indexes in such cases is to create a partial index using an IS NULL predicate.

8.3 introduced NULLS FIRST and NULLS LAST, along with many other improvements around nulls, including...

Allow col IS NULL to use an index (Teodor)

