sql - PostgreSQL trigram index behavior for search strings shorter than 3 characters

In my PostgreSQL database I have a slides table with a name column. I want to implement search on it, so I tried a trigram index in PostgreSQL. I created the following index:

CREATE INDEX index_slides_on_name_trigram ON slides USING gin (name gin_trgm_ops);

When I search for at least 3 characters, the index is used as expected:

explain analyze SELECT name FROM slides WHERE name ILIKE '%hur%';

QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on slides  (cost=18.97..1809.80 rows=900 width=25) (actual time=0.810..6.316 rows=906 loops=1)
   Recheck Cond: ((name)::text ~~* '%hur%'::text)
   Heap Blocks: exact=583
   ->  Bitmap Index Scan on index_slides_on_name_trigram  (cost=0.00..18.75 rows=900 width=0) (actual time=0.552..0.552 rows=906 loops=1)
         Index Cond: ((name)::text ~~* '%hur%'::text)
 Planning time: 0.973 ms
 Execution time: 6.506 ms
(7 rows)

But when my search phrase is shorter than 3 characters, the index is not used:

explain analyze SELECT name FROM slides WHERE name ILIKE '%hu%';

QUERY PLAN                                                
---------------------------------------------------------------------------------------------------------
 Seq Scan on slides  (cost=0.00..2803.86 rows=932 width=25) (actual time=0.053..31.075 rows=910 loops=1)
   Filter: ((name)::text ~~* '%hu%'::text)
   Rows Removed by Filter: 25399
 Planning time: 0.954 ms
 Execution time: 31.220 ms
(5 rows)

Is this how trigram indexes are supposed to work? I would also like to know whether there is a better way to implement search.

Best answer

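PostgreSQL estimates that, when the search string is this short, a sequential scan is cheaper than using the trigram index.

This is because a short search string is likely to match many rows, including false positives that must be rechecked against the table, and when a large part of the table has to be visited anyway, a sequential scan is usually faster. With fewer than three characters between the wildcards, pg_trgm cannot extract any complete trigram from the pattern, so the index has nothing to narrow the search with.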
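To see how pg_trgm splits a string into trigrams, the extension's show_trgm function can be used. This is only a sketch, assuming the pg_trgm extension is installed in the current database. The literals are the search terms from the question. Note that show_trgm pads its input as a whole word, whereas in a pattern such as '%hu%' the wildcards stand for unknown characters, so no padding applies there.

```sql
-- Assumes the pg_trgm extension is available (gin_trgm_ops needs it anyway).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Trigrams of a three-letter word; the set includes the full trigram "hur".
SELECT show_trgm('hur');

-- Trigrams of a two-letter word; every trigram here relies on blank padding.
SELECT show_trgm('hu');
```

In the second result, no trigram consists only of characters from the search term, which is why a pattern like '%hu%' gives the index nothing to look up.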

You can test this yourself by first running

SET enable_seqscan=off;

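PostgreSQL will then avoid sequential scans whenever possible.

If you are not sure whether PostgreSQL is making the right choice, run the query with sequential scans enabled and then disabled, and measure how long it takes in each case.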
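One possible way to run that comparison in a single session is sketched below. It reuses the slides table and the '%hu%' query from the question. The timings will depend on your data, so no expected output is shown.

```sql
-- Plan and timing with the planner's default settings.
EXPLAIN ANALYZE SELECT name FROM slides WHERE name ILIKE '%hu%';

-- Discourage sequential scans for this session only, then repeat the query.
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT name FROM slides WHERE name ILIKE '%hu%';

-- Restore the default so later queries are planned normally.
RESET enable_seqscan;
```

Compare the "Execution time" lines of the two plans. If the forced index plan is not faster, the planner's choice of a sequential scan was reasonable.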

Regarding sql - PostgreSQL trigram index behavior for search strings shorter than 3 characters, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44774315/
