sqlite - Query optimization for inner joins in SQLite

Tags: sqlite

I have a database with 3 tables: masterInfo, primDescT, and secDescT.

CREATE TABLE masterInfo (id INTEGER PRIMARY KEY AUTOINCREMENT,
primDescId INTEGER,
secDescId INTEGER,
category INTEGER,
UNIQUE(primDescId, secDescId, category));

CREATE TABLE primDescT (id INTEGER PRIMARY KEY,
primDesc nvarchar(512));

CREATE TABLE secDescT (id INTEGER PRIMARY KEY,
secDesc nvarchar(512));

INSERT INTO primDescT VALUES(1,'XXXX');
INSERT INTO primDescT VALUES(2,'YYYY');
INSERT INTO primDescT VALUES(3,'ZZZZ');
INSERT INTO primDescT VALUES(4,'SSSS');

INSERT INTO secDescT VALUES(1,'AAA');
INSERT INTO secDescT VALUES(2,'BBB');
INSERT INTO secDescT VALUES(3,'CCC');

INSERT INTO masterInfo VALUES(1,1,1,1);
INSERT INTO masterInfo VALUES(2,2,2,2);
INSERT INTO masterInfo VALUES(3,3,1,1);
INSERT INTO masterInfo VALUES(4,4,3,2);

The masterInfo table has 1765137 rows, primDescT has 312210 rows, and secDescT has 105458 rows.

I use the following query to fetch the results.

SELECT m.id AS pId, 
primDesc AS pDescr, secDesc AS sDescr, category   AS category 
FROM masterInfo m
INNER JOIN primDescT ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE m.category IN ('1','2') ORDER BY pDescr ASC  LIMIT 100 OFFSET 0

The above query takes 8 seconds to respond.

But if I set the offset to 1756300, it takes 53 seconds.

SELECT m.id AS pId, 
primDesc AS pDescr, secDesc AS sDescr, category   AS category 
FROM masterInfo m
INNER JOIN primDescT ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE m.category IN ('1','2') ORDER BY pDescr ASC  LIMIT 100 OFFSET 1756300

How can I optimize the above query so that it returns within 3 seconds?

Best answer

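The problem with these queries is the ORDER BY: all results must be computed and sorted before the database can determine which rows are the 100 (or 1756400) smallest ones. EXPLAIN QUERY PLAN output: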

0,0,0,SCAN TABLE masterInfo AS m
0,1,1,SEARCH TABLE primDescT USING INTEGER PRIMARY KEY (rowid=?)
0,2,2,SEARCH TABLE secDescT USING INTEGER PRIMARY KEY (rowid=?)
0,0,0,USE TEMP B-TREE FOR ORDER BY
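
As a rough way to reproduce this check, the sketch below uses Python's standard sqlite3 module on an empty in-memory copy of the question's schema. The names QUERY, plan, and needs_sort are illustrative and not part of the original post, and the exact wording of the plan lines varies between SQLite versions.

```python
import sqlite3

# In-memory database with the schema from the question (no rows are needed
# to inspect the query plan).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE masterInfo (id INTEGER PRIMARY KEY AUTOINCREMENT,
    primDescId INTEGER, secDescId INTEGER, category INTEGER,
    UNIQUE(primDescId, secDescId, category));
CREATE TABLE primDescT (id INTEGER PRIMARY KEY, primDesc nvarchar(512));
CREATE TABLE secDescT (id INTEGER PRIMARY KEY, secDesc nvarchar(512));
""")

QUERY = """
SELECT m.id AS pId, primDesc AS pDescr, secDesc AS sDescr, category
FROM masterInfo m
INNER JOIN primDescT ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE m.category IN (1, 2)
ORDER BY pDescr ASC LIMIT 100 OFFSET 0
"""

# The human-readable description is the last column of each plan row.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

# Without an index on primDesc, SQLite has to sort in a temporary B-tree.
needs_sort = any("TEMP B-TREE FOR ORDER BY" in step for step in plan)

for step in plan:
    print(step)
print("explicit sort step:", needs_sort)
```

If needs_sort is true, every matching row is materialized and sorted before the LIMIT can be applied.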

To remove the explicit sorting step, you must index that column:

CREATE INDEX pd ON primDescT(primDesc);

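And you must force the database to use this index (by default, SQLite ignores the LIMIT when estimating the query cost, and if you wanted all results, not using the pd index would be faster):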

SELECT ...
FROM masterInfo m
INNER JOIN primDescT INDEXED BY pd ON primDescT.id = m.primDescId
--                   ^^^^^^^^^^^^^
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE ...
ORDER BY pDescr ASC
LIMIT 100 OFFSET ...;

0,0,1,SCAN TABLE primDescT USING COVERING INDEX pd
0,1,0,SEARCH TABLE masterInfo AS m USING COVERING INDEX sqlite_autoindex_masterInfo_1 (primDescId=?)
0,2,2,SEARCH TABLE secDescT USING INTEGER PRIMARY KEY (rowid=?)
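
A minimal sketch of the same idea, run with Python's sqlite3 module against the sample rows from the question. The variable names (QUERY, plan, uses_pd, descrs) are assumptions for illustration. The join order SQLite actually picks can differ by version and table statistics, so the code only checks that the pd index appears in the plan and that the rows come back in sorted order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE masterInfo (id INTEGER PRIMARY KEY AUTOINCREMENT,
    primDescId INTEGER, secDescId INTEGER, category INTEGER,
    UNIQUE(primDescId, secDescId, category));
CREATE TABLE primDescT (id INTEGER PRIMARY KEY, primDesc nvarchar(512));
CREATE TABLE secDescT (id INTEGER PRIMARY KEY, secDesc nvarchar(512));

-- Sample rows from the question.
INSERT INTO primDescT VALUES (1,'XXXX'),(2,'YYYY'),(3,'ZZZZ'),(4,'SSSS');
INSERT INTO secDescT VALUES (1,'AAA'),(2,'BBB'),(3,'CCC');
INSERT INTO masterInfo VALUES (1,1,1,1),(2,2,2,2),(3,3,1,1),(4,4,3,2);

-- Index on the ORDER BY column, as suggested in the answer.
CREATE INDEX pd ON primDescT(primDesc);
""")

QUERY = """
SELECT m.id AS pId, primDesc AS pDescr, secDesc AS sDescr, category
FROM masterInfo m
INNER JOIN primDescT INDEXED BY pd ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE m.category IN (1, 2)
ORDER BY pDescr ASC LIMIT 100 OFFSET 0
"""

# INDEXED BY makes statement preparation fail if pd cannot be used,
# so a successful plan always mentions the index.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
uses_pd = any("INDEX pd" in step for step in plan)

rows = conn.execute(QUERY).fetchall()
descrs = [r[1] for r in rows]

for step in plan:
    print(step)
print(descrs)
```

On the real tables it is worth comparing the plan with and without INDEXED BY, since the forced index only helps when a small LIMIT lets the scan stop early.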

A large OFFSET value is always slow; the database must compute and throw away all these rows.

If you are using paging, you can replace the OFFSET with a lookup on the sorting column; this requires that you save the last value of the previous page:

SELECT ...
FROM masterInfo m
INNER JOIN primDescT INDEXED BY pd ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE primDesc > :LastValue
--    ^^^^^^^^^^^^^^^^^^^^^
  AND ...
ORDER BY pDescr ASC
LIMIT 100 /* no offset */;
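
The paging loop could look roughly like the sketch below (Python's sqlite3 module, small generated data set). One detail goes beyond the answer: several masterInfo rows can share the same primDesc, so this sketch adds m.id as a tiebreaker and remembers both values from the last row of each page. The tiebreaker, the fetch_pages helper, and the parameter names are assumptions made for illustration, not part of the original answer. The empty string is used as the starting value, which assumes primDesc is never NULL.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE masterInfo (id INTEGER PRIMARY KEY AUTOINCREMENT,
    primDescId INTEGER, secDescId INTEGER, category INTEGER,
    UNIQUE(primDescId, secDescId, category));
CREATE TABLE primDescT (id INTEGER PRIMARY KEY, primDesc nvarchar(512));
CREATE TABLE secDescT (id INTEGER PRIMARY KEY, secDesc nvarchar(512));
CREATE INDEX pd ON primDescT(primDesc);
INSERT INTO primDescT VALUES (1,'XXXX'),(2,'YYYY'),(3,'ZZZZ'),(4,'SSSS');
INSERT INTO secDescT VALUES (1,'AAA'),(2,'BBB'),(3,'CCC');
""")
# 12 master rows: every primDesc is shared by three rows (ties in the sort key).
conn.executemany(
    "INSERT INTO masterInfo (primDescId, secDescId, category) VALUES (?, ?, 1)",
    product(range(1, 5), range(1, 4)),
)

BASE = """
SELECT m.id AS pId, primDesc AS pDescr, secDesc AS sDescr, category
FROM masterInfo m
INNER JOIN primDescT INDEXED BY pd ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
"""
# Keyset condition: strictly after the last (primDesc, id) pair already seen.
PAGE_SQL = BASE + """
WHERE (primDesc > :last_desc
       OR (primDesc = :last_desc AND m.id > :last_id))
  AND m.category IN (1, 2)
ORDER BY pDescr ASC, m.id ASC
LIMIT :page_size
"""
FULL_SQL = BASE + "WHERE m.category IN (1, 2) ORDER BY pDescr ASC, m.id ASC"


def fetch_pages(conn, page_size):
    """Return all pages, resuming each one after the previous page's last row."""
    last_desc, last_id = "", 0  # sorts before every non-NULL primDesc / real id
    pages = []
    while True:
        page = conn.execute(
            PAGE_SQL,
            {"last_desc": last_desc, "last_id": last_id, "page_size": page_size},
        ).fetchall()
        if not page:
            return pages
        pages.append(page)
        last_id, last_desc = page[-1][0], page[-1][1]


pages = fetch_pages(conn, page_size=5)
flat = [row for page in pages for row in page]
print([len(page) for page in pages])
```

Each page is found by a lookup on the sort key instead of by counting skipped rows, so later pages do not have to recompute earlier ones. The added tiebreaker may require SQLite to sort rows that share the same primDesc, which is normally a small amount of work.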

Regarding sqlite - query optimization for inner joins in SQLite, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41973098/
