I have three tables, bug, bugrule and bugtrace, with the following relationships:
bug 1--------N bugrule
id = bugid
bugrule 0---------N bugtrace
id = ruleid
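Because I am almost always interested in the relationship bug <---> bugtrace, I created a suitable VIEW that is used as part of several queries. Interestingly, queries that use this VIEW perform significantly worse than equivalent queries that use the underlying JOIN explicitly.
VIEW definition: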
CREATE VIEW bugtracev AS
SELECT t.*, r.bugid
FROM bugtrace AS t
LEFT JOIN bugrule AS r ON t.ruleid=r.id
WHERE r.version IS NULL
Execution plan of the query using the VIEW (performs badly):
mysql> explain
SELECT c.id,state,
(SELECT COUNT(DISTINCT(t.id)) FROM bugtracev AS t
WHERE t.bugid=c.id)
FROM bug AS c
WHERE c.version IS NULL
AND c.id<10;
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
| 1 | PRIMARY | c | range | id_2,id | id_2 | 8 | NULL | 3 | Using index condition |
| 2 | DEPENDENT SUBQUERY | t | index | NULL | ruleid | 9 | NULL | 1426004 | Using index |
| 2 | DEPENDENT SUBQUERY | r | ref | id_2,id | id_2 | 8 | bugapp.t.ruleid | 1 | Using where |
+----+--------------------+-------+-------+---------------+--------+---------+-----------------+---------+-----------------------+
3 rows in set (0.00 sec)
Execution plan of the query using the underlying JOIN directly (performs well):
mysql> explain
SELECT c.id,state,
(SELECT COUNT(DISTINCT(t.id))
FROM bugtrace AS t
LEFT JOIN bugrule AS r ON t.ruleid=r.id
WHERE r.version IS NULL
AND r.bugid=c.id)
FROM bug AS c
WHERE c.version IS NULL
AND c.id<10;
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
| 1 | PRIMARY | c | range | id_2,id | id_2 | 8 | NULL | 3 | Using index condition |
| 2 | DEPENDENT SUBQUERY | r | ref | id_2,id,bugid | bugid | 8 | bugapp.c.id | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | t | ref | ruleid | ruleid | 9 | bugapp.r.id | 713002 | Using index |
+----+--------------------+-------+-------+---------------+--------+---------+-------------+--------+-----------------------+
3 rows in set (0.00 sec)
The CREATE TABLE statements (with irrelevant columns removed) are:
mysql> show create table bug;
CREATE TABLE `bug` (
`id` bigint(20) NOT NULL,
`version` int(11) DEFAULT NULL,
`state` varchar(16) DEFAULT NULL,
UNIQUE KEY `id_2` (`id`,`version`),
KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
mysql> show create table bugrule;
CREATE TABLE `bugrule` (
`id` bigint(20) NOT NULL,
`version` int(11) DEFAULT NULL,
`bugid` bigint(20) NOT NULL,
UNIQUE KEY `id_2` (`id`,`version`),
KEY `id` (`id`),
KEY `bugid` (`bugid`),
CONSTRAINT `bugrule_ibfk_1` FOREIGN KEY (`bugid`) REFERENCES `bug` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
mysql> show create table bugtrace;
CREATE TABLE `bugtrace` (
`id` bigint(20) NOT NULL,
`ruleid` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `ruleid` (`ruleid`),
CONSTRAINT `bugtrace_ibfk_1` FOREIGN KEY (`ruleid`) REFERENCES `bugrule` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Best answer
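You are asking why the optimizer treats two complex queries differently, both of which involve COUNT(DISTINCT val) and a dependent subquery. It is hard to say for sure why.
You will probably fix most of the performance problem by getting rid of the dependent subquery, though. Try something like this: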
SELECT c.id,state, cnt.cnt
FROM bug AS c
LEFT JOIN (
SELECT bugid, COUNT(DISTINCT id) cnt
FROM bugtracev
GROUP BY bugid
) cnt ON c.id = cnt.bugid
WHERE c.version IS NULL
AND c.id<10;
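Why does this help? To satisfy the query, the optimizer may choose to run the GROUP BY subquery just once rather than many times. You can also run EXPLAIN on the GROUP BY subquery alone to understand its performance.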
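As a rough sketch of that last point, the aggregate subquery can be explained by itself. The second statement is an addition that is not part of the suggestion above: with a LEFT JOIN, a bug without any matching trace rows gets NULL rather than the 0 that the original scalar subquery returns, so COALESCE is used here on the assumption that 0 is what you want.

```sql
-- Inspect the plan of the aggregate subquery in isolation.
EXPLAIN
SELECT bugid, COUNT(DISTINCT id) AS cnt
  FROM bugtracev
 GROUP BY bugid;

-- Variant of the rewritten query that reports 0 instead of NULL
-- for bugs with no matching rows in bugtracev (assumed requirement).
SELECT c.id, c.state, COALESCE(cnt.cnt, 0) AS cnt
  FROM bug AS c
  LEFT JOIN (
        SELECT bugid, COUNT(DISTINCT id) AS cnt
          FROM bugtracev
         GROUP BY bugid
       ) AS cnt ON c.id = cnt.bugid
 WHERE c.version IS NULL
   AND c.id < 10;
```

Note that the derived table aggregates over every bugid before the outer filter c.id < 10 is applied, so it is worth comparing its plan against the dependent-subquery version on your real data.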
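You may also get a performance boost by creating a compound index on bugrule that matches the query in your view. Try this one:
CREATE INDEX bugrule_v ON bugrule (version, id, bugid)
Then try switching the last two columns, like this:
CREATE INDEX bugrule_v2 ON bugrule (version, bugid, id)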
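These are called covering indexes because they contain all the columns needed to satisfy your query. version comes first because that helps optimize the WHERE version IS NULL clause in your view definition, which makes the lookup faster.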
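One way to check whether a new index is actually picked up, sketched under the assumption that the view's SELECT is a fair stand-in for your real queries: list the indexes on bugrule, then look at the `key` and `Extra` columns that EXPLAIN reports for table r. "Using index" in `Extra` indicates that the row was served from the index alone.

```sql
-- List the indexes currently defined on bugrule.
SHOW INDEX FROM bugrule;

-- Explain the view's underlying SELECT (with explicit columns) and
-- check which index is reported in the `key` column for table r.
EXPLAIN
SELECT t.id, t.ruleid, r.bugid
  FROM bugtrace AS t
  LEFT JOIN bugrule AS r ON t.ruleid = r.id
 WHERE r.version IS NULL;

-- An index the optimizer never chooses only adds write cost.
-- Uncomment to remove it once you have confirmed it is unused:
-- DROP INDEX bugrule_v ON bugrule;
```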
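Pro tip: avoid SELECT * in views and queries, especially when you have performance problems. List the columns you actually need instead. The * may force the query optimizer to skip a covering index even when that index would help.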
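Applied to the view from the question, that could look like the sketch below. The column list is an assumption based on the reduced CREATE TABLE statements shown in the question; the real bugtrace table has more columns, so add only the ones your queries actually read.

```sql
-- Same view as before, but with an explicit column list instead of t.*.
CREATE OR REPLACE VIEW bugtracev AS
SELECT t.id, t.ruleid, r.bugid
  FROM bugtrace AS t
  LEFT JOIN bugrule AS r ON t.ruleid = r.id
 WHERE r.version IS NULL;
```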
Regarding "MySQL: why is a query that uses a VIEW less efficient than a query that directly uses the VIEW's underlying JOIN?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59117037/