I have the following query:
SELECT table_1.id
FROM
table_1
LEFT JOIN table_2 ON (table_1.id = table_2.id)
WHERE
table_1.col_condition_1 = 0
AND table_1.col_condition_2 NOT IN (3, 4)
AND (table_2.id is NULL OR table_1.date_col > table_2.date_col)
LIMIT 5000;
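I have the following keys and indexes:
- table_1.id is the primary key.
- An index on table_1.col_condition_1
- An index on table_1.col_condition_2
- A composite index on table_1.col_condition_1 and table_1.col_condition_2
The correct index is being picked. EXPLAIN output for the query: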
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------------+---------+------------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------------+---------+------------+----------+-----------------------+
| 1 | SIMPLE | table_1 | range | "the composite index", col_condition_1 index ,col_condition_2 index | "the composite index" | 7 | | 11819433 | Using index condition |
| 1 | SIMPLE | table_2 | eq_ref | PRIMARY,id_UNIQUE | PRIMARY | 8 | table_1.id | 1 | Using where |
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------------+---------+------------+----------+-----------------------+
table_1 has about 60 million records and table_2 has about 4 million records.
The query takes 60 seconds to return a result.
Interestingly:
SELECT table_1.id
FROM
table_1
LEFT JOIN table_2 ON (table_1.id = table_2.id)
WHERE
table_1.col_condition_1 = 0
AND table_1.col_condition_2 NOT IN (3, 4)
LIMIT 5000;
takes 145 ms to return a result and picks the same index as the first query.
SELECT table_1.id
FROM
table_1
LEFT JOIN table_2 ON (table_1.id = table_2.id)
WHERE
table_1.col_condition_1 = 0
AND (table_2.id is NULL OR table_1.date_col > table_2.date_col)
LIMIT 5000;
takes 174 ms to return a result.
EXPLAIN output:
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------+---------+------------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------+---------+------------+----------+-------------+
| 1 | SIMPLE | table_1 | ref | "the composite index", col_condition_1 index ,col_condition_2 index | col_condition_1 | 2 | const | 30381842 | NULL |
| 1 | SIMPLE | table_2 | eq_ref | PRIMARY,id_UNIQUE | PRIMARY | 8 | table_1.id | 1 | Using where |
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------+---------+------------+----------+-------------+
And
SELECT table_1.id
FROM
table_1
LEFT JOIN table_2 ON (table_1.id = table_2.id)
WHERE
table_1.col_condition_2 NOT IN (3, 4)
AND (table_2.id is NULL OR table_1.date_col > table_2.date_col)
LIMIT 5000;
takes about 1 second to return a result.
EXPLAIN output:
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------+---------+------------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------+---------+------------+----------+-----------------------+
| 1 | SIMPLE | table_1 | range | "the composite index", col_condition_1 index ,col_condition_2 index | col_condition_2 | 5 | | 36254294 | Using index condition |
| 1 | SIMPLE | table_2 | eq_ref | PRIMARY,id_UNIQUE | PRIMARY | 8 | table_1.id | 1 | Using where |
+----+-------------+---------+--------+---------------------------------------------------------------------+-----------------+---------+------------+----------+-----------------------+
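Also, when I use each WHERE condition on its own, the query returns a result in about 100 ms.
My question is why the query takes so long (60 seconds) to return a result when all three WHERE conditions are used together, even though the correct index appears to be used and running the query with any two of the three conditions returns a result in far less time.
Also, is there a way to optimize this query?
Thank you.
EDIT:
CREATE TABLE statements:
table_1: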
CREATE TABLE `table_1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`col_condition_1` tinyint(1) DEFAULT '0',
`col_condition_2` int(11) DEFAULT NULL,
`date_col` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `compositeidx` (`col_condition_1`,`col_condition_2`),
KEY `col_condition_1_idx` (`col_condition_1`),
KEY `col_condition_2_idx` (`col_condition_2`)
) ENGINE=InnoDB AUTO_INCREMENT=68272192 DEFAULT CHARSET=utf8
table_2:
CREATE TABLE `table_2` (
`id` bigint(20) NOT NULL,
`date_col` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id_UNIQUE` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
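Best answer
Try splitting your existing SQL into two parts and see how long each part takes to execute. That should give you an idea of which part is responsible for the slowness:
Part 1: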
SELECT table_1.id
FROM table_1
LEFT JOIN table_2
ON (table_1.id = table_2.id)
WHERE table_1.col_condition_1 = 0
AND table_1.col_condition_2 NOT IN (3, 4)
AND table_2.id is NULL
and part 2 (note the inner join here):
SELECT table_1.id
FROM table_1
JOIN table_2
ON (table_1.id = table_2.id)
WHERE table_1.col_condition_1 = 0
AND table_1.col_condition_2 NOT IN (3, 4)
AND table_1.date_col > table_2.date_col
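A table_1 row either has no match in table_2 (part 1) or exactly one match (part 2, since table_2.id is the primary key), so the two parts cover disjoint cases. Under that assumption, the original OR condition can be sketched as a UNION ALL of the two parts. This only illustrates the rewrite; it has not been measured against the tables in the question, and whether it is faster depends on the data and on the plan MySQL picks.

```sql
-- Sketch: the original query with the OR split into its two disjoint cases.
-- Assumes table_2.id is unique (it is the primary key), so no row
-- can appear in both branches and UNION ALL adds no duplicates.
(
    -- Case 1: no matching row in table_2
    SELECT table_1.id
    FROM table_1
    LEFT JOIN table_2 ON (table_1.id = table_2.id)
    WHERE table_1.col_condition_1 = 0
      AND table_1.col_condition_2 NOT IN (3, 4)
      AND table_2.id IS NULL
)
UNION ALL
(
    -- Case 2: a matching row exists and the table_1 row is newer
    SELECT table_1.id
    FROM table_1
    JOIN table_2 ON (table_1.id = table_2.id)
    WHERE table_1.col_condition_1 = 0
      AND table_1.col_condition_2 NOT IN (3, 4)
      AND table_1.date_col > table_2.date_col
)
LIMIT 5000;
```

Running EXPLAIN on this statement reports a separate plan for each branch, which may make it easier to see which half is the expensive one.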
I expect part 2 to take longer. For that part, I think indexing date_col on both table_1 and table_2 would help.
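A minimal sketch of that suggestion follows. The index name date_col_idx is hypothetical; only the table names and the date_col column come from the question. Building an index on a table with tens of millions of rows can take a while, so this is probably worth trying on a copy of the data first.

```sql
-- Hypothetical index names; the effect on the slow query is untested.
ALTER TABLE table_1 ADD INDEX date_col_idx (date_col);
ALTER TABLE table_2 ADD INDEX date_col_idx (date_col);

-- Re-check the plan afterwards to see whether the optimizer uses them.
EXPLAIN
SELECT table_1.id
FROM table_1
JOIN table_2 ON (table_1.id = table_2.id)
WHERE table_1.col_condition_1 = 0
  AND table_1.col_condition_2 NOT IN (3, 4)
  AND table_1.date_col > table_2.date_col;
```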
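I don't think the composite index is helping your SELECT at all.
That said, it is hard to diagnose why the three conditions together hurt performance so badly. It looks like it has to do with your data distribution. I'm not sure about MySQL, but in Oracle, gathering statistics on those tables would make a difference.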
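For reference, the closest MySQL counterpart to gathering statistics is the ANALYZE TABLE statement, which refreshes the key distribution estimates the optimizer relies on. Whether stale statistics are actually the cause here is an assumption, not something the question confirms.

```sql
-- Refresh the index statistics for both tables.
ANALYZE TABLE table_1, table_2;

-- Then compare the estimated row counts in the plan with the earlier output.
EXPLAIN
SELECT table_1.id
FROM table_1
LEFT JOIN table_2 ON (table_1.id = table_2.id)
WHERE table_1.col_condition_1 = 0
  AND table_1.col_condition_2 NOT IN (3, 4)
  AND (table_2.id IS NULL OR table_1.date_col > table_2.date_col)
LIMIT 5000;
```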
Hope that helps.
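Regarding "mysql - select query with three where conditions is slow, but the same query with any two of the three where conditions is fast", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54001498/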