I have a Providers table that looks like this:
| id | lastName | firstName | middleName |
| --- | -------- | --------- | ---------- |
with the following indexes:
- Providers_lastName
- Providers_firstName
- Providers_lastName_firstName
- Providers_lastName_firstName_middleName
All my queries use a trailing wildcard in the lastName and firstName values:
SELECT * FROM Providers
WHERE lastName LIKE 'smi%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50
SELECT * FROM Providers
WHERE firstName LIKE 'mar%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50
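There are about 7 million rows in this table. My queries by last name are very fast. However, the queries by first name are very slow. Am I doing something wrong here? What other indexes can I add to improve the performance of the first-name-only queries without changing or removing the ORDER BY?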
Edit 1:
EXPLAIN output for the lastName query:
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "69901.30"
},
"ordering_operation": {
"using_filesort": false,
"table": {
"table_name": "Providers",
"access_type": "range",
"possible_keys": [
"Providers_lastName",
"Providers_lastName_firstName",
"Providers_lastName_firstName_middleName"
],
"key": "Providers_lastName_firstName_middleName",
"used_key_parts": [
"lastName"
],
"key_length": "143",
"rows_examined_per_scan": 59008,
"rows_produced_per_join": 59008,
"filtered": "100.00",
"index_condition": "(`db_name`.`providers`.`lastName` like 'smi%')",
"cost_info": {
"read_cost": "64000.51",
"eval_cost": "5900.80",
"prefix_cost": "69901.31",
"data_read_per_join": "158M"
},
"used_columns": [
"id",
"firstName",
"middleName",
"lastName",
// OTHER COLUMNS
]
}
}
}
}
EXPLAIN output for the firstName query:
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "390813.95"
},
"ordering_operation": {
"using_filesort": false,
"table": {
"table_name": "Providers",
"access_type": "index",
"possible_keys": [
"Providers_firstName"
],
"key": "Providers_lastName_firstName_middleName",
"used_key_parts": [
"lastName",
"firstName",
"middleName"
],
"key_length": "309",
"rows_examined_per_scan": 948,
"rows_produced_per_join": 329914,
"filtered": "5.27",
"cost_info": {
"read_cost": "357822.55",
"eval_cost": "32991.40",
"prefix_cost": "390813.95",
"data_read_per_join": "883M"
},
"used_columns": [
"id",
"firstName",
"middleName",
"lastName",
// OTHER COLUMNS
],
"attached_condition": "(`db_name`.`providers`.`firstName` like 'mar%')"
}
}
}
}
SHOW CREATE TABLE:
CREATE TABLE `Providers` (
`id` varchar(10) NOT NULL,
`firstName` varchar(20) DEFAULT NULL,
`middleName` varchar(20) DEFAULT NULL,
`lastName` varchar(35) DEFAULT NULL,
/* Other columns */
PRIMARY KEY (`id`),
KEY `Providers_firstName` (`firstName`),
KEY `Providers_lastName` (`lastName`),
KEY `Providers_lastName_firstName` (`lastName`,`firstName`),
KEY `Providers_lastName_firstName_middleName` (`lastName`,`firstName`,`middleName`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Edit 2:
Output of SHOW SESSION STATUS LIKE 'Handler%' after running FLUSH STATUS:
Query 1 (first name):
{
"data":
[
{
"Variable_name": "Handler_commit",
"Value": "1"
},
{
"Variable_name": "Handler_delete",
"Value": "0"
},
{
"Variable_name": "Handler_discover",
"Value": "0"
},
{
"Variable_name": "Handler_external_lock",
"Value": "2"
},
{
"Variable_name": "Handler_mrr_init",
"Value": "0"
},
{
"Variable_name": "Handler_prepare",
"Value": "0"
},
{
"Variable_name": "Handler_read_first",
"Value": "1"
},
{
"Variable_name": "Handler_read_key",
"Value": "1"
},
{
"Variable_name": "Handler_read_last",
"Value": "0"
},
{
"Variable_name": "Handler_read_next",
"Value": "1487176"
},
{
"Variable_name": "Handler_read_prev",
"Value": "0"
},
{
"Variable_name": "Handler_read_rnd",
"Value": "0"
},
{
"Variable_name": "Handler_read_rnd_next",
"Value": "0"
},
{
"Variable_name": "Handler_rollback",
"Value": "0"
},
{
"Variable_name": "Handler_savepoint",
"Value": "0"
},
{
"Variable_name": "Handler_savepoint_rollback",
"Value": "0"
},
{
"Variable_name": "Handler_update",
"Value": "0"
},
{
"Variable_name": "Handler_write",
"Value": "0"
}
]
}
Query 2 (last name):
{
"data":
[
{
"Variable_name": "Handler_commit",
"Value": "1"
},
{
"Variable_name": "Handler_delete",
"Value": "0"
},
{
"Variable_name": "Handler_discover",
"Value": "0"
},
{
"Variable_name": "Handler_external_lock",
"Value": "2"
},
{
"Variable_name": "Handler_mrr_init",
"Value": "0"
},
{
"Variable_name": "Handler_prepare",
"Value": "0"
},
{
"Variable_name": "Handler_read_first",
"Value": "0"
},
{
"Variable_name": "Handler_read_key",
"Value": "1"
},
{
"Variable_name": "Handler_read_last",
"Value": "0"
},
{
"Variable_name": "Handler_read_next",
"Value": "49"
},
{
"Variable_name": "Handler_read_prev",
"Value": "0"
},
{
"Variable_name": "Handler_read_rnd",
"Value": "0"
},
{
"Variable_name": "Handler_read_rnd_next",
"Value": "0"
},
{
"Variable_name": "Handler_rollback",
"Value": "0"
},
{
"Variable_name": "Handler_savepoint",
"Value": "0"
},
{
"Variable_name": "Handler_savepoint_rollback",
"Value": "0"
},
{
"Variable_name": "Handler_update",
"Value": "0"
},
{
"Variable_name": "Handler_write",
"Value": "0"
}
]
}
Edit 3:
Using FORCE INDEX (Providers_firstName):
EXPLAIN output for the firstName query:
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "389514.60"
},
"ordering_operation": {
"using_filesort": true,
"table": {
"table_name": "Providers",
"access_type": "range",
"possible_keys": [
"Providers_firstName"
],
"key": "Providers_firstName",
"used_key_parts": [
"firstName"
],
"key_length": "83",
"rows_examined_per_scan": 329914,
"rows_produced_per_join": 329914,
"filtered": "100.00",
"index_condition": "(`db_name`.`providers`.`firstName` like 'mar%')",
"cost_info": {
"read_cost": "356523.20",
"eval_cost": "32991.40",
"prefix_cost": "389514.60",
"data_read_per_join": "883M"
},
"used_columns": [
"id",
"firstName",
"middleName",
"lastName",
// Other columns
]
}
}
}
}
Handler counts:
{
"data":
[
{
"Variable_name": "Handler_commit",
"Value": "1"
},
{
"Variable_name": "Handler_delete",
"Value": "0"
},
{
"Variable_name": "Handler_discover",
"Value": "0"
},
{
"Variable_name": "Handler_external_lock",
"Value": "2"
},
{
"Variable_name": "Handler_mrr_init",
"Value": "0"
},
{
"Variable_name": "Handler_prepare",
"Value": "0"
},
{
"Variable_name": "Handler_read_first",
"Value": "0"
},
{
"Variable_name": "Handler_read_key",
"Value": "51"
},
{
"Variable_name": "Handler_read_last",
"Value": "0"
},
{
"Variable_name": "Handler_read_next",
"Value": "168497"
},
{
"Variable_name": "Handler_read_prev",
"Value": "0"
},
{
"Variable_name": "Handler_read_rnd",
"Value": "50"
},
{
"Variable_name": "Handler_read_rnd_next",
"Value": "0"
},
{
"Variable_name": "Handler_rollback",
"Value": "0"
},
{
"Variable_name": "Handler_savepoint",
"Value": "0"
},
{
"Variable_name": "Handler_savepoint_rollback",
"Value": "0"
},
{
"Variable_name": "Handler_update",
"Value": "0"
},
{
"Variable_name": "Handler_write",
"Value": "0"
}
]
}
Best answer
Query 1
WHERE lastName LIKE 'smi%'
ORDER BY lastName ASC, firstName ASC, middleName
probably uses this index (please provide EXPLAIN ...):
Providers_lastName_firstName_middleName
It works relatively efficiently because it can scan just the smi... part of the index.
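I am assuming that SELECT * fetches only those 4 columns, and that id is the PRIMARY KEY? And that Providers_lastName_firstName_middleName is INDEX(lastName, firstName, middleName), with id implicitly appended at the end because the table is InnoDB?
If so, the whole query can be performed in the index. EXPLAIN will confirm this by saying "Using index", which means "covering index".
Furthermore, this query touches only 50 rows: because the index is well suited to both the WHERE and the ORDER BY, the LIMIT 50 can be folded in as well.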
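One way to check the covering-index assumption is to run EXPLAIN on a variant of the query that names only the four columns stored in the index. This is a sketch, not the asker's actual query: the question's SELECT * also pulls other columns, which would force lookups into the table rows.

```sql
-- Assumption: only these four columns are needed. All of them live in
-- Providers_lastName_firstName_middleName (InnoDB appends the primary key, id).
EXPLAIN
SELECT id, lastName, firstName, middleName
FROM Providers
WHERE lastName LIKE 'smi%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50;
```

If the Extra column of the traditional EXPLAIN output includes "Using index", the query is being answered from the index alone.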
Query 2
WHERE firstName LIKE 'mar%'
ORDER BY lastName ASC, firstName ASC, middleName
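Providers_firstName
can likewise scan just the mar... part of its index, but it then has to reach into the data to get the rest of the columns.
None of the other optimizations (covering, etc.) apply. You could add INDEX(first, last, middle, id) to make it faster.
This query cannot fold in the LIMIT.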
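Written out against the column names from the question's SHOW CREATE TABLE, the suggested index might look like the sketch below. The index name is hypothetical, and id is left out of the column list because InnoDB already appends the primary key to every secondary index.

```sql
-- Hypothetical index name; columns taken from the question's table definition.
ALTER TABLE Providers
  ADD INDEX Providers_firstName_lastName_middleName (firstName, lastName, middleName);

-- Re-check the plan afterwards. The ORDER BY still starts with lastName,
-- so a sort of the matching rows is to be expected with this index.
EXPLAIN FORMAT=JSON
SELECT * FROM Providers
WHERE firstName LIKE 'mar%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50;
```

Whether the optimizer actually picks the new index depends on its cost estimates, so the EXPLAIN output is worth comparing before and after.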
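Notes
In the US, 10% of names start with the most common letter, 'S'. (The "10%" is roughly the same worldwide, but the most popular letter may differ.)
The optimizer has multiple ways to perform any query, and it picks the "best" one based on limited information. When it is clear that a range will be large (WHERE lastName LIKE 'S%'), it may switch from using the index to simply scanning and discarding lots of rows. I don't think that is happening here, but again, EXPLAIN will tell us.
More on creating optimal indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql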
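After the EXPLAINs
If I am reading the EXPLAINs correctly, they both use INDEX(last, first, middle), thereby avoiding the sort. Note also "using_filesort": false. This allows the query to stop after LIMIT 50 rows.
To gather more information, run this: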
FLUSH STATUS;
SELECT ...
SHOW SESSION STATUS LIKE 'Handler%';
If the Handler_write* values are 0, then there was no sort. Meanwhile, the sum of the Handler_read* values gives you the number of rows (probably in the INDEX) that were touched.
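For the first-name query from the question, the measurement could be run as follows. This is a sketch that narrows the final SHOW to the two counter groups discussed here. The counters are per session, so all statements need to run on the same connection.

```sql
FLUSH STATUS;  -- reset this session's status counters

SELECT * FROM Providers
WHERE firstName LIKE 'mar%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50;

-- Rows touched while reading (index and/or table rows)
SHOW SESSION STATUS LIKE 'Handler_read%';
-- Rows written, for example into an internal temporary table
SHOW SESSION STATUS LIKE 'Handler_write';
```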
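I expect Query 1 to show about 50 reads in total, since it can (in theory) drill down the index to smi and grab the next 50 (or fewer) rows. That should take a few milliseconds.
Query 2 is messier, because it needs to scan a lot of the index to find 50 rows with that first name. It won't be 7M, but it could be 50K rows. That could take seconds if the necessary part of the index is cached, or minutes if not.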
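The numbers posted in the question allow a rough sanity check of this kind of estimate. The arithmetic below is one plausible reading of the EXPLAIN output, not documented optimizer behavior: with "filtered": "5.27", the optimizer appears to expect a match roughly once every 19 index entries, so it budgets for about 50 / 0.0527 entries to satisfy LIMIT 50.

```sql
-- Figures taken from the question's EXPLAIN for the firstName query.
SELECT 50 / (5.27 / 100) AS est_index_entries_for_50_matches;
-- about 949, close to "rows_examined_per_scan": 948

-- The Handler counts for that query report Handler_read_next = 1487176.
SELECT 1487176 / 948 AS actual_vs_estimated_ratio;
-- the scan read over 1,500 times more entries than estimated
```

An estimate like this assumes matching rows are spread evenly through the index in lastName order, which need not hold for real name data.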
There is no way to make Q2 as fast as Q1. This index may be faster for mar% but slower for m%: INDEX(first, last, middle). That is, introducing such an index is risky.
In most cases, INDEX(a) is redundant if you also have INDEX(a,b). That is, you have 2 indexes that can be dropped.
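The answer does not name the two indexes. Reading the table definition, the likely candidates are Providers_lastName and Providers_lastName_firstName, since both are leftmost prefixes of Providers_lastName_firstName_middleName. A cautious sketch, assuming MySQL 8.0 (which the utf8mb4_0900_ai_ci collation suggests), is to hide them from the optimizer first and drop them only if nothing regresses:

```sql
-- Step 1: make the presumed-redundant indexes invisible to the optimizer.
ALTER TABLE Providers
  ALTER INDEX Providers_lastName INVISIBLE,
  ALTER INDEX Providers_lastName_firstName INVISIBLE;

-- Step 2: once the workload has been verified, drop them.
ALTER TABLE Providers
  DROP INDEX Providers_lastName,
  DROP INDEX Providers_lastName_firstName;
```

Fewer indexes also means less work on every INSERT, UPDATE, and DELETE against the table.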
Regarding "mysql - Slow MySQL query that uses an index", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/56729201/