mysql - Slow MySQL query using index

Tags: mysql indexing

I have a Providers table that looks like this:

| id  | lastName | firstName | middleName |
| --- | -------- | --------- | ---------- |

with the following indexes:

  • Providers_lastName
  • Providers_firstName
  • Providers_lastName_firstName
  • Providers_lastName_firstName_middleName

All my queries use a trailing wildcard in the lastName and firstName values:

SELECT * FROM Providers
WHERE lastName LIKE 'smi%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50;

SELECT * FROM Providers
WHERE firstName LIKE 'mar%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50;

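There are about 7 million rows in this table. My queries by last name are very fast. However, queries by first name are extremely slow. Am I doing something wrong here? What other indexes could I add to improve the performance of the first-name-only query without changing or removing the ORDER BY?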

Edit 1:

EXPLAIN output for the lastName query:

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "69901.30"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "Providers",
        "access_type": "range",
        "possible_keys": [
          "Providers_lastName",
          "Providers_lastName_firstName",
          "Providers_lastName_firstName_middleName"
        ],
        "key": "Providers_lastName_firstName_middleName",
        "used_key_parts": [
          "lastName"
        ],
        "key_length": "143",
        "rows_examined_per_scan": 59008,
        "rows_produced_per_join": 59008,
        "filtered": "100.00",
        "index_condition": "(`db_name`.`providers`.`lastName` like 'smi%')",
        "cost_info": {
          "read_cost": "64000.51",
          "eval_cost": "5900.80",
          "prefix_cost": "69901.31",
          "data_read_per_join": "158M"
        },
        "used_columns": [
          "id",
          "firstName",
          "middleName",
          "lastName",
          // OTHER COLUMNS
        ]
      }
    }
  }
}

EXPLAIN output for the firstName query:

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "390813.95"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "Providers",
        "access_type": "index",
        "possible_keys": [
          "Providers_firstName"
        ],
        "key": "Providers_lastName_firstName_middleName",
        "used_key_parts": [
          "lastName",
          "firstName",
          "middleName"
        ],
        "key_length": "309",
        "rows_examined_per_scan": 948,
        "rows_produced_per_join": 329914,
        "filtered": "5.27",
        "cost_info": {
          "read_cost": "357822.55",
          "eval_cost": "32991.40",
          "prefix_cost": "390813.95",
          "data_read_per_join": "883M"
        },
        "used_columns": [
          "id",
          "firstName",
          "middleName",
          "lastName",
          // OTHER COLUMNS
        ],
        "attached_condition": "(`db_name`.`providers`.`firstName` like 'mar%')"
      }
    }
  }
}

SHOW CREATE TABLE:

CREATE TABLE `Providers` (
  `id` varchar(10) NOT NULL,
  `firstName` varchar(20) DEFAULT NULL,
  `middleName` varchar(20) DEFAULT NULL,
  `lastName` varchar(35) DEFAULT NULL,
  /* Other columns */
  PRIMARY KEY (`id`),
  KEY `Providers_firstName` (`firstName`),
  KEY `Providers_lastName` (`lastName`),
  KEY `Providers_lastName_firstName` (`lastName`,`firstName`),
  KEY `Providers_lastName_firstName_middleName` (`lastName`,`firstName`,`middleName`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

Edit 2:

Output of running FLUSH STATUS before each query and SHOW SESSION STATUS LIKE 'Handler%' after it:

Query 1 (first name):

{
    "data":
    [
        {
            "Variable_name": "Handler_commit",
            "Value": "1"
        },
        {
            "Variable_name": "Handler_delete",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_discover",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_external_lock",
            "Value": "2"
        },
        {
            "Variable_name": "Handler_mrr_init",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_prepare",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_first",
            "Value": "1"
        },
        {
            "Variable_name": "Handler_read_key",
            "Value": "1"
        },
        {
            "Variable_name": "Handler_read_last",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_next",
            "Value": "1487176"
        },
        {
            "Variable_name": "Handler_read_prev",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_rnd",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_rnd_next",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_rollback",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_savepoint",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_savepoint_rollback",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_update",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_write",
            "Value": "0"
        }
    ]
}

Query 2 (last name):

{
    "data":
    [
        {
            "Variable_name": "Handler_commit",
            "Value": "1"
        },
        {
            "Variable_name": "Handler_delete",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_discover",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_external_lock",
            "Value": "2"
        },
        {
            "Variable_name": "Handler_mrr_init",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_prepare",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_first",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_key",
            "Value": "1"
        },
        {
            "Variable_name": "Handler_read_last",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_next",
            "Value": "49"
        },
        {
            "Variable_name": "Handler_read_prev",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_rnd",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_rnd_next",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_rollback",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_savepoint",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_savepoint_rollback",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_update",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_write",
            "Value": "0"
        }
    ]
}

Edit 3:

Using FORCE INDEX (Providers_firstName):

EXPLAIN output for the firstName query:

{
    "query_block": {
      "select_id": 1,
      "cost_info": {
        "query_cost": "389514.60"
      },
    "ordering_operation": {
        "using_filesort": true,
        "table": {
          "table_name": "Providers",
          "access_type": "range",
          "possible_keys": [
            "Providers_firstName"
          ],
          "key": "Providers_firstName",
          "used_key_parts": [
            "firstName"
          ],
          "key_length": "83",
          "rows_examined_per_scan": 329914,
          "rows_produced_per_join": 329914,
          "filtered": "100.00",
          "index_condition": "(`db_name`.`providers`.`firstName` like 'mar%')",
          "cost_info": {
            "read_cost": "356523.20",
            "eval_cost": "32991.40",
            "prefix_cost": "389514.60",
            "data_read_per_join": "883M"
          },
        "used_columns": [
            "id",
            "firstName",
            "middleName",
            "lastName",
            // Other columns
          ]
      }
    }
  }
}

Handler counts:

{
    "data":
    [
        {
            "Variable_name": "Handler_commit",
            "Value": "1"
        },
        {
            "Variable_name": "Handler_delete",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_discover",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_external_lock",
            "Value": "2"
        },
        {
            "Variable_name": "Handler_mrr_init",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_prepare",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_first",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_key",
            "Value": "51"
        },
        {
            "Variable_name": "Handler_read_last",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_next",
            "Value": "168497"
        },
        {
            "Variable_name": "Handler_read_prev",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_read_rnd",
            "Value": "50"
        },
        {
            "Variable_name": "Handler_read_rnd_next",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_rollback",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_savepoint",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_savepoint_rollback",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_update",
            "Value": "0"
        },
        {
            "Variable_name": "Handler_write",
            "Value": "0"
        }
    ]
}

Best answer

Query 1

WHERE lastName LIKE 'smi%'
ORDER BY lastName ASC, firstName ASC, middleName

will probably use this index (please provide EXPLAIN ...):

Providers_lastName_firstName_middleName

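It will work relatively efficiently because it can walk through the smi... part of the index.

I assume SELECT * fetches only 4 columns and that id is the PRIMARY KEY? And that Providers_lastName_firstName_middleName is INDEX(lastName, firstName, middleName), with id implicitly added at the end because the table is InnoDB?

If so, the entire query can be run in the index. EXPLAIN will confirm this by saying "Using index", which means "covering index".

Furthermore, this query touches only 50 rows: because the index is so well tuned for both the WHERE and the ORDER BY, it can actually fold in the LIMIT 50 too.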
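The covering-index claim can be checked directly. The sketch below assumes the application needs only the four columns shown in the question's table, so it lists them explicitly instead of using SELECT *. If the query must return other columns as well, the index no longer covers it and this check will not show "Using index".

```sql
-- Select only columns that are stored in the secondary index.
-- InnoDB appends the primary key (id) to every secondary index entry.
EXPLAIN
SELECT id, lastName, firstName, middleName
FROM Providers
WHERE lastName LIKE 'smi%'
ORDER BY lastName ASC, firstName ASC, middleName ASC
LIMIT 0, 50;
```

In the traditional EXPLAIN format, "Using index" in the Extra column indicates that the rows were read from the index alone, without visiting the table data.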

Query 2

WHERE firstName LIKE 'mar%'
ORDER BY lastName ASC, firstName ASC, middleName

Providers_firstName

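It can also walk through the mar... part of the index, but it then has to reach into the data to get the rest of the columns.

The rest of the optimizations (covering, etc.) do not apply. You could add INDEX(first, last, middle, id) to make it faster.

This query cannot fold in the LIMIT.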
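As a rough sketch, the suggested index could be created as follows. The index name Providers_firstName_lastName_middleName is made up for illustration, and id is left out of the column list on the assumption that InnoDB appends the primary key to secondary indexes anyway. Whether the optimizer chooses this index, and whether it actually helps, depends on the data, so it is worth comparing the EXPLAIN output before and after.

```sql
-- Hypothetical index name; the column order follows the answer's suggestion.
-- InnoDB appends the primary key (id) to each secondary index entry,
-- so id is not listed explicitly here.
ALTER TABLE Providers
  ADD INDEX Providers_firstName_lastName_middleName (firstName, lastName, middleName);

-- Re-check the plan for the first-name query after adding the index.
EXPLAIN FORMAT=JSON
SELECT * FROM Providers
WHERE firstName LIKE 'mar%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50;
```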

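Notes

In the US, 10% of names start with the most common letter, "S". (The "10%" figure is about the same worldwide, but the most popular letter may vary.)

The optimizer has multiple ways to perform any query, and it picks the "best" one based on limited information. When it is clear that a range is going to be large (WHERE lastName LIKE 'S%'), it may choose to switch from using the index to simply scanning the table and tossing out many rows. I don't think that is happening here, but again, EXPLAIN would tell us.

More on creating optimal indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql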

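After the EXPLAINs

If I read the EXPLAINs correctly, both queries are using INDEX(last, first, middle), thereby avoiding the sort. Note also "using_filesort": false. This allows the query to stop after LIMIT 50.

To gather more information, run the following: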

FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';

If the Handler_write* values are 0, then there was no sort. Meanwhile, the sum of the Handler_read* values gives you the number of rows (probably in the INDEX) that were touched.
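The sum of the Handler_read* counters can also be computed in SQL rather than by hand. This is a minimal sketch that assumes a MySQL version where session status is exposed through performance_schema.session_status (5.7 and later); the status query itself may add a few counts, so treat the total as approximate.

```sql
FLUSH STATUS;

SELECT * FROM Providers
WHERE firstName LIKE 'mar%'
ORDER BY lastName ASC, firstName ASC, middleName
LIMIT 0, 50;

-- Add up all Handler_read_* counters for the current session.
SELECT SUM(CAST(VARIABLE_VALUE AS UNSIGNED)) AS handler_reads
FROM performance_schema.session_status
WHERE VARIABLE_NAME LIKE 'Handler_read%';
```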

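I predict that Query 1 will show a total of 50 reads, because it can (in theory) drill down into the index at smi and grab the next 50 (or fewer) rows. This should take a few milliseconds.

Query 2 is more troublesome because it needs to scan a lot of the index to find 50 rows with that first name. It won't be 7M rows, but it might be 50K. This could take seconds if the necessary part of the index is cached, and minutes if it is not.

There is no way to make Q2 as fast as Q1. INDEX(first, last, middle) may be faster for mar% but slower for m%. That is, introducing such an index is risky.

In most situations, INDEX(a) is redundant if you also have INDEX(a,b). That is, you have 2 indexes that can be dropped.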
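Reading the table definition in the question, the two redundant indexes appear to be Providers_lastName and Providers_lastName_firstName, since both are leading prefixes of Providers_lastName_firstName_middleName. That reading is an assumption, not something the answer spells out. A sketch of dropping them:

```sql
-- Both indexes are leftmost prefixes of
-- Providers_lastName_firstName_middleName (lastName, firstName, middleName).
ALTER TABLE Providers
  DROP INDEX Providers_lastName,
  DROP INDEX Providers_lastName_firstName;
```

Before running this, it is worth checking that no query names either index in an index hint such as FORCE INDEX, because a hint that references a missing index makes the query fail.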

Regarding "mysql - Slow MySQL query using index", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56729201/
