mysql - Why would this query cause lock wait timeouts?

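Our team spent the last week debugging, trying to find the source of many MySQL lock wait timeouts and many extremely long-running queries. In the end, this query appears to be the culprit.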

mysql> explain 

SELECT categories.name AS cat_name, 
COUNT(distinct items.id) AS category_count 
FROM `items` 
INNER JOIN `categories` ON `categories`.`id` = `items`.`category_id` 
WHERE `items`.`state` IN ('listed', 'reserved') 
   AND (items.category_id IS NOT NULL) 
GROUP BY categories.name 
ORDER BY category_count DESC 
LIMIT 10\G

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: items
         type: range
possible_keys: index_items_on_category_id,index_items_on_state
          key: index_items_on_category_id
      key_len: 5
          ref: NULL
         rows: 119371
        Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: categories
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: production_db.items.category_id
         rows: 1
        Extra: 
2 rows in set (0.00 sec)

I can see that it is doing a nasty table scan and creating a temporary table to run.

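Why would this query cause the database's response times to increase by a factor of 10, and cause some queries that normally take 40-50 ms (updates on the items table) to occasionally spike to 50,000 ms or more?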

Best answer

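It's hard to tell without more information:

Is it running inside a transaction?
If so, what is the isolation level?
How many categories are there?
How many items are there?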

My guess is that the query is too slow and is running inside a transaction (which it probably is, since you have this problem). It is probably taking range locks on the items table, which block writes, so the updates stall until they can get their own lock on the table.

From what I can see in your query and execution plan, I have a few comments:

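1) Your items.state would probably be better as a catalog (lookup table) rather than a string stored in every row of items. That saves space, and comparing IDs is much faster than comparing strings (regardless of whatever optimizations the engine may do).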

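2) I'm guessing items.state is a low-cardinality column (few unique values), so the index on that column is probably hurting you more than helping you. Every index adds overhead when rows are inserted, deleted, or updated, because the index has to be maintained, and this particular one is probably not used enough to be worth it. Of course, I'm only guessing; it depends on the rest of your queries.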
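One rough way to check the low-cardinality guess is to measure selectivity, meaning distinct values divided by total rows. The sketch below is only an illustration: it uses Python's built-in sqlite3 with made-up data instead of the real MySQL database, the helper name column_selectivity is hypothetical, and the 'sold' and 'draft' states are assumptions that do not come from the question. The same COUNT(DISTINCT ...) / COUNT(*) idea should carry over to MySQL.

```python
import sqlite3


def column_selectivity(conn, table, column):
    """Return distinct-values / total-rows for a column (0.0 for an empty table)."""
    # Identifiers are interpolated directly, so only pass trusted names here.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total if total else 0.0


# Toy stand-in for the items table (assumed schema, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, category_id INTEGER, state TEXT)"
)
states = ["listed", "reserved", "sold", "draft"]  # 'sold'/'draft' are assumptions
conn.executemany(
    "INSERT INTO items (category_id, state) VALUES (?, ?)",
    [(i % 50, states[i % 4]) for i in range(1000)],
)

state_sel = column_selectivity(conn, "items", "state")
category_sel = column_selectivity(conn, "items", "category_id")
print(f"state: {state_sel:.3f}  category_id: {category_sel:.3f}")
```

A value close to 0 means each index entry matches a large share of the table, which is the situation where an index on that column alone tends to pay for itself least.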

SELECT
    -- Grouping by name means comparing strings.
    categories.name AS cat_name, 
    -- No need for DISTINCT: the same items.id cannot belong to different categories.
    COUNT(distinct items.id) AS category_count  
FROM `items` 
INNER JOIN `categories` ON `categories`.`id` = `items`.`category_id` 
WHERE `items`.`state` IN ('listed', 'reserved') 
   -- Not needed: the inner join already drops items with no category_id.
   AND (items.category_id IS NOT NULL) 
GROUP BY categories.name 
ORDER BY category_count DESC 
LIMIT 10\G

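The way this query is built, it basically has to scan the whole items table, because it uses the category_id index, then filters by the WHERE clause, then joins with the categories table, which means an index lookup on the primary key (categories.id) for every item row in the items result set. It then groups by name (using string comparisons) to count, and finally throws away everything except 10 results.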

I would write the query like this:

SELECT categories.name, counts.n
FROM (SELECT category_id, COUNT(id) n
      FROM items 
      WHERE state IN ('listed', 'reserved') AND category_id is not null
      GROUP BY category_id ORDER BY COUNT(id) DESC LIMIT 10) counts 
JOIN categories on counts.category_id = categories.id
ORDER BY counts.n desc

(Sorry if the syntax isn't perfect, I'm not running MySQL.)

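With this query, what the engine will probably do is:

Use the items.state index to fetch the 'listed' and 'reserved' items, group them by category_id (comparing numbers rather than strings), keep only the top 10 counts, and then join with categories to get the names (using only 10 index lookups).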
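Since the rewritten query was not run against MySQL, a quick sanity check is to run both versions against a small dataset and compare the results. The sketch below does that with Python's built-in sqlite3 as a stand-in. The schema and rows are made up, the 'sold' state is an assumption, and category names are assumed to be unique (if two categories shared a name, grouping by category_id and grouping by name would give different answers). It checks only that the results match, and says nothing about locking or speed on MySQL.

```python
import sqlite3

ORIGINAL = """
SELECT categories.name AS cat_name, COUNT(DISTINCT items.id) AS category_count
FROM items
INNER JOIN categories ON categories.id = items.category_id
WHERE items.state IN ('listed', 'reserved')
  AND items.category_id IS NOT NULL
GROUP BY categories.name
ORDER BY category_count DESC
LIMIT 10
"""

REWRITTEN = """
SELECT categories.name, counts.n
FROM (SELECT category_id, COUNT(id) AS n
      FROM items
      WHERE state IN ('listed', 'reserved') AND category_id IS NOT NULL
      GROUP BY category_id
      ORDER BY COUNT(id) DESC
      LIMIT 10) AS counts
JOIN categories ON counts.category_id = categories.id
ORDER BY counts.n DESC
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE items (id INTEGER PRIMARY KEY, category_id INTEGER, state TEXT);
""")

# 15 categories; category k gets k matching items, so no two counts tie.
conn.executemany(
    "INSERT INTO categories (id, name) VALUES (?, ?)",
    [(k, f"cat-{k}") for k in range(1, 16)],
)
rows = []
for k in range(1, 16):
    rows += [(k, "listed" if j % 2 else "reserved") for j in range(k)]
    rows.append((k, "sold"))    # excluded by the state filter
rows.append((None, "listed"))   # item without a category, excluded by the join
conn.executemany("INSERT INTO items (category_id, state) VALUES (?, ?)", rows)

original_rows = conn.execute(ORIGINAL).fetchall()
rewritten_rows = conn.execute(REWRITTEN).fetchall()
print("same result:", original_rows == rewritten_rows)
```

The counts are deliberately distinct because with ties the two queries may legitimately break them in different orders, which would make a plain equality check misleading.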

Regarding "mysql - Why would this query cause lock wait timeouts?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12612117/
