MySQL deadlock issue with InnoDB

Tags: mysql innodb deadlock

I have a central database server and several "worker" servers that concurrently execute a query like this:

UPDATE job_queue 
SET
  worker = '108.166.81.112',
  attempts = attempts + 1,
  started = '2014-01-14 10:34:03',
  token = '13eb3e6a8c3e1becb34051e08f19fd62'
WHERE completed = '0000-00-00 00:00:00'
  AND (started = '0000-00-00 00:00:00' OR started < '2014-01-14 10:29:03')
  AND attempts < 2
ORDER BY priority DESC, inserted
LIMIT 1

Sometimes my job_queue table gets locked up, and when I run "SHOW ENGINE INNODB STATUS" I get output like this:

------------------------
LATEST DETECTED DEADLOCK
------------------------
140114 10:34:15
*** (1) TRANSACTION:
TRANSACTION 0 46984514, ACTIVE 0 sec, process no 590, OS thread id 140366633146112 fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 20 lock struct(s), heap size 3024, 545 row lock(s)
MySQL thread id 677401, query id 19385205 10.179.103.110 root init
UPDATE job_queue SET worker='108.166.81.112', attempts=attempts+1, started='2014-01-14 10:34:03', token='13eb3e6a8c3e1becb34051e08f19fd62' WHERE completed='0000-00-00 00:00:00' AND (started='0000-00-00 00:00:00' OR started<'2014-01-14 10:29:03') AND attempts<2 ORDER BY priority DESC, inserted LIMIT 1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 245767 n bits 128 index `PRIMARY` of table `database`.`job_queue` trx id 0 46984514 lock_mode X waiting
Record lock, heap no 34 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 3; hex 800210; asc    ;; 1: len 6; hex 000002cced25; asc      %;; 2: len 7; hex 000003c00f1970; asc       p;; 3: len 30; hex 4f3a31343a2243425343616368654170704a6f62223a363a7b733a31393a; asc O:14:"CBSCacheAppJob":6:{s:19:;...(truncated); 4: len 1; hex 80; asc  ;; 5: len 8; hex 800012513c58bf24; asc    Q<X $;; 6: len 8; hex 800012513c58cc17; asc    Q<X  ;; 7: len 14; hex 31302e3137392e3130332e313333; asc 10.179.103.133;; 8: len 1; hex 81; asc  ;; 9: len 8; hex 800012513c58cc32; asc    Q<X 2;; 10: len 0; hex ; asc ;; 11: len 30; hex 353264393033616162656634346239626536306463346438666432303066; asc 52d903aabef44b9be60dc4d8fd200f;...(truncated);

*** (2) TRANSACTION:
TRANSACTION 0 46984485, ACTIVE 17 sec, process no 590, OS thread id 140366633547520 starting index read, thread declared inside InnoDB 500
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1216, 2 row lock(s), undo log entries 1
MySQL thread id 676723, query id 19385209 10.179.103.133 root init
UPDATE job_queue SET worker='10.179.103.133', attempts=attempts+1, started='2014-01-14 10:34:03', token='efd21d0d34f44badbc30386db4dd252e' WHERE completed='0000-00-00 00:00:00' AND (started='0000-00-00 00:00:00' OR started<'2014-01-14 10:29:03') AND attempts<2 ORDER BY priority DESC, inserted LIMIT 1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 245767 n bits 128 index `PRIMARY` of table `database`.`job_queue` trx id 0 46984485 lock_mode X locks rec but not gap
Record lock, heap no 34 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 3; hex 800210; asc    ;; 1: len 6; hex 000002cced25; asc      %;; 2: len 7; hex 000003c00f1970; asc       p;; 3: len 30; hex 4f3a31343a2243425343616368654170704a6f62223a363a7b733a31393a; asc O:14:"CBSCacheAppJob":6:{s:19:;...(truncated); 4: len 1; hex 80; asc  ;; 5: len 8; hex 800012513c58bf24; asc    Q<X $;; 6: len 8; hex 800012513c58cc17; asc    Q<X  ;; 7: len 14; hex 31302e3137392e3130332e313333; asc 10.179.103.133;; 8: len 1; hex 81; asc  ;; 9: len 8; hex 800012513c58cc32; asc    Q<X 2;; 10: len 0; hex ; asc ;; 11: len 30; hex 353264393033616162656634346239626536306463346438666432303066; asc 52d903aabef44b9be60dc4d8fd200f;...(truncated);

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 57 n bits 120 index `PRIMARY` of table `database`.`job_queue` trx id 0 46984485 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 3; hex 800001; asc    ;; 1: len 6; hex 000002ccdab1; asc       ;; 2: len 7; hex 000003c0352b3f; asc     5+?;; 3: len 30; hex 4f3a31323a224175746f50696c6f744a6f62223a363a7b733a31383a2200; asc O:12:"AutoPilotJob":6:{s:18:" ;...(truncated); 4: len 1; hex 82; asc  ;; 5: len 8; hex 800012513c58af57; asc    Q<X W;; 6: len 8; hex 800012513c58bf22; asc    Q<X ";; 7: len 14; hex 3130382e3136362e38312e313132; asc 108.166.81.112;; 8: len 1; hex 81; asc  ;; 9: len 8; hex 800012513c58bf23; asc    Q<X #;; 10: len 0; hex ; asc ;; 11: len 30; hex 616331376430346339326163613366323330646164323239363764336266; asc ac17d04c92aca3f230dad22967d3bf;...(truncated);

*** WE ROLL BACK TRANSACTION (1)
------------
TRANSACTIONS
------------
Trx id counter 0 46989905
Purge done for trx's n:o < 0 46986227 undo n:o < 0 0
History list length 24
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 0, not started, process no 590, OS thread id 140366628529920
MySQL thread id 703864, query id 20047015 localhost root
SHOW ENGINE INNODB STATUS
---TRANSACTION 0 46989894, not started, process no 590, OS thread id 140366636758784
MySQL thread id 702822, query id 20046897 10.179.1.63 root
---TRANSACTION 0 46986223, ACTIVE 39782 sec, process no 590, OS thread id 140366626322176
25 lock struct(s), heap size 3024, 710 row lock(s), undo log entries 9
MySQL thread id 677706, query id 19994505 10.179.103.114 root
Trx read view will not see trx with id >= 0 46986224, sees < 0 46986224

Any further writes to the table time out until I restart my MySQL server (or manually kill the deadlocked job):

PHP Fatal error:  Lock wait timeout exceeded; try restarting transaction(Query: "UPDATE job_queue SET worker='108.166.81.250', attempts=attempts+1, started='2014-01-14 21:27:45', token='369eae55a7f0eacad3b678a3410de8e4' WHERE completed='0000-00-00 00:00:00' AND (started='0000-00-00 00:00:00' OR started<'2014-01-14 21:22:45') AND attempts<2 ORDER BY priority DESC, inserted LIMIT 1") in /utilities/Database.php on line 53

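Can anyone explain to me why this query would cause a deadlock? I was under the impression that all queries on InnoDB tables happen atomically. Any ideas?

Best answer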

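This causes a deadlock because the UPDATE query locks every row it scans in the table, and depending on which index is used (or the lack of one), two different sessions may lock those rows in a slightly different order. Keep in mind that UPDATE, DELETE, and SELECT ... FOR UPDATE lock all rows they encounter, whether or not those rows match every condition in the WHERE clause. When using them, you should therefore make sure they encounter as few rows as possible, by using indexes (preferably the primary key) and by avoiding vague or widely selecting conditions.

My advice for work queues is pretty much universal: lock as little as possible, and always lock in a deterministic order. So, generally: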
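
To get a rough idea of how many rows such a statement visits, one option is to run EXPLAIN on a SELECT with the same WHERE, ORDER BY, and LIMIT. This is only a sketch: the plan chosen for the SELECT is an approximation of the UPDATE's plan, and the literals are copied from the question. On MySQL 5.6 and later, EXPLAIN can be applied to the UPDATE itself.

```sql
-- Approximate the rows the UPDATE would scan (and therefore lock).
-- Same predicate, ordering, and limit as the UPDATE in the question.
EXPLAIN
SELECT *
FROM job_queue
WHERE completed = '0000-00-00 00:00:00'
  AND (started = '0000-00-00 00:00:00' OR started < '2014-01-14 10:29:03')
  AND attempts < 2
ORDER BY priority DESC, inserted
LIMIT 1;
```

A `type` of `ALL` or a large `rows` estimate in the output suggests that the statement walks much of the table, which would be consistent with the 545 row locks held by transaction (1) in the status output above.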

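  1. Use a non-locking read (a regular SELECT) to find work to do, by looking for things your worker knows how to do and that are currently unclaimed (lease_owner IS NULL AND lease_expiry IS NULL, or similar).
  2. Choose one work item (or a few if you dare, but one is much simpler and usually gives perfectly acceptable performance).
  3. Update your work item (to claim it, although it needs updating anyway):
    1. Open a transaction.
    2. Lock your chosen work item with SELECT ... FOR UPDATE. If it is no longer unclaimed, abort and choose another.
    3. Update your chosen work item with your worker ID and lease expiry time.
    4. Commit your transaction immediately.
  4. Start working on your leased work item.
  5. In some other process, another poller looks for abandoned work and un-claims it (through the same update procedure as above).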
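
The claim procedure above could be sketched in SQL roughly as follows. The `lease_owner` and `lease_expiry` columns come from the answer, and `priority` and `inserted` from the question; the primary key name `id`, the value `42`, the worker name `'worker-1'`, and the 5-minute lease are assumptions made for illustration.

```sql
-- Steps 1-2: non-locking read to pick one candidate.
SELECT id
FROM job_queue
WHERE lease_owner IS NULL AND lease_expiry IS NULL
ORDER BY priority DESC, inserted
LIMIT 1;
-- Suppose this returns id = 42 (illustrative value).

-- Step 3: claim it, locking only that single row via the primary key.
START TRANSACTION;

SELECT lease_owner, lease_expiry
FROM job_queue
WHERE id = 42
FOR UPDATE;
-- If lease_owner is no longer NULL, another worker got there first:
-- issue ROLLBACK and go back to step 1 instead of continuing.

UPDATE job_queue
SET lease_owner  = 'worker-1',               -- hypothetical worker ID
    lease_expiry = NOW() + INTERVAL 5 MINUTE -- assumed lease length
WHERE id = 42;

COMMIT;
```

The check between the SELECT ... FOR UPDATE and the UPDATE happens in application code. Because the locking statements address one row by primary key, each transaction holds a single record lock for a very short time.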

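With this kind of design you can easily get very high throughput (thousands of jobs per second) with essentially no contention or ordering problems. Optimizations that pick work unlikely to collide with other pollers are simple and effective (for example, a modulus of the job ID or similar, chosen so that no job starves). The key thing to remember is that a conflict when choosing work is fine: just abort and try again, and everything proceeds quickly.

All locking writes to work queue items/jobs should touch only a single row and be done by primary key.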
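
As an illustration of the modulus idea, each poller could restrict its non-locking candidate search to its own share of job IDs. The numbers here are hypothetical: four pollers numbered 0 to 3, this one being poller 1, and a primary key column assumed to be named `id`.

```sql
-- Candidate search for poller 1 of 4 (hypothetical numbering).
-- Only a non-locking read is filtered; the claim itself is unchanged.
SELECT id
FROM job_queue
WHERE lease_owner IS NULL AND lease_expiry IS NULL
  AND id % 4 = 1
ORDER BY priority DESC, inserted
LIMIT 1;
```

With a fixed partition like this, jobs assigned to a poller that has stopped would wait indefinitely, so one possible safeguard is to fall back to the unfiltered query whenever a poller finds nothing in its own share.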
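
For example, marking a job as finished might look like the sketch below. The `completed` column is taken from the question and `lease_owner` from the answer; the key name `id` and the values `42` and `'worker-1'` are assumptions.

```sql
-- Single-row write addressed by primary key.
-- The lease_owner check guards against finishing a job whose lease
-- was taken over by another worker in the meantime.
UPDATE job_queue
SET completed = NOW()
WHERE id = 42
  AND lease_owner = 'worker-1';
```

If the statement reports zero affected rows, the worker no longer holds the lease. Unlike the UPDATE ... ORDER BY ... LIMIT 1 in the question, this statement can only ever lock the one row it names.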

Source: a similar question on Stack Overflow about this MySQL deadlock issue with InnoDB: https://stackoverflow.com/questions/21130232/
