这是场景:
我有一个名为 MarketDataCurrent (MDC) 的表,可以实时更新股票价格。
我有一个名为“LiveFeed”的进程,它从网络读取价格流,将插入排队,并使用“批量上传到临时表,然后插入/更新到 MDC 表”。 (批量上传)
我有另一个进程,然后读取这些数据,计算其他数据,然后使用类似的 BulkUpsert 存储过程将结果保存回同一个表。
第三,有大量用户运行 C# Gui 轮询 MDC 表并从中读取更新。
现在,在数据快速变化的那一天,事情运行得非常顺利,但是,在开市时间之后,我们最近开始看到数据库中出现越来越多的死锁异常,现在我们每天看到 10-20 个.这里要注意的重要一点是,这些发生在值没有改变时。
以下是所有相关信息:
表定义:
CREATE TABLE [dbo].[MarketDataCurrent](
[MDID] [int] NOT NULL,
[LastUpdate] [datetime] NOT NULL,
[Value] [float] NOT NULL,
[Source] [varchar](20) NULL,
CONSTRAINT [PK_MarketDataCurrent] PRIMARY KEY CLUSTERED
(
[MDID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
——
我有一个 Sql Profiler Trace Running,捕捉死锁,这里是所有图形的样子。
进程 258 被称为以下“BulkUpsert”存储过程,重复,而 73 正在调用下一个:
ALTER proc [dbo].[MarketDataCurrent_BulkUpload]
@updateTime datetime,
@source varchar(10)
as
begin transaction
update c with (rowlock) set LastUpdate = getdate(), Value = t.Value, Source = @source
from MarketDataCurrent c INNER JOIN #MDTUP t ON c.MDID = t.mdid
where c.lastUpdate < @updateTime
and c.mdid not in (select mdid from MarketData where LiveFeedTicker is not null and PriceSource like 'LiveFeed.%')
and c.value <> t.value
insert into MarketDataCurrent
with (rowlock)
select MDID, getdate(), Value, @source from #MDTUP
where mdid not in (select mdid from MarketDataCurrent with (nolock))
and mdid not in (select mdid from MarketData where LiveFeedTicker is not null and PriceSource like 'LiveFeed.%')
commit
而另一个:
ALTER PROCEDURE [dbo].[MarketDataCurrent_LiveFeedUpload]
AS
begin transaction
-- Update existing mdid
UPDATE c WITH (ROWLOCK) SET LastUpdate = t.LastUpdate, Value = t.Value, Source = t.Source
FROM MarketDataCurrent c INNER JOIN #TEMPTABLE2 t ON c.MDID = t.mdid;
-- Insert new MDID
INSERT INTO MarketDataCurrent with (ROWLOCK) SELECT * FROM #TEMPTABLE2
WHERE MDID NOT IN (SELECT MDID FROM MarketDataCurrent with (NOLOCK))
-- Clean up the temp table
DELETE #TEMPTABLE2
commit
澄清一下,这些临时表是由 C# 代码在同一连接上创建的,并使用 C# SqlBulkCopy 类填充。
对我来说,它看起来像是在表的 PK 上陷入僵局,所以我尝试删除该 PK 并切换到唯一约束,但这使死锁数量增加了 10 倍。
我完全不知道该怎么处理这种情况,并且对任何建议都持开放态度。
帮助!!
响应 XDL 的请求,这里是:
<deadlock-list>
<deadlock victim="processc19978">
<process-list>
<process id="processaf0b68" taskpriority="0" logused="0" waitresource="KEY: 6:72057594090487808 (d900ed5a6cc6)" waittime="718" ownerId="1102128174" transactionname="user_transaction" lasttranstarted="2010-06-11T16:30:44.750" XDES="0xffffffff817f9a40" lockMode="U" schedulerid="3" kpid="8228" status="suspended" spid="73" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-06-11T16:30:44.750" lastbatchcompleted="2010-06-11T16:30:44.750" clientapp=".Net SqlClient Data Provider" hostname="RISKAPPS_VM" hostpid="3836" loginname="RiskOpt" isolationlevel="read committed (2)" xactid="1102128174" currentdb="6" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
<executionStack>
<frame procname="MKP_RISKDB.dbo.MarketDataCurrent_BulkUpload" line="28" stmtstart="1062" stmtend="1720" sqlhandle="0x03000600a28e5e4ef4fd8e00849d00000100000000000000">
UPDATE c WITH (ROWLOCK) SET LastUpdate = getdate(), Value = t.Value, Source = @source
FROM MarketDataCurrent c INNER JOIN #MDTUP t ON c.MDID = t.mdid
WHERE c.lastUpdate < @updateTime
and c.mdid not in (select mdid from MarketData where BloombergTicker is not null and PriceSource like 'Blbg.%')
and c.value <> t.value </frame>
<frame procname="adhoc" line="1" stmtstart="88" sqlhandle="0x01000600c1653d0598706ca7000000000000000000000000">
exec MarketDataCurrent_BulkUpload @clearBefore, @source </frame>
<frame procname="unknown" line="1" sqlhandle="0x000000000000000000000000000000000000000000000000">
unknown </frame>
</executionStack>
<inputbuf>
(@clearBefore datetime,@source nvarchar(10))exec MarketDataCurrent_BulkUpload @clearBefore, @source </inputbuf>
</process>
<process id="processc19978" taskpriority="0" logused="0" waitresource="KEY: 6:72057594090487808 (74008e31572b)" waittime="718" ownerId="1102128228" transactionname="user_transaction" lasttranstarted="2010-06-11T16:30:44.780" XDES="0x380be9d8" lockMode="U" schedulerid="5" kpid="8464" status="suspended" spid="248" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-06-11T16:30:44.780" lastbatchcompleted="2010-06-11T16:30:44.780" clientapp=".Net SqlClient Data Provider" hostname="RISKBBG_VM" hostpid="4480" loginname="RiskOpt" isolationlevel="read committed (2)" xactid="1102128228" currentdb="6" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
<executionStack>
<frame procname="MKP_RISKDB.dbo.MarketDataCurrentBlbgRtUpload" line="14" stmtstart="840" stmtend="1220" sqlhandle="0x03000600005f9d24c8878f00849d00000100000000000000">
UPDATE c WITH (ROWLOCK) SET LastUpdate = t.LastUpdate, Value = t.Value, Source = t.Source
FROM MarketDataCurrent c INNER JOIN #TEMPTABLE2 t ON c.MDID = t.mdid;
-- Insert new MDID </frame>
<frame procname="adhoc" line="1" sqlhandle="0x010006004a58132228bf8d73000000000000000000000000">
MarketDataCurrentBlbgRtUpload </frame>
</executionStack>
<inputbuf>
MarketDataCurrentBlbgRtUpload </inputbuf>
</process>
</process-list>
<resource-list>
<keylock hobtid="72057594090487808" dbid="6" objectname="MKP_RISKDB.dbo.MarketDataCurrent" indexname="PK_MarketDataCurrent" id="lock5ba77b00" mode="U" associatedObjectId="72057594090487808">
<owner-list>
<owner id="processc19978" mode="U"/>
</owner-list>
<waiter-list>
<waiter id="processaf0b68" mode="U" requestType="wait"/>
</waiter-list>
</keylock>
<keylock hobtid="72057594090487808" dbid="6" objectname="MKP_RISKDB.dbo.MarketDataCurrent" indexname="PK_MarketDataCurrent" id="lock65dca340" mode="U" associatedObjectId="72057594090487808">
<owner-list>
<owner id="processaf0b68" mode="U"/>
</owner-list>
<waiter-list>
<waiter id="processc19978" mode="U" requestType="wait"/>
</waiter-list>
</keylock>
</resource-list>
</deadlock>
</deadlock-list>
最佳答案
死锁似乎是键访问顺序上的直接死锁。一个简单的解释是两个批量更新操作之间更新的键的重叠。
一个不太简单的解释是,在 SQL Server(以及其他服务器)中,锁定的键被散列,并且存在(非常重要的)散列冲突概率。这可以解释为什么您最近看到的死锁比以前更多:只是您的数据量增加了,因此碰撞概率增加了。如果这看起来深奥且不可能,请继续阅读 %%lockres%% collision probability magic marker: 16,777,215 ,以及由此链接的文章。概率非常高,对于完美的 key 分布,仅在大约 16M 次插入后就有 50% 的碰撞概率。对于正常的、真实的、关键的分布,只有几千个插入就有很大的碰撞概率。不幸的是,没有解决方法。如果这确实是问题,您唯一的解决方案是减少批次的大小(#temp 表的大小),以便降低碰撞概率。或者处理死锁并重试……无论如何您都必须这样做,但至少您可以处理更少的死锁。
关于sql-server-2005 - SQL Server 2005 中的死锁!两个实时批量更新正在战斗。为什么?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/3022907/