我有 3 个版本的查询,最终返回相同的结果。
在向相对较小的表和 where 子句中使用参数变量的情况下添加额外的内部连接时,其中一个会显着变慢。
快速查询和慢速查询(包含在每个查询下方)的执行计划非常不同。
我想了解为什么会发生这种情况以及如何预防。
此查询需要 < 1 秒。它没有额外的内部连接,但它在 where 子句中使用参数变量。
declare @start datetime = '20120115'
declare @end datetime = '20120116'
select distinct sups.campaignid
from tblSupporterMainDetails sups
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between @start and @end
|--Parallelism(Gather Streams)
|--Sort(DISTINCT ORDER BY:([sups].[campaignID] ASC))
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[campaignID]))
|--Hash Match(Partial Aggregate, HASH:([sups].[campaignID]))
|--Hash Match(Inner Join, HASH:([calls].[supporterID])=([sups].[supporterID]))
|--Bitmap(HASH:([calls].[supporterID]), DEFINE:([Bitmap1004]))
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([calls].[supporterID]))
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_callend_supporterid] AS [calls]), SEEK:([calls].[callEnd] >= '2012-01-15 00:00:00.000' AND [calls].[callEnd] <= '2012-01-16 00:00:00.000') ORDERED FORWARD)
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[supporterID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), WHERE:(PROBE([Bitmap1004],[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID],N'[IN ROW]')))
此查询需要 < 1 秒。它有一个额外的内部连接,但在 where 子句中使用参数常量。
select distinct camps.campaignid
from tblCampaigns camps
inner join tblSupporterMainDetails sups on camps.campaignid = sups.campaignid
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between '20120115' and '20120116'
|--Parallelism(Gather Streams)
|--Hash Match(Right Semi Join, HASH:([sups].[campaignID])=([camps].[campaignID]))
|--Bitmap(HASH:([sups].[campaignID]), DEFINE:([Bitmap1007]))
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[campaignID]))
| |--Hash Match(Partial Aggregate, HASH:([sups].[campaignID]))
| |--Hash Match(Inner Join, HASH:([calls].[supporterID])=([sups].[supporterID]))
| |--Bitmap(HASH:([calls].[supporterID]), DEFINE:([Bitmap1006]))
| | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([calls].[supporterID]))
| | |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_callend_supporterid] AS [calls]), SEEK:([calls].[callEnd] >= '2012-01-15 00:00:00.000' AND [calls].[callEnd] <= '2012-01-16 00:00:00.000') ORDERED FORWARD)
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[supporterID]))
| |--Index Scan(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), WHERE:(PROBE([Bitmap1006],[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID],N'[IN ROW]')))
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]), WHERE:(PROBE([Bitmap1007],[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID],N'[IN ROW]')))
此查询需要 2 分钟。它有一个额外的内部连接并且它在 where 子句中使用参数变量。
declare @start datetime = '20120115'
declare @end datetime = '20120116'
select distinct camps.campaignid
from tblCampaigns camps
inner join tblSupporterMainDetails sups on camps.campaignid = sups.campaignid
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between @start and @end
|--Nested Loops(Inner Join, OUTER REFERENCES:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]))
|--Top(TOP EXPRESSION:((1)))
|--Nested Loops(Inner Join, OUTER REFERENCES:([calls].[callID], [Expr1007]) OPTIMIZED WITH UNORDERED PREFETCH)
|--Nested Loops(Inner Join, OUTER REFERENCES:([sups].[supporterID], [Expr1006]) WITH UNORDERED PREFETCH)
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), SEEK:([sups].[campaignID]=[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID]) ORDERED FORWARD)
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_supporterID_closingCall] AS [calls]), SEEK:([calls].[supporterID]=[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID]) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[AUTOGEN_PK_tblCallLogs] AS [calls]), SEEK:([calls].[callID]=[GOGEN].[dbo].[tblCallLogs].[callID] as [calls].[callID]), WHERE:([GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]>=[@s2] AND [GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]<=[@e2]) LOOKUP ORDERED FORWARD)
笔记:
tblCallLogs
上的聚集索引搜索引起的。 ,但是不知道为什么SQL Server会选择这个执行计划。 tblCampaigns
上重建了索引并更新了统计信息和 tblSupporterMainDetails
.这没有效果。 campaignid
被索引。 表中的记录数:
更新:
tblcalllogs
上重建了索引并更新了统计信息。 .没有效果。 DBCC FREEPROCCACHE
清除了执行计划缓存涉及列的模式:
tblCampaign.campaignid int not null
tblSupporterMainDetails.campaignid int not null
tblSupporterMainDetails.supporterid int not null
tblCallLogs.supporterid int not null
tblCallLogs.callEnd datetime not null
索引:
更新 2:
向 tblCallLogs.supporterId 添加索引后 - 包含列:callEnd
“慢”查询加快了 40 秒。
更新后的执行计划:
|--Nested Loops(Inner Join, OUTER REFERENCES:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]))
|--Top(TOP EXPRESSION:((1)))
|--Nested Loops(Inner Join, OUTER REFERENCES:([sups].[supporterID], [Expr1006]) WITH UNORDERED PREFETCH)
|--Index Seek(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), SEEK:([sups].[campaignID]=[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID]) ORDERED FORWARD)
|--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_supporterid_callend] AS [calls]), SEEK:([calls].[supporterID]=[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID]), WHERE:([GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]>=[@s2] AND [GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]<=[@e2]) ORDERED FORWARD)
解决方案:
额外的连接实际上并没有直接导致问题,但是它显然改变了语句,以便 sql server 为其持有不同的执行计划。
通过添加
选项(重新编译)
到慢语句的结尾,我能够获得预期的快速性能。即 < 1 秒。我仍然不确定这个解决方案是否有效 - 为什么没有刷新所有计划?这是参数嗅探的经典案例吗?当我知道确切答案时,我会更新这篇文章 - 或者直到有人可以给出明确的答案。
感谢@LievenKeersmaekers 和@JNK 到目前为止的帮助...
最佳答案
导致解决方案的摘要:
在 supporterid, callEnd
上添加覆盖索引.
这里的假设是优化器 可以 使用此索引(与 callEnd、supporterid 相反)来
tblSupporterMainDetails
和 tblCallLogs
where
中进一步使用它选择条款 callEnd
添加选项
OPTION(RECOMPILE)
TiborK 和 Hunchback 的所有 cudo 用于解释使用硬编码常量或变量的优化器的差异。
Performance Impact - Constant value -vs- Variable
When you use the constant, the value is known to the optimizer so it can determine selectivity (and possible index usage) based on that. When you use a variable, the value is unknown to the optimizer (so it have to go by some hardwired value or possibly density info). So, technically, this isn't parameter sniffing, but whatever article you find on that subject should also explain the difference between a constant and a variable. Using OPTION(RECOMPILE) will actually turn the variabe to a parameter sniffing situation.
In essence, there is a big difference between a constant, a variable and a paramater (whcih can be sniffed).
关于performance - 参数变量导致查询慢,但为什么呢?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/15682621/