sql-server - SQL Server 2005 T-SQL 问题 : Can you trust the Query Optimizer ? 我知道我不能!

标签 sql-server sql-server-2005 query-optimization

此问题链接至my previous one (作为匿名用户发布 - 现在我有一个帐户),在开始之前,我想感谢 Rob Farley 提供了正确的索引架构。

但问题不在于索引架构。

这是查询优化器!

查询:

SELECT s.ID_i
     , s.ShortName_v
     , sp.Path_v
     , ( SELECT TOP 1 1         -- has also user access on subsites ?
           FROM SitePath_T usp
              , UserSiteRight_t usr
          WHERE usr.SiteID_i = usp.SiteID_i
            AND usp.Path_v LIKE sp.Path_v + '%_'
            AND usr.UserID_i = 1 )
  FROM Site_T s
     , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
   AND EXISTS ( SELECT *
                  FROM SitePath_T usp
                     , UserSiteRight_t usr
                 WHERE usr.SiteID_i = usp.SiteID_i
                   AND usp.Path_v LIKE sp.Path_v + '%'
                   AND usr.UserID_i = 1 )

...运行于:

CPU   Reads  Writes Duration
2073  49572  0      2241      -- more than 2 sec

执行计划:

  |--Compute Scalar(DEFINE:([Expr1014]=[Expr1014]))
       |--Nested Loops(Left Outer Join, OUTER REFERENCES:([sp].[Path_v]))
            |--Nested Loops(Left Semi Join, OUTER REFERENCES:([Expr1016], [Expr1017], [Expr1018], [Expr1019]))
            |    |--Merge Join(Inner Join, MERGE:([sp].[SiteID_i])=([s].[ID_i]), RESIDUAL:([dbo].[SitePath_T].[SiteID_i] as [sp].[SiteID_i]=[dbo].[Site_T].[ID_i] as [s].[ID_i]))
            |    |    |--Compute Scalar(DEFINE:([Expr1016]=[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1017]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1018]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1019]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
            |    |    |    |--Index Scan(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_SiteID_<Path>] AS [sp]), ORDERED FORWARD)
            |    |    |--Sort(ORDER BY:([s].[ID_i] ASC))
            |    |         |--Clustered Index Seek(OBJECT:([dbo].[Site_T].[IDXC_Site_ParentID+ShortName+ID] AS [s]), SEEK:([s].[ParentID_i]=(1) AND [s].[ShortName_v] >= '9þþþþþ' AND [s].[ShortName_v] < 'Z'),  WHERE:([dbo].[Site_T].[ShortName_v] as [s].[ShortName_v] like '[a-y]%') ORDERED FORWARD)
            |    |--Nested Loops(Inner Join, OUTER REFERENCES:([usp].[SiteID_i], [Expr1020]) WITH UNORDERED PREFETCH)
            |         |--Clustered Index Scan(OBJECT:([dbo].[SitePath_T].[IDXC_SitePath_Path+SiteID] AS [usp]), WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1016]))
            |         |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=(1) AND [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]) ORDERED FORWARD)
            |--Compute Scalar(DEFINE:([Expr1014]=(1)))
                 |--Top(TOP EXPRESSION:((1)))
                      |--Nested Loops(Inner Join, OUTER REFERENCES:([usp].[SiteID_i], [Expr1021]) WITH UNORDERED PREFETCH)
                           |--Clustered Index Scan(OBJECT:([dbo].[SitePath_T].[IDXC_SitePath_Path+SiteID] AS [usp]), WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'))
                           |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=(1) AND [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]) ORDERED FORWARD)

但是如果我强制执行索引,则以下查询:

SELECT s.ID_i
     , s.ShortName_v
     , sp.Path_v
     , ( SELECT TOP 1 1        -- has also user access on subsites ?
           FROM SitePath_T usp WITH ( INDEX ( [IDX_SitePath_Path+SiteID] ) )
                               -- same performance when using WITH ( INDEX ( [IDX_SitePath_Path_INC<SiteID>] ) )
              , UserSiteRight_t usr WITH ( INDEX ( [IDX_UserSiteRight_UserID+SiteID] ) )
          WHERE usr.SiteID_i = usp.SiteID_i
            AND usp.Path_v LIKE sp.Path_v + '%_'
            AND usr.UserID_i = 1)
  FROM Site_T s
     , SitePath_T sp WITH ( INDEX ( [IDX_SitePath_SiteID+Path] ) )
                     -- same performance when using WITH ( INDEX ( [IDX_SitePath_SiteID_INC<Path>] ) )
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
   AND EXISTS ( SELECT *
                  FROM SitePath_T usp WITH ( INDEX ( [IDX_SitePath_Path+SiteID] ) ) 
                                      -- same performance when using WITH ( INDEX ( [IDX_SitePath_Path_INC<SiteID>] ) )
                     , UserSiteRight_t usr WITH ( INDEX ( [IDX_UserSiteRight_UserID+SiteID] ) )
                 WHERE usr.SiteID_i = usp.SiteID_i
                   AND usp.Path_v LIKE sp.Path_v + '%'
                   AND usr.UserID_i = 1 )

将运行于:

CPU  Reads  Writes  Duration
50   11237  0       55

持续时间将降至 55 毫秒(从超过 2 秒)!!!!

我对这个结果很满意!

执行计划:

  |--Compute Scalar(DEFINE:([Expr1014]=[Expr1014]))
       |--Nested Loops(Left Outer Join, OUTER REFERENCES:([sp].[Path_v]))
            |--Nested Loops(Left Semi Join, OUTER REFERENCES:([Expr1016], [Expr1017], [Expr1018], [Expr1019]))
            |    |--Merge Join(Inner Join, MERGE:([sp].[SiteID_i])=([s].[ID_i]), RESIDUAL:([dbo].[SitePath_T].[SiteID_i] as [sp].[SiteID_i]=[dbo].[Site_T].[ID_i] as [s].[ID_i]))
            |    |    |--Compute Scalar(DEFINE:([Expr1016]=[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1017]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1018]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1019]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
            |    |    |    |--Index Scan(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_SiteID_<Path>] AS [sp]), ORDERED FORWARD)
            |    |    |--Sort(ORDER BY:([s].[ID_i] ASC))
            |    |         |--Clustered Index Seek(OBJECT:([dbo].[Site_T].[IDXC_Site_ParentID+ShortName+ID] AS [s]), SEEK:([s].[ParentID_i]=(1) AND [s].[ShortName_v] >= '9þþþþþ' AND [s].[ShortName_v] < 'Z'),  WHERE:([dbo].[Site_T].[ShortName_v] as [s].[ShortName_v] like '[a-y]%') ORDERED FORWARD)
            |    |--Nested Loops(Inner Join, OUTER REFERENCES:([usp].[SiteID_i], [Expr1023]) WITH UNORDERED PREFETCH)
            |         |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1017], [Expr1018], [Expr1019]))
            |         |    |--Compute Scalar(DEFINE:([Expr1017]=[Expr1017], [Expr1018]=[Expr1018], [Expr1019]=[Expr1019]))
            |         |    |    |--Constant Scan
            |         |    |--Index Seek(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_Path+SiteID] AS [usp]), SEEK:([usp].[Path_v] > [Expr1017] AND [usp].[Path_v] < [Expr1018]),  WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1016]) ORDERED FORWARD)
            |         |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=(1) AND [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]) ORDERED FORWARD)
            |--Compute Scalar(DEFINE:([Expr1014]=(1)))
                 |--Top(TOP EXPRESSION:((1)))
                      |--Nested Loops(Inner Join, OUTER REFERENCES:([usp].[SiteID_i], [Expr1027]) WITH UNORDERED PREFETCH)
                           |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1024], [Expr1025], [Expr1026]))
                           |    |--Compute Scalar(DEFINE:([Expr1024]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'), [Expr1025]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'), [Expr1026]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_')))
                           |    |    |--Constant Scan
                           |    |--Index Seek(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_Path+SiteID] AS [usp]), SEEK:([usp].[Path_v] > [Expr1024] AND [usp].[Path_v] < [Expr1025]),  WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_') ORDERED FORWARD)
                           |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=(1) AND [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]) ORDERED FORWARD)

下一步是为不同的用户运行它,因此我将 UserID_i 声明为变量:

DECLARE @UserID_i INT 
SELECT @UserID_i = 1

但是现在下面的查询变得非常慢!!!

SELECT s.ID_i
  , s.ShortName_v
  , sp.Path_v
  , ( SELECT TOP 1 1        -- has also user access on subsites ?
        FROM SitePath_T usp WITH ( INDEX ( [IDX_SitePath_Path+SiteID] ) ) 
           , UserSiteRight_t usr WITH ( INDEX ( [IDX_UserSiteRight_UserID+SiteID] ) )
       WHERE usr.SiteID_i = usp.SiteID_i
         AND usp.Path_v LIKE sp.Path_v + '%_'
         AND usr.UserID_i = @UserID_i)
  FROM Site_T s
     , SitePath_T sp WITH ( INDEX ( [IDX_SitePath_SiteID+Path] ) )
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
   AND EXISTS ( SELECT *
                  FROM SitePath_T usp WITH ( INDEX ( [IDX_SitePath_Path+SiteID] ) ) 
                     , UserSiteRight_t usr WITH ( INDEX ( [IDX_UserSiteRight_UserID+SiteID] ) )
                 WHERE usr.SiteID_i = usp.SiteID_i
                   AND usp.Path_v LIKE sp.Path_v + '%'
                   AND usr.UserID_i = @UserID_i )

持续时间现在超过 7 秒!!!

CPU     Reads   Writes  Duration
7421    149984  35      7625

以及执行计划:

  |--Compute Scalar(DEFINE:([Expr1014]=[Expr1014]))
       |--Nested Loops(Left Outer Join, OUTER REFERENCES:([sp].[Path_v]))
            |--Nested Loops(Left Semi Join, WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1016]))
            |    |--Merge Join(Inner Join, MERGE:([sp].[SiteID_i])=([s].[ID_i]), RESIDUAL:([dbo].[SitePath_T].[SiteID_i] as [sp].[SiteID_i]=[dbo].[Site_T].[ID_i] as [s].[ID_i]))
            |    |    |--Compute Scalar(DEFINE:([Expr1016]=[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1017]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1018]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1019]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
            |    |    |    |--Index Scan(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_SiteID+Path] AS [sp]), ORDERED FORWARD)
            |    |    |--Sort(ORDER BY:([s].[ID_i] ASC))
            |    |         |--Clustered Index Seek(OBJECT:([dbo].[Site_T].[IDXC_Site_ParentID+ShortName+ID] AS [s]), SEEK:([s].[ParentID_i]=(1) AND [s].[ShortName_v] >= '9þþþþþ' AND [s].[ShortName_v] < 'Z'),  WHERE:([dbo].[Site_T].[ShortName_v] as [s].[ShortName_v] like '[a-y]%') ORDERED FORWARD)
            |    |--Table Spool
            |         |--Hash Match(Inner Join, HASH:([usr].[SiteID_i])=([usp].[SiteID_i]))
            |              |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=[@UserID_i]) ORDERED FORWARD)
            |              |--Index Scan(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_Path+SiteID] AS [usp]))
            |--Compute Scalar(DEFINE:([Expr1014]=(1)))
                 |--Top(TOP EXPRESSION:((1)))
                      |--Nested Loops(Inner Join, WHERE:([dbo].[UserSiteRight_T].[SiteID_i] as [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]))
                           |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1020], [Expr1021], [Expr1022]))
                           |    |--Compute Scalar(DEFINE:([Expr1020]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'), [Expr1021]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'), [Expr1022]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_')))
                           |    |    |--Constant Scan
                           |    |--Index Seek(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_Path+SiteID] AS [usp]), SEEK:([usp].[Path_v] > [Expr1020] AND [usp].[Path_v] < [Expr1021]),  WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_') ORDERED FORWARD)
                           |--Table Spool
                                |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=[@UserID_i]) ORDERED FORWARD)

当我使用变量而不是硬编码 UserID_i 值时,执行计划完全改变!

为什么查询优化器会有这样的行为?

如何强制执行计划与第二个快速查询相同?

谢谢。

<小时/>

更新 1

<小时/>

已删除(不相关)

<小时/>

更新2

<小时/>

看来这个问题不止我一个。

请检查以下主题:
Why does the SqlServer optimizer get so confused with parameters?
Known issue?: SQL Server 2005 stored procedure fails to complete with a parameter

<小时/>

更新3

<小时/>

来自 SQL Server 查询优化团队的一篇精彩文章,介绍了参数嗅探:I Smell a Parameter !

最佳答案

当您使用变量(在第三个查询中)时,是否有原因无法使用索引提示(如在第二个查询中)?奇怪的是,当有可用索引时,查询优化器会做出如此错误的决定,但它只了解有限的数据,并且会尽可能地进行选择。

实际上,有关索引列的一些统计信息可能会对您有所帮助 - 它们跟踪数据、数据布局以及有关表实际包含内容的一些其他信息,而索引本身仅构建在表之上元数据,并且查询优化器不会选择数据本身(除非有统计信息来帮助它这样做)。

您是否对查询运行了“数据库优化顾问”?突出显示查询并从 SSMS 的“查询”菜单中选择“在数据库引擎优化顾问中分析查询”将使用表数据为您建议一些统计信息 - 这可能会产生巨大的差异。

关于sql-server - SQL Server 2005 T-SQL 问题 : Can you trust the Query Optimizer ? 我知道我不能!,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/2040314/

相关文章:

sql - 这个SQL有什么问题吗?

c# - 在 SQL Server 2005 IMAGE 列中存储 20 Meg 文件的最有效方法

sql - 表上的索引策略

sql - Oracle SQL 何时使用完整提示

SQl 选择特定列但使用 *

sql - SQL如果可能,将值转换为int,如果不可能,则将值设置为0

sql-server-2005 - 使用派生列为每行创建格式为 YYYY-MM-01 00 :00:00. 000 的日期值

SQL 如果没有返回行则执行此操作

mysql - 笛卡尔积过多,导致查询运行速度变慢

php - SQL 根据表 B 中的条件返回表 A 的结果