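c# - Query generated by EF takes too much time to execute

I have a very simple query that is generated by Entity Framework. Sometimes when I run this query it takes more than 30 seconds to execute, and I get a timeout exception.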

SELECT TOP (10) 
[Extent1].[LinkID] AS [LinkID], 
[Extent1].[Title] AS [Title], 
[Extent1].[Url] AS [Url], 
[Extent1].[Description] AS [Description], 
[Extent1].[SentDate] AS [SentDate], 
[Extent1].[VisitCount] AS [VisitCount], 
[Extent1].[RssSourceId] AS [RssSourceId], 
[Extent1].[ReviewStatus] AS [ReviewStatus], 
[Extent1].[UserAccountId] AS [UserAccountId], 
[Extent1].[CreationDate] AS [CreationDate]
FROM ( SELECT [Extent1].[LinkID] AS [LinkID], [Extent1].[Title] AS [Title], [Extent1].[Url] AS [Url], [Extent1].[Description] AS [Description], [Extent1].[SentDate] AS [SentDate], [Extent1].[VisitCount] AS [VisitCount], [Extent1].[RssSourceId] AS [RssSourceId], [Extent1].[ReviewStatus] AS [ReviewStatus], [Extent1].[UserAccountId] AS [UserAccountId], [Extent1].[CreationDate] AS [CreationDate], row_number() OVER (ORDER BY [Extent1].[SentDate] DESC) AS [row_number]
    FROM [dbo].[Links] AS [Extent1]
)  AS [Extent1]
WHERE [Extent1].[row_number] > 0
ORDER BY [Extent1].[SentDate] DESC

The code that generates the query is:

public async Task<IQueryable<TEntity>> GetAsync(Expression<Func<TEntity, bool>> filter = null,
    Func<IQueryable<TEntity>, IOrderedQueryable<TEntity>> orderBy = null)
{
    return await Task.Run(() =>
    {
        IQueryable<TEntity> query = _dbSet;
        if (filter != null)
        {
            query = query.Where(filter);
        }

        if (orderBy != null)
        {
            query = orderBy(query);
        }

        return query;
    });
}

Note that when I remove the inner SELECT statement and the WHERE clause and change the query to the following, it executes in under a second.

SELECT TOP (10) 
[Extent1].[LinkID] AS [LinkID], 
[Extent1].[Title] AS [Title], 
.
.
.
FROM [dbo].[Links] AS [Extent1]
ORDER BY [Extent1].[SentDate] DESC

Any advice would be helpful.

Update:

Here is how the code above is used:

var dbLinks = await _uow.LinkRespository.GetAsync(filter, orderBy);
var pagedLinks = new PagedList<Link>(dbLinks, pageNumber, PAGE_SIZE);
var vmLinks = Mapper.Map<IPagedList<LinkViewItemViewModel>>(pagedLinks);

And the filter:

var result = await GetLinks(null, pageNo, a => a.OrderByDescending(x => x.SentDate));

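Best answer

It never crossed my mind that you simply had no index at all. Lesson learned: always check the basics before digging further.

If you don't need pagination, the query can be simplified to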

SELECT TOP (10) 
    [Extent1].[LinkID] AS [LinkID], 
    [Extent1].[Title] AS [Title], 
    ...
FROM [dbo].[Links] AS [Extent1]
ORDER BY [Extent1].[SentDate] DESC

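which, as you have verified, runs fast.

Apparently you do need pagination, so let's see what we can do.

I first assumed that your current version is slow because it scans the whole table, calculates the row number for each row, and only then returns 10 rows. I was wrong here. The SQL Server optimizer is pretty smart. The root of your problem is somewhere else. See my update below.

By the way, as others have mentioned, this pagination works correctly only if the SentDate column is unique. If it is not unique, you need to ORDER BY SentDate plus another unique column, such as an ID, to resolve the ambiguity.

If you don't need the ability to jump straight to a particular page, but always start at page 1 and then go to the next page, the next one, and so on, the proper and efficient way to do this kind of pagination is described in this excellent article: http://use-the-index-luke.com/blog/2013-07/pagination-done-the-postgresql-way
The author uses PostgreSQL for illustration, but the technique works for MS SQL Server as well. It boils down to remembering the ID of the last row of the displayed page and then using that ID in the WHERE clause, with an appropriate supporting index, to retrieve the next page without scanning all the previous rows.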
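A minimal sketch of that "remember the last row" technique in T-SQL, under a few assumptions that are not in the question: SentDate is a datetime, LinkID is a unique int used as the tie-breaker, and the variables @LastSentDate and @LastLinkID (hypothetical names, with made-up sample values) hold the values from the last row of the page shown previously.

```sql
-- Assumed types and sample values; in practice these come from the client.
DECLARE @PageSize int = 10;
DECLARE @LastSentDate datetime = '2015-03-01T12:00:00'; -- SentDate of the last row already shown
DECLARE @LastLinkID int = 12345;                        -- LinkID of the last row already shown

-- Fetch the next page: only rows that sort strictly after the remembered row.
SELECT TOP (@PageSize)
    L.LinkID
    ,L.Title
    ,L.Url
    ,L.SentDate
FROM dbo.Links AS L
WHERE L.SentDate < @LastSentDate
   OR (L.SentDate = @LastSentDate AND L.LinkID < @LastLinkID)
ORDER BY L.SentDate DESC, L.LinkID DESC;
```

The first page is the same query without the WHERE clause. Each later page costs about the same as the first one, but there is no way to jump to an arbitrary page number.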

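SQL Server 2008 has no built-in support for pagination, so we have to use a workaround. I'll show a variant that allows jumping straight to a given page. It works fast for the first pages but gets slower and slower for later pages.

You will have these variables (PageSize, PageNumber) in your C# code. I put them here to illustrate the point.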

DECLARE @VarPageSize int = 10; -- number of rows in each page
DECLARE @VarPageNumber int = 3; -- page numeration is zero-based

SELECT TOP (@VarPageSize)
    [Extent1].[LinkID] AS [LinkID]
    ,[Extent1].[Title] AS [Title]
    ,[Extent1].[Url] AS [Url]
    ,[Extent1].[Description] AS [Description]
    ,[Extent1].[SentDate] AS [SentDate]
    ,[Extent1].[VisitCount] AS [VisitCount]
    ,[Extent1].[RssSourceId] AS [RssSourceId]
    ,[Extent1].[ReviewStatus] AS [ReviewStatus]
    ,[Extent1].[UserAccountId] AS [UserAccountId]
    ,[Extent1].[CreationDate] AS [CreationDate]
FROM
    (
        SELECT TOP((@VarPageNumber + 1) * @VarPageSize)
            [Extent1].[LinkID] AS [LinkID]
            ,[Extent1].[Title] AS [Title]
            ,[Extent1].[Url] AS [Url]
            ,[Extent1].[Description] AS [Description]
            ,[Extent1].[SentDate] AS [SentDate]
            ,[Extent1].[VisitCount] AS [VisitCount]
            ,[Extent1].[RssSourceId] AS [RssSourceId]
            ,[Extent1].[ReviewStatus] AS [ReviewStatus]
            ,[Extent1].[UserAccountId] AS [UserAccountId]
            ,[Extent1].[CreationDate] AS [CreationDate]
        FROM [dbo].[Links] AS [Extent1]
        ORDER BY [Extent1].[SentDate] DESC
    ) AS [Extent1]
ORDER BY [Extent1].[SentDate] ASC
;

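The first page is rows 1 to 10, the second page is rows 11 to 20, and so on.
Let's see how this query works when we try to get the fourth page, i.e. rows 31 to 40, with PageSize=10 and PageNumber=3. In the inner query we select the first 40 rows. Note that we do not scan the whole table here; we scan only the first 40 rows. We don't even need an explicit ROW_NUMBER(). Then we need to pick the last 10 rows out of those 40, so the outer query selects TOP(10) with ORDER BY in the opposite direction. As written, this returns rows 40 to 31 in reverse order. You can sort them back into the correct order on the client, or add one more outer query that simply sorts them again by SentDate DESC, like this: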

SELECT
    [Extent1].[LinkID] AS [LinkID]
    ,[Extent1].[Title] AS [Title]
    ,[Extent1].[Url] AS [Url]
    ,[Extent1].[Description] AS [Description]
    ,[Extent1].[SentDate] AS [SentDate]
    ,[Extent1].[VisitCount] AS [VisitCount]
    ,[Extent1].[RssSourceId] AS [RssSourceId]
    ,[Extent1].[ReviewStatus] AS [ReviewStatus]
    ,[Extent1].[UserAccountId] AS [UserAccountId]
    ,[Extent1].[CreationDate] AS [CreationDate]
FROM
    (
        SELECT TOP (@VarPageSize)
            [Extent1].[LinkID] AS [LinkID]
            ,[Extent1].[Title] AS [Title]
            ,[Extent1].[Url] AS [Url]
            ,[Extent1].[Description] AS [Description]
            ,[Extent1].[SentDate] AS [SentDate]
            ,[Extent1].[VisitCount] AS [VisitCount]
            ,[Extent1].[RssSourceId] AS [RssSourceId]
            ,[Extent1].[ReviewStatus] AS [ReviewStatus]
            ,[Extent1].[UserAccountId] AS [UserAccountId]
            ,[Extent1].[CreationDate] AS [CreationDate]
        FROM
            (
                SELECT TOP((@VarPageNumber + 1) * @VarPageSize)
                    [Extent1].[LinkID] AS [LinkID]
                    ,[Extent1].[Title] AS [Title]
                    ,[Extent1].[Url] AS [Url]
                    ,[Extent1].[Description] AS [Description]
                    ,[Extent1].[SentDate] AS [SentDate]
                    ,[Extent1].[VisitCount] AS [VisitCount]
                    ,[Extent1].[RssSourceId] AS [RssSourceId]
                    ,[Extent1].[ReviewStatus] AS [ReviewStatus]
                    ,[Extent1].[UserAccountId] AS [UserAccountId]
                    ,[Extent1].[CreationDate] AS [CreationDate]
                FROM [dbo].[Links] AS [Extent1]
                ORDER BY [Extent1].[SentDate] DESC
            ) AS [Extent1]
        ORDER BY [Extent1].[SentDate] ASC
    ) AS [Extent1]
ORDER BY [Extent1].[SentDate] DESC

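This query (like the original one) always works correctly only if SentDate is unique. If it is not unique, add a unique column to the ORDER BY. For example, if LinkID is unique, use ORDER BY SentDate DESC, LinkID DESC in the innermost query, and reverse the order in the outer query: ORDER BY SentDate ASC, LinkID ASC.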
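As a sketch, the same three-level query with LinkID added as the tie-breaker could look like this. It assumes LinkID is unique; the column list is shortened for readability, and the aliases L, T and P are only illustrative.

```sql
DECLARE @VarPageSize int = 10;  -- number of rows in each page
DECLARE @VarPageNumber int = 3; -- page numeration is zero-based

SELECT P.LinkID, P.Title, P.SentDate
FROM
    (
        -- Keep the last @VarPageSize rows of the inner result
        SELECT TOP (@VarPageSize) T.LinkID, T.Title, T.SentDate
        FROM
            (
                -- Read only the rows up to the end of the requested page
                SELECT TOP ((@VarPageNumber + 1) * @VarPageSize)
                    L.LinkID, L.Title, L.SentDate
                FROM dbo.Links AS L
                ORDER BY L.SentDate DESC, L.LinkID DESC
            ) AS T
        ORDER BY T.SentDate ASC, T.LinkID ASC
    ) AS P
-- Restore the display order
ORDER BY P.SentDate DESC, P.LinkID DESC;
```

Every ORDER BY lists both columns, so rows that share a SentDate cannot move between pages from one execution to the next.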

Obviously, if you want to jump to page 1000, the inner query will have to read about 10,000 rows, so the further you go, the slower it gets.

In any case, you need an index on SentDate (or on SentDate, LinkID) to make it work. Without an index, the query would scan the whole table again.
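A supporting index could be created along these lines. The index name is hypothetical, and the column order simply mirrors the ORDER BY used for paging.

```sql
-- Hypothetical index name; key columns match ORDER BY SentDate DESC, LinkID DESC.
CREATE NONCLUSTERED INDEX IX_Links_SentDate_LinkID
    ON dbo.Links (SentDate DESC, LinkID DESC);
```

With such an index the TOP query can read the first rows in index order and stop, instead of sorting the whole table. The remaining columns still have to be looked up for each returned row, which is cheap for a page of 10 rows.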

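I won't tell you here how to translate this query to EF, because I don't know. I have never used EF. There probably is a way. Also, apparently, you can force it to use actual SQL instead of trying to play with C# code.

Update

Comparing execution plans

In my database I have a table EventLogErrors with 29,477,859 rows. On SQL Server 2008 I compared the query with ROW_NUMBER that EF generates against the one with TOP that I suggested here. I tried to retrieve the fourth page, 10 rows long. In both cases the optimizer was smart enough to read only 40 rows, as you can see from the execution plans. I used the primary key column for ordering and pagination in this test. When I used another indexed column for pagination, the results were the same, i.e. both variants read only 40 rows. Needless to say, both variants returned results in a fraction of a second.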

Variant with TOP: [execution plan image]

Variant with ROW_NUMBER: [execution plan image]

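What all this means is that the root of your problem is somewhere else. You mentioned that your query runs slowly only sometimes, and I didn't really pay attention to that originally. With such a symptom I would do the following:

Check the execution plan.

Check that you do have the index.

Check that the index is not heavily fragmented and that the statistics are not outdated.

SQL Server has a feature called Auto-Parameterization. It also has a feature called Parameter Sniffing, and a feature called Execution plan caching. When all three work together, the result may be a non-optimal execution plan. Erland Sommarskog has an excellent article that explains this in detail: http://www.sommarskog.se/query-plan-mysteries.html The article explains how to confirm that the problem really is parameter sniffing by checking the cached execution plan, and what can be done to fix it.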
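The checks above can be sketched with the standard dynamic management views. This is only a starting point, assuming the table is dbo.Links in the current database and that you have permission to view server state. The LIKE filter on the query text is a crude way to find the relevant plans.

```sql
-- 1. Fragmentation of every index on dbo.Links
SELECT i.name AS IndexName
    ,ps.avg_fragmentation_in_percent
    ,ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Links'), NULL, NULL, 'LIMITED') AS ps
INNER JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
    AND i.index_id = ps.index_id;

-- 2. When each statistics object on dbo.Links was last updated
SELECT s.name AS StatsName
    ,STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'dbo.Links');

-- 3. Cached plans whose text mentions the table, with their XML plans
SELECT cp.usecounts
    ,st.text AS QueryText
    ,qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE st.text LIKE N'%Links%';
```

If the statistics turn out to be stale, UPDATE STATISTICS dbo.Links; refreshes them. If the cached plan looks wrong for the values actually being passed, adding OPTION (RECOMPILE) to the query is one way to test whether a fresh plan behaves better, at the cost of compiling on every execution.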

Regarding "c# - Query generated by EF takes too much time to execute", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29139013/
