c# - Optimizing Entity Framework queries

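In my spare time I'm building a Stack Overflow clone to learn EF6 and MVC5, and I'm currently using OWIN for authentication.

Everything worked fine when I had 50-60 questions. I then used the Red Gate data generator to ramp it up to 1 million questions, with a couple of thousand child-table rows without relationships, just to "stress" the ORM a bit. Here is what the LINQ looks like: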

var query = ctx.Questions
               .AsNoTracking()     //read-only performance boost.. http://visualstudiomagazine.com/articles/2010/06/24/five-tips-linq-to-sql.aspx
               .Include("Attachments")                                
               .Include("Location")
               .Include("CreatedBy") //IdentityUser
               .Include("Tags")
               .Include("Upvotes")
               .Include("Upvotes.CreatedBy")
               .Include("Downvotes")
               .Include("Downvotes.CreatedBy")
               .AsQueryable();

if (string.IsNullOrEmpty(sort)) //default
{
    query = query.OrderByDescending(x => x.CreatedDate);
}
else
{
    sort = sort.ToLower();
    if (sort == "latest")
    {
        query = query.OrderByDescending(x => x.CreatedDate);
    }
    else if (sort == "popular")
    {
        //most viewed
        query = query.OrderByDescending(x => x.ViewCount);
    }
}

var complaints = query.Skip(skipCount)
                      .Take(pageSize)
                      .ToList(); //makes an evaluation..

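Needless to say, I'm getting SQL timeouts. After installing MiniProfiler and looking at the generated SQL statement, I found it was several hundred lines long.

I know I'm joining/including too many tables, but how many real-life projects only need to join 1 or 2 tables? There may be situations where we have to do this many joins over millions of rows. Is using stored procedures the only way?

If so, is EF itself only suitable for small projects?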

Best answer

The problem you are experiencing is most likely a Cartesian product.

Based on just some sample data:

var query = ctx.Questions // 50 
  .Include("Attachments") // 20                                
  .Include("Location") // 10
  .Include("CreatedBy") // 5
  .Include("Tags") // 5
  .Include("Upvotes") // 5
  .Include("Upvotes.CreatedBy") // 5
  .Include("Downvotes") // 5
  .Include("Downvotes.CreatedBy") // 5

  // Where Blah
  // Order By Blah

This would return a number of rows on the order of:

50 x 20 x 10 x 5 x 5 x 5 x 5 x 5 x 5 = 156,250,000

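Seriously... that is an insane number of rows to return.

If you are having this issue, you really have two options:

First: the easy way. Rely on Entity Framework to wire up the models automatically as they enter the context. Afterwards, use the entities as you would with AsNoTracking() and dispose of the context.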
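To make the arithmetic concrete, here is a small sketch that compares the worst-case row count of a single joined query with the total when each table is fetched on its own. The per-table counts are the sample figures from above; `RowEstimates` and its members are names made up for this illustration.

```csharp
using System;
using System.Linq;

public static class RowEstimates
{
    // Sample row counts from the example above, in Include order.
    public static readonly long[] SampleCounts = { 50, 20, 10, 5, 5, 5, 5, 5, 5 };

    // Worst case for one joined query: every combination of rows comes back.
    public static long JoinedRows(long[] counts) =>
        counts.Aggregate(1L, (total, n) => total * n);

    // One query per table: each row comes back once.
    public static long SeparateRows(long[] counts) => counts.Sum();
}
```

With the sample counts, `RowEstimates.JoinedRows(RowEstimates.SampleCounts)` gives 156,250,000 and `RowEstimates.SeparateRows(RowEstimates.SampleCounts)` gives 110.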

// Continuing with the query above:

var questions = query.Select(q => q);
var attachments = query.Select(q => q.Attachments);
var locations = query.Select(q => q.Location); // navigation name matches .Include("Location")

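This issues one request per table, but instead of 156 million rows you download only 110. The cool part is that they are all wired up in the EF context's in-memory cache, so the questions variable is now fully populated.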
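A minimal EF6 sketch of this first option might look like the following. The entity classes, `ForumContext`, and `QuestionPageLoader` are hypothetical, trimmed-down stand-ins rather than the asker's real model. Relationship fix-up only applies to entities the context is tracking, so this sketch assumes the individual queries are run without AsNoTracking().

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity; // EF6; also provides the Load() extension
using System.Linq;

// Hypothetical, trimmed-down model for illustration only.
public class Question
{
    public int Id { get; set; }
    public DateTime CreatedDate { get; set; }
    public ICollection<Attachment> Attachments { get; set; } = new List<Attachment>();
}

public class Attachment
{
    public int Id { get; set; }
    public int QuestionId { get; set; }
    public Question Question { get; set; }
}

public class ForumContext : DbContext
{
    public DbSet<Question> Questions { get; set; }
    public DbSet<Attachment> Attachments { get; set; }
}

public static class QuestionPageLoader
{
    public static List<Question> LoadPage(int skipCount, int pageSize)
    {
        using (var ctx = new ForumContext())
        {
            // Query 1: only the requested page of questions, tracked by the context.
            var questions = ctx.Questions
                .OrderByDescending(q => q.CreatedDate)
                .Skip(skipCount)
                .Take(pageSize)
                .ToList();

            var ids = questions.Select(q => q.Id).ToList();

            // Query 2: attachments for that page only. Load() executes the query;
            // the context then attaches each row to its already-tracked Question.
            ctx.Attachments
                .Where(a => ids.Contains(a.QuestionId))
                .Load();

            // Each question's Attachments collection is populated by fix-up.
            return questions;
        }
    }
}
```

The same pattern would repeat for tags, votes, and the other child tables: one narrow query per table, filtered to the current page, instead of one wide join.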

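Second: Create a stored procedure that returns multiple tables and have EF materialize the classes.

New third: EF Core (5.0 and later) now supports splitting queries as described above while keeping the nice .Include() methods. Split queries do have some gotchas, so I recommend reading all of the documentation.

Example from the link above: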
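For the second option, EF6 can read several result sets from one command through `ObjectContext.Translate`. The sketch below assumes a hypothetical procedure `dbo.GetQuestionPage` that returns the questions first and their attachments second; the procedure, its parameters, and the trimmed-down model are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;
using System.Data.Entity;
using System.Data.Entity.Core.Objects;
using System.Data.Entity.Infrastructure;
using System.Linq;

// Hypothetical, trimmed-down model for illustration only.
public class Question
{
    public int Id { get; set; }
    public DateTime CreatedDate { get; set; }
    public ICollection<Attachment> Attachments { get; set; } = new List<Attachment>();
}

public class Attachment
{
    public int Id { get; set; }
    public int QuestionId { get; set; }
    public Question Question { get; set; }
}

public class ForumContext : DbContext
{
    public DbSet<Question> Questions { get; set; }
    public DbSet<Attachment> Attachments { get; set; }
}

public static class QuestionProcLoader
{
    public static List<Question> LoadPage(int skipCount, int pageSize)
    {
        using (var ctx = new ForumContext())
        {
            ctx.Database.Initialize(force: false);
            var objectContext = ((IObjectContextAdapter)ctx).ObjectContext;
            var connection = ctx.Database.Connection;

            using (var cmd = connection.CreateCommand())
            {
                cmd.CommandText = "[dbo].[GetQuestionPage]"; // hypothetical procedure
                cmd.CommandType = CommandType.StoredProcedure;
                AddParameter(cmd, "@skipCount", skipCount);
                AddParameter(cmd, "@pageSize", pageSize);

                connection.Open();
                try
                {
                    using (var reader = cmd.ExecuteReader())
                    {
                        // First result set: questions. Enumerate it fully
                        // before advancing the reader.
                        var questions = objectContext
                            .Translate<Question>(reader, "Questions", MergeOption.AppendOnly)
                            .ToList();

                        // Second result set: attachments, tracked so they are
                        // attached to the questions read above.
                        reader.NextResult();
                        objectContext
                            .Translate<Attachment>(reader, "Attachments", MergeOption.AppendOnly)
                            .ToList();

                        return questions;
                    }
                }
                finally
                {
                    connection.Close();
                }
            }
        }
    }

    private static void AddParameter(DbCommand cmd, string name, object value)
    {
        var p = cmd.CreateParameter();
        p.ParameterName = name;
        p.Value = value;
        cmd.Parameters.Add(p);
    }
}
```

The entity set names passed to `Translate` ("Questions", "Attachments") are assumed to match the `DbSet` property names on the context.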

If a typical blog has multiple related posts, rows for these posts will duplicate the blog's information. This duplication leads to the so-called "cartesian explosion" problem.

using (var context = new BloggingContext())
{
    var blogs = context.Blogs
        .Include(blog => blog.Posts)
        .AsSplitQuery()
        .ToList();
}

It will produce the following SQL:

SELECT [b].[BlogId], [b].[OwnerId], [b].[Rating], [b].[Url]
FROM [Blogs] AS [b]
ORDER BY [b].[BlogId]

SELECT [p].[PostId], [p].[AuthorId], [p].[BlogId], [p].[Content], [p].[Rating], [p].[Title], [b].[BlogId]
FROM [Blogs] AS [b]
INNER JOIN [Post] AS [p] ON [b].[BlogId] = [p].[BlogId]
ORDER BY [b].[BlogId]

Regarding c# - Optimizing Entity Framework queries, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22161234/
