Suppose I have this hypothetical many-to-many relationship:
public class Paper
{
    public int Id { get; set; }
    public string Title { get; set; }
    public virtual ICollection<Author> Authors { get; set; }
}
public class Author
{
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Paper> Papers { get; set; }
}
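I want to use LINQ to build a query that gives each author's "popularity" relative to the other authors, i.e. the number of papers the author contributed to divided by the total number of paper contributions across all authors. I came up with a couple of queries to achieve this.
Option 1: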
var query1 = from author in db.Authors
             let sum = (double)db.Authors.Sum(a => a.Papers.Count)
             select new
             {
                 Author = author,
                 Popularity = author.Papers.Count / sum
             };
Option 2:
var temp = db.Authors.Select(a => new
{
    Auth = a,
    Contribs = a.Papers.Count
});
var query2 = temp.Select(a => new
{
    // a is the anonymous projection from temp, so the entity is a.Auth
    Author = a.Auth,
    Popularity = a.Contribs / (double)temp.Sum(a2 => a2.Contribs)
});
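Basically, my question is: which of these is more efficient, and is there another single query that is more efficient still? How do these compare with two separate queries, like this: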
double sum = db.Authors.Sum(a => a.Papers.Count);
var query3 = from author in db.Authors
             select new
             {
                 Author = author,
                 Popularity = author.Papers.Count / sum
             };
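Best answer
Well, first of all, you can try them yourself and see which one takes longest.
The first thing you should check is that they translate cleanly to SQL, or as close to that as possible, so that the data isn't all loaded into memory just to apply these calculations.
But I have a feeling option 2 might be your best bet, with one more optimization: cache the total number of contributed papers. That way you make only one call to the database to get the authors, which you need anyway, and the rest runs in your code, where you can parallelize and do whatever you need to speed it up.
So something like this (sorry, I prefer the fluent style of LINQ):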
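One rough way to do that is with `System.Diagnostics.Stopwatch`. The sketch below is only an illustration: `QueryTimer` and `AverageMilliseconds` are names made up here, and the numbers you get will depend on your database, provider and data volume.

```csharp
using System;
using System.Diagnostics;

public static class QueryTimer
{
    // Runs the action once as a warm-up, then `runs` more times,
    // and returns the average elapsed time of the timed runs in milliseconds.
    public static double AverageMilliseconds(Action action, int runs = 5)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));

        action(); // warm-up, so one-time costs are not included in the timing

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            action();
        }
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / runs;
    }
}
```

LINQ queries are deferred, so the timed action has to force execution, for example `QueryTimer.AverageMilliseconds(() => query1.ToList())`.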
// You can load only the information you need if you don't need the whole entity.
// I imagine you might only need the name and Papers.Count, which is used below; that would be another optimization.
// ToList() runs the query once and materializes the authors in memory.
// With lazy loading, each Papers.Count below may trigger its own query unless Papers is eager-loaded or projected.
var allAuthors = db.Authors.ToList();
var totalPaperCount = allAuthors.Sum(x => x.Papers.Count);
var theEndResult = allAuthors.Select(a => new
{
    Author = a,
    Popularity = a.Papers.Count / (double)totalPaperCount
});
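To see the arithmetic without a database, here is a minimal in-memory sketch of the same idea using LINQ to Objects. It is not the code from the question: `PopularityDemo`, `Compute`, `SampleAuthors` and `Link` are hypothetical names, the entity classes are simplified stand-ins that use `List<T>`, and the sample data is invented. It keeps only the id and the paper count, computes the total once, and guards against a zero total.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Paper
{
    public int Id { get; set; }
    public string Title { get; set; }
    public List<Author> Authors { get; set; } = new List<Author>();
}

public class Author
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Paper> Papers { get; set; } = new List<Paper>();
}

public static class PopularityDemo
{
    // Returns each author's share of all paper contributions, keyed by author Id.
    public static Dictionary<int, double> Compute(IEnumerable<Author> authors)
    {
        // Keep only what is needed: the id and the number of papers.
        var counts = authors
            .Select(a => new { a.Id, Contribs = a.Papers.Count })
            .ToList();

        // The total is computed once, not once per author.
        double total = counts.Sum(c => c.Contribs);

        return counts.ToDictionary(
            c => c.Id,
            c => total == 0 ? 0.0 : c.Contribs / total);
    }

    // Invented sample data: Alice has two papers, Bob and Carol one each.
    public static List<Author> SampleAuthors()
    {
        var alice = new Author { Id = 1, Name = "Alice" };
        var bob = new Author { Id = 2, Name = "Bob" };
        var carol = new Author { Id = 3, Name = "Carol" };
        Link(new Paper { Id = 1, Title = "P1" }, alice, bob);
        Link(new Paper { Id = 2, Title = "P2" }, alice, carol);
        return new List<Author> { alice, bob, carol };
    }

    // Wires up both sides of the many-to-many relationship.
    private static void Link(Paper paper, params Author[] authors)
    {
        foreach (var author in authors)
        {
            paper.Authors.Add(author);
            author.Papers.Add(paper);
        }
    }
}
```

With the sample data there are four contributions in total, so `PopularityDemo.Compute(PopularityDemo.SampleAuthors())` gives 0.5 for Alice (Id 1) and 0.25 each for Bob and Carol.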
Regarding "c# - Efficient query involving a count in a subquery", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/14996966/