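c# - LINQ to Objects Performance - Huge dataset for long running process

Tags: c# linq

I have a detail list (200,000 records) pulled from the database, and I need to find the locations for each detail. Below is the code that loops through the detail list and assigns the locations to each item. The loop takes more than 15 minutes to run, but if I do not populate the Locations property it takes less than a minute.

How can I optimize this code?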

class Program
{
    static void Main(string[] args)
    {
        List<Details> databaseDetailList = GetDetailsFromdatabase();
        List<Location1> databaseLocation1List = GetLocations1Fromdatabase();
        List<Location2> databaseLocation2List = GetLocations2Fromdatabase();

        List<Details> detailList = new List<Details>();
        foreach (var x in databaseDetailList)
        {
            detailList.Add(new Details
            {
                DetailId = x.DetailId,
                Code = x.Code,
                //If I comment out the Locations then it works faster
                Locations = new LocationIfo {
                    Locations1 = databaseLocation1List
                                .Where(l=>l.DetailId == x.DetailId && l.Code == x.Code).ToList(),
                    Locations2 = databaseLocation2List
                                .Where(l => l.DetailId == x.DetailId && l.Code == x.Code).ToList()
                }
            });
        }
    }

    private static List<Details> GetDetailsFromdatabase()
    {
        //This returns 200,000 records from database
        return new List<Details>();
    }

    private static List<Location1> GetLocations1Fromdatabase()
    {
        //This returns 100,000 records from database
        return new List<Location1>();
    }

    private static List<Location2> GetLocations2Fromdatabase()
    {
        //This returns 100,000 records from database
        return new List<Location2>();
    }
}

public class Details
{
    public string DetailId { get; set; }
    public string Code { get; set; }
    public LocationIfo Locations { get; set; }
}

public class LocationIfo
{
    public List<Location1> Locations1 { get; set; }
    public List<Location2> Locations2 { get; set; }
}

public class Location1
{
    public int LocationId { get; set; }
    public string DetailId { get; set; }
    public string Code { get; set; }
    public string OtherProperty { get; set; }
}

public class Location2
{
    public int LocationId { get; set; }
    public string DetailId { get; set; }
    public string Code { get; set; }
    public string OtherProperty { get; set; }
}

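Best answer

Conceptually, what you are doing here is a Join. Using the appropriate operation will ensure that it executes much more efficiently. Ideally you would perform the Join on the database side rather than after pulling all of the data into lists, but even if you do pull all of the data down, joining it in memory with Join will be much more efficient.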

var query = from detail in databaseDetailList
            join location1 in databaseLocation1List
            on new { detail.DetailId, detail.Code }
            equals new { location1.DetailId, location1.Code }
            into locations1
            join location2 in databaseLocation2List
            on new { detail.DetailId, detail.Code }
            equals new { location2.DetailId, location2.Code }
            into locations2
            select new Details
            {
                Code = detail.Code,
                DetailId = detail.DetailId,
                Locations = new LocationIfo
                {
                    Locations1 = locations1.ToList(),
                    Locations2 = locations2.ToList(),
                }
            };
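
For comparison, the same idea can be written with ToLookup, which makes the reason for the speedup easier to see. The loop in the question rescans both location lists for every detail, which is roughly 200,000 × (100,000 + 100,000) = 4 × 10^10 predicate evaluations, whereas a lookup indexes each location list once and then answers each detail by hash key. The following is only a minimal sketch under stated assumptions: the classes are trimmed copies of the ones in the question, and `LookupSketch` and `AttachLocations` are hypothetical names that do not appear in the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copies of the question's classes so the sketch compiles on its own.
public class Location1
{
    public int LocationId { get; set; }
    public string DetailId { get; set; }
    public string Code { get; set; }
}

public class Location2
{
    public int LocationId { get; set; }
    public string DetailId { get; set; }
    public string Code { get; set; }
}

public class LocationIfo
{
    public List<Location1> Locations1 { get; set; }
    public List<Location2> Locations2 { get; set; }
}

public class Details
{
    public string DetailId { get; set; }
    public string Code { get; set; }
    public LocationIfo Locations { get; set; }
}

// Hypothetical helper for illustration; not part of the original answer.
public static class LookupSketch
{
    public static List<Details> AttachLocations(
        List<Details> details,
        List<Location1> locations1,
        List<Location2> locations2)
    {
        // One pass over each location list builds a hash-based index
        // keyed by the (DetailId, Code) pair. Anonymous types compare
        // by value, so they work as composite keys.
        var byKey1 = locations1.ToLookup(l => new { l.DetailId, l.Code });
        var byKey2 = locations2.ToLookup(l => new { l.DetailId, l.Code });

        return details.Select(d => new Details
        {
            DetailId = d.DetailId,
            Code = d.Code,
            Locations = new LocationIfo
            {
                // Indexing a lookup with a key it does not contain
                // yields an empty sequence rather than throwing.
                Locations1 = byKey1[new { d.DetailId, d.Code }].ToList(),
                Locations2 = byKey2[new { d.DetailId, d.Code }].ToList()
            }
        }).ToList();
    }
}
```

Calling `LookupSketch.AttachLocations(databaseDetailList, databaseLocation1List, databaseLocation2List)` should produce the same shape of result as the query above: one Details per input row, with empty lists where no location matches.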

Regarding c# - LINQ to Objects Performance - Huge dataset for long running process, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/22667506/
