c# - .net large for loop slowing down

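I have a large for loop (up to 30k iterations) that seems to keep slowing down:

The first thousand iterations take 1.34s
After 12k iterations, the next 1000 iterations take 5.31s
After 23k iterations, the next 1000 iterations take 6.65s
The last thousand iterations take 7.43s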

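To gain a little performance I switched from a foreach loop to a for loop and tried the Release configuration, but I could not find anything else in this question that applies to me. The loop is inside an async method.

Why does the loop slow down? Can it be avoided?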

for(int iter = 0; iter < LargeList1.Count; iter++)
{
    var cl_from = LargeList1[iter];
    if(LargeList2.Any(cl => cl.str.Contains(cl_from.str)))
    {
        DateTime dt1 = /* last write time of a file */;
        DateTime dt2 = /* last write time of a different file */;
        if(DateTime.Compare(dt1, dt2) > 0)
        {
            try
            {
                CopyFile(/* Kernel32 CopyFile: copy a file, overwrite */);
                globals.fileX++;
            }
            catch(Exception filexx)
            {
                //error handler
            }
        }
        else
        {
            globals.fileS++;
        }
    }
    else
    {
        Directory.CreateDirectory(/* create a directory, no check if it already exists */);
        try
        {
            CopyFile(/* Kernel32 CopyFile: copy a file, do not overwrite */);
            globals.fileX++;
        }
        catch(Exception filex)
        {
            // error handler
        }

    }
    gui.UpdateCount(globals.fileF, globals.fileX, globals.fileS); //updates iteration on textboxes
    float p = (float)100.0*((float)globals.fileF + (float)globals.fileX + (float)globals.fileS)/(float)globals.totalCount;
    gui.setProgress(p); //updates progressbar
}

Edit: As many people suggested, using hashset.Contains(cl_from.str) solved the problem.
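For readers who want to see what that change can look like, here is a minimal sketch. The class and method names (LookupSketch, CountBySubstringScan, CountByHashSet) are made up for illustration, and plain strings stand in for the elements of LargeList1 and LargeList2, whose real type is not shown in the question. Note that HashSet.Contains is an exact-match test, while the original code did a substring test, so the swap is only valid if exact matching is what is actually needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupSketch
{
    // Original approach: for every item, scan the whole second list and run a
    // substring test on each entry. Cost grows with from.Count * to.Count.
    public static int CountBySubstringScan(IReadOnlyList<string> from, IReadOnlyList<string> to)
    {
        int hits = 0;
        foreach (var s in from)
        {
            if (to.Any(t => t.Contains(s))) hits++;
        }
        return hits;
    }

    // HashSet approach: build the set once, then each lookup is O(1) on average.
    // Assumption: an exact (ordinal) match is sufficient; this is NOT a substring test.
    public static int CountByHashSet(IReadOnlyList<string> from, IReadOnlyList<string> to)
    {
        var known = new HashSet<string>(to, StringComparer.Ordinal);
        int hits = 0;
        foreach (var s in from)
        {
            if (known.Contains(s)) hits++;
        }
        return hits;
    }
}
```

The important part is that the set is built once, outside the loop; building it inside the loop would bring back the original cost.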

Best answer

Given their nature, I can imagine these two items being the bottleneck.

for(int iter = 0; iter < LargeList1.Count; iter++)
{
    .....
    if(LargeList2.Any(cl => cl.str.Contains(cl_from.str)))
    ...........

For every item, you are checking whether any string in another large list contains the current string.

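A few reasons why it might get slower over time:

It is faster at first because the GC has not run many times yet; as the loop progresses, the GC has to collect more and more often.
Is the length of the string cl_from.str perhaps growing?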
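The GC hypothesis can be checked rather than guessed at. The following is a rough sketch, not a definitive diagnostic: GcProbe and Gen0CollectionsDuring are hypothetical names, and the body delegate stands in for one iteration of the loop from the question. It relies only on GC.CollectionCount, which reports how many collections of a given generation have happened so far.

```csharp
using System;

public static class GcProbe
{
    // Runs `body` for each index in [start, start + count) and returns how many
    // generation-0 collections occurred while that batch was running.
    public static int Gen0CollectionsDuring(int start, int count, Action<int> body)
    {
        int before = GC.CollectionCount(0);
        for (int i = start; i < start + count; i++)
        {
            body(i);
        }
        return GC.CollectionCount(0) - before;
    }
}
```

Comparing the value for the first 1000 iterations with the value for a late batch shows whether collections really become more frequent. If the numbers are similar, the GC is probably not the cause of the slowdown.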

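A few points to consider:

How large are cl_from.str and LargeList2? Would it be worth building a hash of all possible values in cl_from.str and doing a lookup against it, or perhaps even building a hash set of all LargeList2 strings and then probing it with each string combination from cl_from.str?
You may want to improve your search algorithm, for example see C# .NET: FASTEST WAY TO CHECK IF A STRING OCCURS WITHIN A STRING. Or search for other string search indexes/algorithms. Why not use something like Lucene.NET?
Use a .NET profiler to find out where the bottleneck is and where the time is being spent.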
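If a profiler is not at hand, a crude substitute is to time batches of iterations with Stopwatch, once around the whole loop body and once around just the LargeList2.Any(...) lookup, and see which set of timings grows. This is only a sketch under assumptions: BatchTimer and TimeBatches are hypothetical names, and body stands in for whatever part of the loop is being measured.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class BatchTimer
{
    // Runs `body` for i in [0, total) and returns the elapsed milliseconds of
    // each batch of `batchSize` iterations (the last batch may be shorter).
    public static List<long> TimeBatches(int total, int batchSize, Action<int> body)
    {
        var timings = new List<long>();
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < total; i++)
        {
            body(i);
            bool endOfBatch = (i + 1) % batchSize == 0 || i == total - 1;
            if (endOfBatch)
            {
                timings.Add(sw.ElapsedMilliseconds);
                sw.Restart();
            }
        }
        return timings;
    }
}
```

If the timings for the lookup alone climb in the same way as the figures in the question while the file operations stay flat, that points at the nested list scan rather than at the GC or the file system.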

Regarding c# - .net large for loop slowing down, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30606235/
