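When I use Parallel.ForEach in my program, I find that some threads never seem to finish. In fact, it keeps spawning new threads over and over again, a behaviour I did not expect and definitely do not want.
I was able to reproduce this behaviour with the following code, which, just like my "real" program, uses both a lot of processor time and a lot of memory (.NET 4.0 code):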
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    public class Node
    {
        public Node Previous { get; private set; }

        public Node(Node previous)
        {
            Previous = previous;
        }
    }

    public class Program
    {
        public static void Main(string[] args)
        {
            DateTime startMoment = DateTime.Now;
            int concurrentThreads = 0;
            var jobs = Enumerable.Range(0, 2000);
            Parallel.ForEach(jobs, delegate(int jobNr)
            {
                Interlocked.Increment(ref concurrentThreads);
                int heavyness = jobNr % 9;
                //Give the processor and the garbage collector something to do...
                List<Node> nodes = new List<Node>();
                Node current = null;
                for (int y = 0; y < 1024 * 1024 * heavyness; y++)
                {
                    current = new Node(current);
                    nodes.Add(current);
                }
                TimeSpan elapsed = DateTime.Now - startMoment;
                int threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
                Console.WriteLine("[{0:mm\\:ss}] Job {1,4} complete. {2} threads remaining.",
                    elapsed, jobNr, threadsRemaining);
            });
        }
    }
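When run on my quad-core machine, it initially starts with 4 concurrent threads, just as you would expect. Over time, however, more and more threads are created. Eventually the program throws an OutOfMemoryException: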
[00:00] Job 0 complete. 3 threads remaining.
[00:01] Job 1 complete. 4 threads remaining.
[00:01] Job 2 complete. 4 threads remaining.
[00:02] Job 3 complete. 4 threads remaining.
[00:05] Job 9 complete. 5 threads remaining.
[00:05] Job 4 complete. 5 threads remaining.
[00:05] Job 5 complete. 5 threads remaining.
[00:05] Job 10 complete. 5 threads remaining.
[00:08] Job 11 complete. 5 threads remaining.
[00:08] Job 6 complete. 5 threads remaining.
...
[00:55] Job 67 complete. 7 threads remaining.
[00:56] Job 81 complete. 8 threads remaining.
...
[01:54] Job 107 complete. 11 threads remaining.
[02:00] Job 121 complete. 12 threads remaining.
..
[02:55] Job 115 complete. 19 threads remaining.
[03:02] Job 166 complete. 21 threads remaining.
...
[03:41] Job 113 complete. 28 threads remaining.
<OutOfMemoryException>
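The memory usage graph for the experiment above is as follows:
(The screenshot is in Dutch; the top graph shows processor usage, the bottom graph memory usage.) As you can see, a new thread is spawned almost every time the garbage collector kicks in (visible as the dips in memory usage).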
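Can anyone explain why this happens, and what I can do about it? I just want .NET to stop spawning new threads and finish the existing ones first...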
Best answer
You can limit the maximum number of threads that get created by passing a ParallelOptions instance with its MaxDegreeOfParallelism property set:
    var jobs = Enumerable.Range(0, 2000);
    ParallelOptions po = new ParallelOptions
    {
        MaxDegreeOfParallelism = Environment.ProcessorCount
    };
    Parallel.ForEach(jobs, po, jobNr =>
    {
        // ...
    });
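As for why you get the behaviour you are observing: by default, the TPL (which underlies PLINQ) is free to guess the optimal number of threads to use. Whenever a parallel task blocks, the task scheduler may create a new thread in order to keep making progress. In your case, the blocking might be happening implicitly; for example, through the Console.WriteLine call, or (as you observed) during garbage collection.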
From Concurrency Levels Tuning with Task Parallel Library (How Many Threads to Use?):
Since the TPL default policy is to use one thread per processor, we can conclude that TPL initially assumes that the workload of a task is ~100% working and 0% waiting, and if the initial assumption fails and the task enters a waiting state (i.e. starts blocking) - TPL will take the liberty to add threads as appropriate.
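If you want to check that the cap actually holds, a minimal sketch along the following lines may help. It is not taken from the question or from the TPL itself: ConcurrencyProbe and MeasurePeakConcurrency are hypothetical names, and Thread.Sleep is only a stand-in for work that blocks. The helper counts how many loop bodies are in flight at once and returns the highest value it saw, which should not exceed the MaxDegreeOfParallelism you pass in.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the TPL.
public static class ConcurrencyProbe
{
    // Runs jobCount short blocking jobs under the given cap and returns
    // the highest number of loop bodies that were executing at the same time.
    public static int MeasurePeakConcurrency(int jobCount, int maxDegreeOfParallelism)
    {
        int current = 0;
        int peak = 0;
        object gate = new object();
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        };

        Parallel.ForEach(Enumerable.Range(0, jobCount), options, jobNr =>
        {
            int now = Interlocked.Increment(ref current);

            // Record a new peak under a lock so concurrent updates are not lost.
            lock (gate)
            {
                if (now > peak)
                {
                    peak = now;
                }
            }

            Thread.Sleep(10); // stand-in for blocking work (an assumption)
            Interlocked.Decrement(ref current);
        });

        return peak;
    }

    public static void Main()
    {
        int limit = Environment.ProcessorCount;
        int peak = MeasurePeakConcurrency(200, limit);
        Console.WriteLine("Limit: {0}, observed peak: {1}", limit, peak);
    }
}
```

Running the same measurement without the ParallelOptions argument is one way to see whether the scheduler adds threads for your particular workload; the numbers will vary by machine and runtime version.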
Source: the Stack Overflow question "c# - Parallel.ForEach keeps spawning new threads": https://stackoverflow.com/questions/14039051/