C#: threading a nested-loop function with two separate jobs per iteration

I am trying to speed up my program without changing the algorithm.

Currently I use this implementation of the DFT:

public double[] dft(double[] data) {
        int n = data.Length;
        int m = n;// I use m = n / 2d;
        float[] real = new float[n];
        float[] imag = new float[n];
        double[] result = new double[m];
        float pi_div = (float)(2.0 * Math.PI / n);
        for (int w = 0; w < m; w++) {
            float a = w * pi_div;
            for (int t = 0; t < n; t++) {
                real[w] += (float)(data[t] * Math.Cos(a * t)); //thinking of threading this
                imag[w] += (float)(data[t] * Math.Sin(a * t)); //and this
            }
            result[w] = (float)(Math.Sqrt(real[w] * real[w] + imag[w] * imag[w]) / n);
        }
        return result;
    }

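It is rather slow, but I can see one place where it could be improved. The inner part of the function consists of two independent tasks: the real and imaginary sums can be accumulated separately, but they must always be joined again before the result is computed.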

Any ideas? I tried a few implementations I found online, but they all crashed, and I have very little experience with threading.

Best answer

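When you have a CPU-bound algorithm that can be parallelized, you can easily convert a single-threaded implementation into a multi-threaded one by using the Parallel class.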

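In your case you have two nested loops, but the outer loop runs far more iterations than you have CPU cores to execute them on, so parallelizing the outer loop alone is enough to keep all cores busy: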

public double[] ParallelDft(double[] data) {
  // Requires: using System; using System.Threading.Tasks;
  int n = data.Length;
  int m = n;// I use m = n / 2d;
  float[] real = new float[n];
  float[] imag = new float[n];
  double[] result = new double[m];
  float pi_div = (float)(2.0 * Math.PI / n);
  Parallel.For(0, m,
    w => {
      // Each iteration reads data and writes only index w of real, imag
      // and result, so the iterations do not need any locking.
      float a = w * pi_div;
      for (int t = 0; t < n; t++) {
        real[w] += (float)(data[t] * Math.Cos(a * t));
        imag[w] += (float)(data[t] * Math.Sin(a * t));
      }
      result[w] = (float)(Math.Sqrt(real[w] * real[w] + imag[w] * imag[w]) / n);
    }
  );
  return result;
}

I took your code and replaced the outer for loop with Parallel.For. On a computer with eight hyper-threaded cores, execution was seven times faster.
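
To check that parallelizing the outer loop leaves the output unchanged, the following sketch puts a sequential and a parallel version side by side. It is not the code from the answer: the class name DftSketch and the helper Magnitude are hypothetical, and it accumulates in double locals rather than in shared float arrays, which is my own assumption and can change the low-order digits compared with the original.

```csharp
using System;
using System.Threading.Tasks;

public static class DftSketch
{
    // Sequential reference: magnitude spectrum normalised by n.
    public static double[] Dft(double[] data)
    {
        int n = data.Length;
        var result = new double[n];
        double step = 2.0 * Math.PI / n;
        for (int w = 0; w < n; w++)
        {
            result[w] = Magnitude(data, w * step);
        }
        return result;
    }

    // Parallel variant: only the outer loop over w is parallelized.
    // Each iteration writes a single element, result[w], so no locking is needed.
    public static double[] ParallelDft(double[] data)
    {
        int n = data.Length;
        var result = new double[n];
        double step = 2.0 * Math.PI / n;
        Parallel.For(0, n, w =>
        {
            result[w] = Magnitude(data, w * step);
        });
        return result;
    }

    // Inner loop for one frequency bin. The real and imaginary sums live in
    // locals, so they are private to whichever thread runs this call.
    private static double Magnitude(double[] data, double a)
    {
        double re = 0.0, im = 0.0;
        for (int t = 0; t < data.Length; t++)
        {
            re += data[t] * Math.Cos(a * t);
            im += data[t] * Math.Sin(a * t);
        }
        return Math.Sqrt(re * re + im * im) / data.Length;
    }
}
```

Because both methods call the same Magnitude helper, DftSketch.Dft(data) and DftSketch.ParallelDft(data) should return identical arrays for any input. Timing the two with a Stopwatch on a large input is a simple way to see what speed-up your own machine gives.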

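Another way to increase execution speed is to use the SIMD instruction set of the CPU. The System.Numerics.Vectors library and the Yeppp! library let you call SIMD instructions from managed code, but it will take some work on your part to implement the algorithm using those instructions.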
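
As a rough illustration of the SIMD route, the sketch below uses Vector<double> from System.Numerics for the multiply-and-accumulate part of each frequency bin. Several things here are my assumptions rather than part of the answer: the class name SimdDftSketch, the Dot helper, and in particular the precomputed cosine and sine tables, which are needed because Vector<T> offers no vectorized Math.Cos or Math.Sin. Yeppp! is not used. Part of any speed-up will come from the tables rather than from SIMD itself, so measure before relying on it.

```csharp
using System;
using System.Numerics;

public static class SimdDftSketch
{
    public static double[] Dft(double[] data)
    {
        int n = data.Length;

        // Twiddle tables: cosTable[k] = cos(2*pi*k/n), sinTable[k] = sin(2*pi*k/n).
        var cosTable = new double[n];
        var sinTable = new double[n];
        for (int k = 0; k < n; k++)
        {
            double angle = 2.0 * Math.PI * k / n;
            cosTable[k] = Math.Cos(angle);
            sinTable[k] = Math.Sin(angle);
        }

        var cosRow = new double[n];
        var sinRow = new double[n];
        var result = new double[n];

        for (int w = 0; w < n; w++)
        {
            // Build contiguous rows for this bin; the angle index is (w * t) mod n.
            int k = 0;
            for (int t = 0; t < n; t++)
            {
                cosRow[t] = cosTable[k];
                sinRow[t] = sinTable[k];
                k += w;
                if (k >= n) k -= n;
            }

            double re = Dot(data, cosRow);
            double im = Dot(data, sinRow);
            result[w] = Math.Sqrt(re * re + im * im) / n;
        }
        return result;
    }

    // Dot product processed Vector<double>.Count elements at a time,
    // with a scalar loop for the leftover tail.
    public static double Dot(double[] x, double[] y)
    {
        int width = Vector<double>.Count;
        int i = 0;
        double sum = 0.0;
        for (; i <= x.Length - width; i += width)
        {
            sum += Vector.Dot(new Vector<double>(x, i), new Vector<double>(y, i));
        }
        for (; i < x.Length; i++)
        {
            sum += x[i] * y[i];
        }
        return sum;
    }
}
```

The row buffers are shared across iterations of w, so this version is sequential as written. Combining it with Parallel.For would need per-thread row buffers.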

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/33979158/
