我在实现并行计算前缀和的算法时遇到了问题。尽管这个算法有 3 个步骤,但我无法编写代码,因为没有给出伪代码。
我浏览了网络上的各种 Material 以及堆栈溢出,但我没有得到 wiki 上给出的算法的确切实现。 . wiki 提到了以下内容:
A prefix sum can be calculated in parallel by the following steps::
- Compute the sums of consecutive pairs of items in which the first item of the pair has an even index: z0 = x0 + x1, z1 = x2 + x3, etc.
- Recursively compute the prefix sum w0, w1, w2, ... of the sequence z0, z1, z2, ...
- Expand each term of the sequence w0, w1, w2, ... into two terms of the overall prefix sum: y0 = x0, y1 = w0, y2 = w0 + x2, y3 = w1, etc. After the first value, each successive number yi is either copied from a position half as far through the w sequence, or is the previous value added to one value in the x sequence
有人可以建议一个伪代码实现供我检查和实现吗?
最佳答案
我认为提供的答案不足以理解算法,所以我在这里提供了一个带有更全面伪代码的实际答案:https://stackoverflow.com/a/12874227/697862基于 Parallel Prefix Sum (Scan) with CUDA这是一篇描述最佳并行算法的完整文章:
Blelloch, Guy E. 1990. "Prefix Sums and Their Applications." Technical Report CMU-CS-90-190, School of Computer Science, Carnegie Mellon University.
关于algorithm - 计算前缀和的并行算法,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/10060346/