Specialized expansion of a CUDA Thrust vector

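I have a special requirement for expanding a Thrust vector. Suppose I have a key vector K, a value vector V, and an expansion factor vector E (indexed by key). I want to replicate the group of values belonging to each key as many times as that key's expansion factor. I looked at several Thrust expand examples, but they do not seem to fit my use case. Allocating space for the result array is easy with thrust::reduce_by_key, but I don't know how to actually expand my vector.

For example: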

key is   [0,0,0,1,2,2,2,2,4]
value is [1,2,3,5,6,7,8,4,7]
key 0 has values [1,2,3] 
key 1 has value  [5]
key 2 has values [6,7,8,4]
key 4 has value  [7]
(This is not code but the website won't let me submit unless I indent these statements)

Expansion factor array:

Expansion factor: [2,3,1,1,3]
desired result array: [1,2,3,1,2,3,5,5,5,6,7,8,4,7,7,7]
1,2,3   are the values of key[0], expanded 2 times according to E[0]
5       is the value of key[1], expanded 3 times according to E[1]
6,7,8,4 are the values of key[2], expanded 1 time according to E[2]
[none]  is the value of key[3], expanded 1 time according to E[3]
7       is the value of key[4], expanded 3 times according to E[4]

Is there an efficient way to do this? Thanks in advance.

Best answer

The original poster reported the following solution to the problem:

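Compute three auxiliary arrays: the start position of each key, the number of elements for each key, and the exclusive scan of each key's element count after expansion (element count times expansion factor).
Copy the exclusive scan result array and expand it with the Thrust expand example, so that every output position carries the output offset of its key.
Iterate over the expanded array with a counting iterator; the result for the current iterator is the value at position start_position_of_key + (iterator - exclusive_scan_result) % element_count.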
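
The steps above can be sketched on the host with plain C++ to make the index arithmetic concrete. This is only an illustration under stated assumptions, not the original poster's code: expand_by_key and all variable names are hypothetical, the keys are assumed to be sorted and non-negative, and E is assumed to be indexed by key value. The sketch also swaps in a binary search (std::upper_bound) to find the key that owns each output position, instead of replicating the scan array with the Thrust expand example. On the device, the loops would presumably map to thrust::reduce_by_key, thrust::exclusive_scan, a vectorized thrust::upper_bound, and a thrust::transform over a thrust::counting_iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Thrust: sequential sketch of the three steps.
// Assumes keys are sorted, non-negative, and smaller than factors.size().
std::vector<int> expand_by_key(const std::vector<int>& keys,
                               const std::vector<int>& values,
                               const std::vector<int>& factors)
{
    assert(keys.size() == values.size());
    const std::size_t num_keys = factors.size();

    // Number of elements per key (0 for keys that never occur).
    std::vector<int> count(num_keys, 0);
    for (int k : keys) ++count[k];

    // start[k]:  exclusive scan of count, first input position of key k.
    // offset[k]: exclusive scan of count * factor, first output position of key k.
    std::vector<int> start(num_keys), offset(num_keys);
    int in_pos = 0, out_pos = 0;
    for (std::size_t k = 0; k < num_keys; ++k) {
        start[k] = in_pos;
        offset[k] = out_pos;
        in_pos += count[k];
        out_pos += count[k] * factors[k];
    }

    // Each output index i is independent, which is what makes the
    // formulation suitable for a counting iterator on the GPU.
    std::vector<int> result(out_pos);
    for (int i = 0; i < out_pos; ++i) {
        // Last key whose offset is <= i. Keys that produce no output share
        // their offset with the next key, so upper_bound skips them.
        const std::size_t k =
            std::upper_bound(offset.begin(), offset.end(), i) - offset.begin() - 1;
        // Wrap around inside the key's group of values.
        result[i] = values[start[k] + (i - offset[k]) % count[k]];
    }
    return result;
}
```

Called with the key, value, and expansion factor arrays from the question, this sketch returns [1,2,3,1,2,3,5,5,5,6,7,8,4,7,7,7], the desired result array.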

This community wiki entry was added from the comments so that the question no longer appears in the unanswered list.

A similar question about specialized expansion of a CUDA Thrust vector can be found on Stack Overflow: https://stackoverflow.com/questions/19870115/
