c++ - Copying data from a smaller vector to a larger vector

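I'm working on a GPU project using Thrust. Rather than try to explain what I'm doing, I'll present a simple, somewhat generic scenario that is easier to explain and may help others in the future.

Say I have a vector, and I want to modify every 3rd element of it.

The two solutions I can think of are:

1) Use a Thrust call such as transform to modify every 3rd element, possibly with a predicate or something similar.

2) Copy every 3rd element into a smaller vector, call transform on that, then copy those elements back to their original positions in the original vector.

Are either of these possible using Thrust?

Is there another way or a better way to pull this off?

All suggestions are appreciated!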

Best answer

Are either of these possible using Thrust?

Yes, both are possible.

Is there another way or a better way to pull this off?

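To some degree, the best approach depends on what else happens to this data in your application. Within the scope you've outlined, though, I think a Thrust strided range is a good choice, and possibly the best one.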

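Your first approach can certainly work as-is, using an appropriately defined functor to condition the behavior. (For example, zip your data with a counting_iterator to provide the data "index", and have the functor transform the data conditionally based on the corresponding "index".) However, it has the drawback that we need to launch 3 times as many threads as necessary, since only 1 out of every 3 threads does any actual vector modification. The strided range approach improves on this: every thread modifies a selected vector element, and no threads are "wasted".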
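To make the index-conditioned idea concrete, here is a minimal host-side sketch in standard C++ rather than Thrust. The helper name transform_every_nth is hypothetical and is not a Thrust API; the loop stands in for the one-thread-per-element launch described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Thrust): applies op to every element whose
// index is a multiple of stride and leaves all other elements untouched.
template <typename T, typename UnaryOp>
void transform_every_nth(std::vector<T>& data, std::size_t stride, UnaryOp op)
{
    assert(stride > 0);
    // The loop visits every index, mirroring one GPU thread per element;
    // only one visit in every `stride` does any work.
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (i % stride == 0) {
            data[i] = op(data[i]);
        }
    }
}
```

Calling transform_every_nth(v, 3, negate_fn) on an 8-element vector performs 8 index checks but only 3 modifications, which is the "wasted threads" cost the paragraph above describes.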

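Either approach (functor/predicate or strided range) still has a degree of "inefficiency": because of GPU data-loading characteristics (memory is fetched in contiguous chunks, so the skipped neighbors are loaded anyway), we access 3 times as much data as we need. Your second approach (copying every 3rd element into a smaller vector) would mitigate this inefficiency, but you pay the cost of the data copy operation, which negates any benefit in the context of a single transform operation. However, if you intend to perform several additional steps on the reduced vector, the overhead of copying the data to a smaller vector may be recovered over the sequence of remaining operations, which do not pay the "inefficiency" of accessing 3 times as much data.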

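Even so, the strided range approach is still useful, either for copying elements from the larger vector to the smaller one or for launching a transform directly on the larger vector that modifies only specific elements.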
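The question's second approach also needs a copy back into the original positions. The following host-side sketch in standard C++ shows the full round trip; gather_strided and scatter_strided are hypothetical helper names used only for illustration, not Thrust functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies elements 0, stride, 2*stride, ... of src
// into a new, smaller vector.
template <typename T>
std::vector<T> gather_strided(const std::vector<T>& src, std::size_t stride)
{
    assert(stride > 0);
    std::vector<T> out;
    out.reserve((src.size() + stride - 1) / stride);
    for (std::size_t i = 0; i < src.size(); i += stride) {
        out.push_back(src[i]);
    }
    return out;
}

// Hypothetical helper: writes small[k] back to dst[k * stride].
template <typename T>
void scatter_strided(const std::vector<T>& small, std::vector<T>& dst,
                     std::size_t stride)
{
    assert(stride > 0);
    // small must hold exactly one element per strided position of dst.
    assert(small.size() == (dst.size() + stride - 1) / stride);
    for (std::size_t k = 0; k < small.size(); ++k) {
        dst[k * stride] = small[k];
    }
}
```

In Thrust terms, the scatter step could presumably be written as thrust::copy(result1.begin(), result1.end(), strided.begin()), using the names from the example below and assuming result1 holds exactly one element per strided position. Treat that as an untested suggestion.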

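Here is a worked example, basically a simple modification of the strided range example, that demonstrates two possible approaches. The first is copy-then-transform; the second is transform in place: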

$ cat t996.cu
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/functional.h>
#include <thrust/fill.h>
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <thrust/transform.h>
#include <iostream>
#include <iterator>   // std::ostream_iterator

#define STRIDE 3

// this example illustrates how to make strided access to a range of values
// examples:
//   strided_range([0, 1, 2, 3, 4, 5, 6], 1) -> [0, 1, 2, 3, 4, 5, 6]
//   strided_range([0, 1, 2, 3, 4, 5, 6], 2) -> [0, 2, 4, 6]
//   strided_range([0, 1, 2, 3, 4, 5, 6], 3) -> [0, 3, 6]
//   ...

template <typename Iterator>
class strided_range
{
    public:

    typedef typename thrust::iterator_difference<Iterator>::type difference_type;

    struct stride_functor : public thrust::unary_function<difference_type,difference_type>
    {
        difference_type stride;

        stride_functor(difference_type stride)
            : stride(stride) {}

        __host__ __device__
        difference_type operator()(const difference_type& i) const
        {
            return stride * i;
        }
    };

    typedef typename thrust::counting_iterator<difference_type>                   CountingIterator;
    typedef typename thrust::transform_iterator<stride_functor, CountingIterator> TransformIterator;
    typedef typename thrust::permutation_iterator<Iterator,TransformIterator>     PermutationIterator;

    // type of the strided_range iterator
    typedef PermutationIterator iterator;

    // construct strided_range for the range [first,last)
    strided_range(Iterator first, Iterator last, difference_type stride)
        : first(first), last(last), stride(stride) {}

    iterator begin(void) const
    {
        return PermutationIterator(first, TransformIterator(CountingIterator(0), stride_functor(stride)));
    }

    iterator end(void) const
    {
        return begin() + ((last - first) + (stride - 1)) / stride;
    }

    protected:
    Iterator first;
    Iterator last;
    difference_type stride;
};

int main(void)
{
    thrust::device_vector<int> data(8);
    data[0] = 10;
    data[1] = 20;
    data[2] = 30;
    data[3] = 40;
    data[4] = 50;
    data[5] = 60;
    data[6] = 70;
    data[7] = 80;

    // print the initial data
    std::cout << "initial data: " << std::endl;
    thrust::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout, " "));  std::cout << std::endl;

    typedef thrust::device_vector<int>::iterator Iterator;

    // create strided_range with indices [0,3,6]
    strided_range<Iterator> strided(data.begin(), data.end(), STRIDE);
    // method 1: copy data from larger vector to smaller, then transform it:
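    // ceil(size / STRIDE) elements are selected; size/STRIDE+1 over-allocates when size is a multiple of STRIDE
    thrust::device_vector<int> result1((data.size() + STRIDE - 1) / STRIDE);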
    thrust::copy(strided.begin(), strided.end(), result1.begin());
    thrust::transform(result1.begin(), result1.end(), result1.begin(), thrust::negate<int>());
    std::cout << "method 1 result: " << std::endl;
    thrust::copy(result1.begin(), result1.end(), std::ostream_iterator<int>(std::cout, " "));
    std::cout << std::endl;

    // method 2: transform data "in-place":
    std::cout << "method 2 result: " << std::endl;
    thrust::transform(strided.begin(), strided.end(), strided.begin(), thrust::negate<int>());
    thrust::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout, " "));  std::cout << std::endl;

    return 0;
}
$ nvcc -o t996 t996.cu
$ ./t996
initial data:
10 20 30 40 50 60 70 80
method 1 result:
-10 -40 -70
method 2 result:
-10 20 30 -40 50 60 -70 80
$
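
The end() member of strided_range relies on a rounded-up integer division to count the selected elements. As a small standalone check of that arithmetic, here is a sketch in standard C++; strided_count is a hypothetical name, not part of the example or of Thrust.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of indices 0, stride, 2*stride, ... that fall
// inside a range of length n, i.e. ceil(n / stride) in integer arithmetic.
inline std::size_t strided_count(std::size_t n, std::size_t stride)
{
    assert(stride > 0);
    return (n + stride - 1) / stride;
}
```

For the 8-element vector and stride 3 used above, this yields 3 (indices 0, 3, 6). A simpler expression such as n / stride + 1 gives the same count only when n is not a multiple of stride.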

For more on c++ - copying data from a smaller vector to a larger vector, see the similar question on Stack Overflow: https://stackoverflow.com/questions/34111317/
