c++ - CUDA: setting a float array to the maximum possible value with memset (or fill, or ...)

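Edit: Thanks for the earlier answers. What I actually want is to do this in CUDA, and apparently CUDA has no fill function. I have to fill the matrix once per thread, so I want to be sure I am using the fastest method. Is this my best option?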

I want to set every element of a float matrix to the largest value a float can hold. What is the correct way to do this?

float *matrix=new float[N*N];

for (int i=0;i<N*N;i++){
        matrix[i]=999999;
}

Thanks in advance.

Best answer

The simplest way in CUDA is to use thrust::fill. Thrust is included in CUDA 4.0 and later, or you can install it separately if you are using CUDA 3.2.

#include <limits>
#include <thrust/fill.h>
#include <thrust/device_vector.h>
...
thrust::device_vector<float> v(N*N);
// Set every element to the largest finite float (or use 999999.f if you prefer)
thrust::fill(v.begin(), v.end(), std::numeric_limits<float>::max());
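
As background on why a fill is used here rather than memset: memset (and likewise cudaMemset) writes one byte value into every byte of the buffer, while the bit pattern of the largest finite float, 0x7F7FFFFF in IEEE 754 single precision, consists of two different byte values. The host-only C++ sketch below checks this. It assumes float is a 32-bit IEEE 754 type, and the helper names floatBits, hasUniformBytes and checkMemsetCannotMakeFloatMax are illustrative, not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Return the raw 32-bit pattern of a float (assumes 32-bit IEEE 754 floats).
inline std::uint32_t floatBits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// True if all four bytes of the pattern are equal, i.e. the value
// could be produced by a byte-wise memset.
inline bool hasUniformBytes(std::uint32_t bits) {
    const std::uint32_t b = bits & 0xFFu;
    return bits == b * 0x01010101u;
}

// Small self-check that can be called from main().
inline void checkMemsetCannotMakeFloatMax() {
    const float maxVal = std::numeric_limits<float>::max();
    assert(floatBits(maxVal) == 0x7F7FFFFFu);     // bytes 7F 7F FF FF differ
    assert(!hasUniformBytes(floatBits(maxVal)));  // so memset cannot produce it
    assert(hasUniformBytes(floatBits(0.0f)));     // 0.0f is all-zero bytes, memset works
}
```

Zero is the common case where a byte-wise set does work, which is why memset is fine for clearing a float array but not for setting it to an arbitrary value.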

You can also write plain CUDA code like this:

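#include <limits>

const int N = 1024; // example size (assumed; not given in the original)

// Grid-stride loop: each thread starts at its global index and advances by
// the total number of threads, so any launch configuration covers the array.
template <typename T>
__global__ void initMatrix(T *matrix, int width, int height, T val) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    for (int i = idx; i < width * height; i += gridDim.x * blockDim.x) {
        matrix[i] = val;
    }
}

int main(void) {
    float *matrix = 0;
    cudaMalloc((void**)&matrix, N*N * sizeof(float));

    int blockSize = 256;
    int numBlocks = (N*N + blockSize - 1) / blockSize; // round up to cover all elements
    initMatrix<<<numBlocks, blockSize>>>(matrix, N, N,
                                         std::numeric_limits<float>::max()); // or 999999.f if you prefer

    cudaDeviceSynchronize(); // wait for the kernel to finish
    cudaFree(matrix);
    return 0;
}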
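
To see why the grid-stride loop in initMatrix reaches every element whatever the launch configuration, the following host-only C++ sketch emulates the kernel's indexing with ordinary loops. It is not CUDA code and says nothing about performance; ceilDiv, simulateGridStrideFill and coversExactlyOnce are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Number of blocks needed so that numBlocks * blockSize >= n (rounds up).
inline int ceilDiv(int n, int blockSize) {
    return (n + blockSize - 1) / blockSize;
}

// Host emulation of the grid-stride loop: every simulated thread starts at
// its global index and advances by the total number of threads in the grid.
// Returns how many times each element was written.
inline std::vector<int> simulateGridStrideFill(std::vector<float>& data,
                                               int numBlocks, int blockSize,
                                               float val) {
    assert(numBlocks > 0 && blockSize > 0);
    const int n = static_cast<int>(data.size());
    const int stride = numBlocks * blockSize;  // gridDim.x * blockDim.x
    std::vector<int> writes(data.size(), 0);
    for (int block = 0; block < numBlocks; ++block) {         // blockIdx.x
        for (int thread = 0; thread < blockSize; ++thread) {  // threadIdx.x
            for (int i = block * blockSize + thread; i < n; i += stride) {
                data[i] = val;
                ++writes[i];
            }
        }
    }
    return writes;
}

// True if every element ends up equal to float max and was written once.
inline bool coversExactlyOnce(int n, int numBlocks, int blockSize) {
    std::vector<float> data(n, 0.0f);
    const float maxVal = std::numeric_limits<float>::max();
    const std::vector<int> writes =
        simulateGridStrideFill(data, numBlocks, blockSize, maxVal);
    for (int i = 0; i < n; ++i) {
        if (writes[i] != 1 || data[i] != maxVal) return false;
    }
    return true;
}
```

Even with a single block the emulated loop fills the whole array, only with fewer threads sharing the work, so the rounded-up block count matters for parallelism rather than for correctness.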

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6835960/
