c++ - Casting float* to char* while looping over a 2D array in linear memory on the device

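On page 21 of the CUDA 4.0 Programming Guide there is an example (given below) that shows how to loop over the elements of a 2D array of floats in device memory. The dimensions of the 2D array are width*height.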

// Host code
int width = 64, height = 64;
float* devPtr;
size_t pitch;
cudaMallocPitch(&devPtr, &pitch,
width * sizeof(float), height);
MyKernel<<<100, 512>>>(devPtr, pitch, width, height);


// Device code
__global__ void MyKernel(float* devPtr, size_t pitch, int width, int height)
{
    for (int r = 0; r < height; ++r)
    {
        float* row = (float*)((char*)devPtr + r * pitch);
        for (int c = 0; c < width; ++c)
        {
            float element = row[c];
        }
    }
}

Why is the device memory pointer devPtr cast to a character pointer char* inside the __global__ kernel function? Can someone explain that line? It looks a bit odd.

Best answer

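This is due to the way pointer arithmetic works in C. When you add an integer x to a pointer p, it does not always add x bytes. It adds x times sizeof(*p) (the size of the object that p points to).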

float* row = (float*)((char*)devPtr + r * pitch);

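By casting devPtr to char*, the offset that is applied (r * pitch) is counted in 1-byte increments, because a char is one byte. If the cast were not there, the offset applied to devPtr would be r * pitch times 4 bytes, because a float is four bytes. The pitch returned by cudaMallocPitch is a size in bytes, so the byte-wise offset is the one that lands on the start of row r.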
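The same row lookup can be tried on the host without a GPU. The following is a minimal sketch, not code from the programming guide: row_ptr and element_at are hypothetical helper names, and an ordinary host array stands in for the pitched device buffer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the CUDA API): start of row r in a pitched
// buffer. `pitch` is the row stride in BYTES, so the addition is done on a
// char* and the result is cast back to float*.
inline float* row_ptr(float* base, std::size_t r, std::size_t pitch) {
    // Assumption: pitch is a multiple of sizeof(float), so the row is aligned.
    assert(pitch % sizeof(float) == 0);
    char* bytes = reinterpret_cast<char*>(base);
    return reinterpret_cast<float*>(bytes + r * pitch);
}

// Hypothetical helper: element (r, c) of the pitched buffer.
inline float& element_at(float* base, std::size_t r, std::size_t c,
                         std::size_t pitch) {
    return row_ptr(base, r, pitch)[c];
}
```

For a buffer that reserves room for 8 floats per row, pitch is 8 * sizeof(float) bytes, and row_ptr(base, 2, pitch) is the same address as base + 16.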

For example, suppose we have:

float* devPtr = (float*)1000;  // pretend the buffer starts at address 1000
int r = 4;

First, let's leave out the cast:

float* result1 = (devPtr + r);
// address of result1 = 1000 + (r * sizeof(float)) = 1016

Now, if we include the cast:

float* result2 = (float*)((char*)devPtr + r);
// address of result2 = 1000 + (r * sizeof(char)) = 1004
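The two offsets can also be measured on a real array instead of a made-up address. This is a small sketch under that assumption; the function names are illustrative only.

```cpp
#include <cstddef>

// Bytes advanced by `p + r` when p is a float*: r elements of sizeof(float).
inline std::ptrdiff_t bytes_advanced_as_float(float* p, int r) {
    return reinterpret_cast<char*>(p + r) - reinterpret_cast<char*>(p);
}

// Bytes advanced by `(char*)p + r`: exactly r bytes, since sizeof(char) is 1.
inline std::ptrdiff_t bytes_advanced_as_char(float* p, int r) {
    return (reinterpret_cast<char*>(p) + r) - reinterpret_cast<char*>(p);
}
```

On a platform where sizeof(float) is 4, calling these with r = 4 on an array of at least 4 floats gives 16 and 4, which matches the 1016 and 1004 in the example.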

This question, "c++ - Casting float* to char* while looping over a 2D array in linear memory on the device", comes from Stack Overflow: https://stackoverflow.com/questions/8772172/
