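c++ - Creating an object in device code

I want to create an object on the device and assign it to a pointer that is available on the host. Is there something I'm doing wrong in here?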

__global__ void createAProduction(DeviceProduction* production) {
    production = new AProduction();
}

DeviceProduction * devAProduction = NULL;
cudaMalloc(&devAProduction, sizeof(AProduction));
createAProduction<<<1, 1>>>(devAProduction);
deviceProductions["A"] = devAProduction;

Somewhere else in the code I would like to do something like:

BatchOperation ** devBatchOperations;
    cudaMalloc((void **) &devBatchOperations, sizeof(BatchOperation *) * operationCount);

Then I populate that array of pointers like this:

void DeviceBatchExecutor::execute(vector<BatchOperation> operationsToPerform) {
    BatchOperation ** devBatchOperations;
    cudaMalloc((void **) &devBatchOperations, sizeof(BatchOperation *) * operationsToPerform.size());
    int i = 0;
    for(batchOperationIt it = operationsToPerform.begin(); it != operationsToPerform.end(); ++it) {
        BatchOperation * devBatchOperation;
        cudaMalloc(&devBatchOperation, sizeof(BatchOperation));
        cudaMemcpy(&devBatchOperation, &it, sizeof(BatchOperation), cudaMemcpyHostToDevice);
        Vertex * devInputNode = it->inputNode->allocateToDevice();
        cudaMemcpy(&(devBatchOperation->inputNode), &devInputNode, sizeof(Vertex *), cudaMemcpyDeviceToDevice);
        cudaMemcpy(&(devBatchOperation->production), &(it->production), sizeof(Production *), cudaMemcpyDeviceToDevice);
        cudaMemcpy(&devBatchOperations[i], &devBatchOperation, sizeof(BatchOperation *), cudaMemcpyDeviceToDevice);
        i++;
    }
    int operationCount = operationsToPerform.size();
    executeOperations<<<operationCount, 1>>>(devBatchOperations);
}

where production is a pointer to the device memory that holds the previously created AProduction object. Finally, I invoke the processing via

executeOperations<<<operationCount, 1>>>(devBatchOperations);

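So I rely on a virtual method call. Since these DeviceProduction objects were created on the device, the virtual pointer table also lives there, so it should work. See the example here. But it doesn't, because the batch operations received seem to be random, and the call crashes.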

__global__ void executeOperations(BatchOperation ** operation) {    
    operation[blockIdx.x]->production->apply(operation[blockIdx.x]->inputNode);
}

BatchOperation is a struct that holds the production to be executed.

struct BatchOperation {
    Production * production;
    Vertex * inputNode;
    Vertex * outputNode;
};

Best answer

Is there something I'm doing wrong in here?

Yes, probably. The pointer production is passed to the kernel by value:

createAProduction<<<1, 1>>>(devAProduction);

It points to a location in device memory, since you have already run cudaMalloc on it. This line of kernel code:

production = new AProduction();

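overwrites the pass-by-value copy of the production pointer with a new pointer returned by in-kernel new. That is almost certainly not what you intended. (And you haven't defined what AProduction is.) When that kernel call completes, the pass-by-value "copy" of the pointer is lost anyway. You might be able to fix it like this: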

*production = *(new DeviceProduction());

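Now your production pointer points to a region in device memory that holds an instantiated (on the device) object, which appears to be your intent. Creating a new object just to copy it may not be necessary, but that is not the crux of the issue I'm trying to point out here. You could also probably "fix" this issue by passing a pointer to a pointer to the kernel instead. You would then need to allocate an array of pointers, and assign one of the individual pointers directly using in-kernel new, as you have shown.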
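For illustration only, here is a host-only C++ sketch of the pointer semantics described above. It is an analogy, not CUDA code. The class bodies, the apply(int) signature, and the function names createByValue, createThroughPointer, and constructInPlace are all assumptions made up for this example. The last variant uses placement new rather than the assignment shown above, because assigning into raw storage copies members but does not run a constructor.

```cpp
#include <cstdlib>  // std::malloc, std::free (used by callers of constructInPlace)
#include <new>      // placement new

// Hypothetical stand-ins for the question's types (their real definitions were not shown).
struct DeviceProduction {
    virtual ~DeviceProduction() {}
    virtual int apply(int x) const { return x; }
};

struct AProduction : DeviceProduction {
    int apply(int x) const override { return x + 1; }
};

// Broken pattern: 'production' is a local copy of the caller's pointer,
// so reassigning it changes nothing for the caller.
void createByValue(DeviceProduction* production) {
    production = new AProduction();
    delete production;  // released here only so this sketch does not leak
}

// Pointer-to-pointer pattern: store the new address through the caller's slot.
void createThroughPointer(DeviceProduction** slot) {
    *slot = new AProduction();
}

// In-place pattern (an assumption, swapped in for the assignment above):
// construct the object inside storage the caller already allocated.
DeviceProduction* constructInPlace(void* storage) {
    return new (storage) AProduction();
}
```

Starting from a null DeviceProduction pointer p, createByValue(p) leaves p null, while createThroughPointer(&p) leaves it pointing at an AProduction, so a virtual call through p reaches the derived override. The same reasoning applies to a pointer argument passed to a kernel.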

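The rest of your code leaves a lot of items undefined. For example, in the code above it is not clear why you would declare production as a pointer to type DeviceProduction, but then try to assign an AProduction to it. Presumably that is some form of object inheritance, but it is not shown.

Since you haven't really provided anything approaching complete code, I borrowed some code from here to put together a complete worked example, showing object creation/setup in one kernel, followed by another kernel that invokes virtual methods on those objects: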

$ cat t1086.cu
#include <stdio.h>
#define N 4

class Polygon {
  protected:
    int width, height;
  public:
  __host__ __device__ void set_values (int a, int b)
      { width=a; height=b; }
  __host__ __device__ virtual int area ()
      { return 0; }
};

class Rectangle: public Polygon {
  public:
  __host__ __device__ int area ()
      { return width * height; }
};

class Triangle: public Polygon {
  public:
  __host__ __device__ int area ()
      { return (width * height / 2); }
};

__global__ void setup_f(Polygon ** d_polys) {
  int idx = threadIdx.x+blockDim.x*blockIdx.x;
  if (idx < N) {
    if (idx%2)
      d_polys[idx] = new Rectangle();
    else
      d_polys[idx] = new Triangle();
    d_polys[idx]->set_values(5,12);
}};

__global__ void area_f(Polygon ** d_polys) {
  int idx = threadIdx.x+blockDim.x*blockIdx.x;
  if (idx < N){
    printf("area of object %d = %d\n", idx, d_polys[idx]->area());
}};

int main () {
  Polygon **devPolys;
  cudaMalloc(&devPolys,N*sizeof(Polygon *));
  setup_f<<<1,N>>>(devPolys);
  area_f<<<1,N>>>(devPolys);
  cudaDeviceSynchronize();
}
$ nvcc -o t1086 t1086.cu
$ cuda-memcheck ./t1086
========= CUDA-MEMCHECK
area of object 0 = 30
area of object 1 = 60
area of object 2 = 30
area of object 3 = 60
========= ERROR SUMMARY: 0 errors
$

Regarding c++ - creating an object in device code, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35603035/
