c++ - OpenCL debug assertion failed when calling delete[]

Tags: c++ opencl

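I'm just getting started with OpenCL and have run into a problem deallocating memory. Everything executes fine and I get the data I expect, but I can't seem to call delete[] on my arrays. The code is shown below.

Calling delete[] on gpu_dst works fine, but calling it on matrix causes an error. Looking around, I found that I may have changed where the array pointer points, but since the array is only read in this program, I'm not sure where I would have done that.

Can anyone explain what is going on?

The full error is: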

---------------------------
Microsoft Visual C++ Runtime Library
---------------------------
Debug Assertion Failed!

Program: ...2014 - 2015\Parallel Systems\project-opencl\Debug\Project.exe
File: f:\dd\vctools\crt\crtw32\misc\dbgdel.cpp
Line: 52
Expression: _BLOCK_TYPE_IS_VALID(pHead->nBlockUse)

For information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts.

(Press Retry to debug the application)

Code:

#include <exception>
#include <stdexcept>
#include <iostream>
#include <sstream>
#include <string>
#include <cstdlib>
#include <vector>

#include <JC/util.hpp>

//#define A(x,y) a[x*width + y]



int main(int argc, char *argv[])
{
    try {
        if (argc != 4) {
            std::ostringstream oss;
            oss << "Usage: " << argv[0] << " <kernel_file> <kernel_name> <workgroup_size>";
            throw std::runtime_error(oss.str());
        }

        std::string kernel_file(argv[1]);
        std::string kernel_name(argv[2]);
        unsigned int workgroup_size = atoi(argv[3]);
        std::cout << "Workgroup size: " << workgroup_size << std::endl;

        // Initialize test matrix
        int matrix_size = 9;
        float input[9] = { 10, -7, 0, -3, 2, 6, 5, -1, 5 };
        // Allocate memory on the host and populate source

        float *gpu_dst = new float[matrix_size];
        float *matrix = input;

        // OpenCL initialization
        std::vector<cl::Platform> platforms;
        std::vector<cl::Device> devices;
        cl::Platform::get(&platforms);
        platforms[0].getDevices(CL_DEVICE_TYPE_GPU, &devices);
        cl::Context context(devices);
        cl::CommandQueue queue(context, devices[0], CL_QUEUE_PROFILING_ENABLE);

        // Allocate memory on the device
        cl::Buffer source_buf(context, CL_MEM_READ_ONLY, matrix_size*sizeof(float));
        cl::Buffer dest_buf(context, CL_MEM_WRITE_ONLY, matrix_size*sizeof(float));

        // Create the kernel
        cl::Program program = jc::buildProgram(kernel_file, context, devices);
        cl::Kernel kernel(program, kernel_name.c_str());
        // set the kernel arguments
        kernel.setArg<cl::Memory>(0, source_buf);
        kernel.setArg<cl::Memory>(1, dest_buf);
        kernel.setArg<cl_uint>(2, matrix_size);

        // transfer source data from the host to the device
        queue.enqueueWriteBuffer(source_buf, CL_TRUE, 0, matrix_size*sizeof(float), matrix);

        // execute the code on the device
        cl_ulong t;
        t = jc::runAndTimeKernel(kernel, queue, cl::NDRange(matrix_size), cl::NDRange(workgroup_size));

        // transfer destination data from the device to the host
        queue.enqueueReadBuffer(dest_buf, CL_TRUE, 0, matrix_size*sizeof(float), gpu_dst);

        // compute the data throughput in GB/s
        float throughput = (2.0*matrix_size*sizeof(float)) / t; // t is in nanoseconds
        std::cout << "Achieved throughput: " << throughput << std::endl;

        for (int i = 0; i < 9; i++)
        {
            std::cout << gpu_dst[i] << matrix[i] << std::endl;
        }

        std::cout << "Deallocating memory" << std::endl;

        // Deallocate memory
        delete[] gpu_dst;
        delete[] matrix;// <-- This causes an error, for some reason..

        std::cout << "Done" << std::endl;

        return 0;
    }
    catch (cl::Error& e) {
        std::cerr << e.what() << ": " << jc::readable_status(e.err());
        return 3;
    }
    catch (std::exception& e) {
        std::cerr << e.what() << std::endl;
        return 2;
    }
    catch (...) {
        std::cerr << "Unexpected error. Aborting!\n" << std::endl;
        return 1;
    }
}

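Best answer

matrix is not dynamically allocated, so calling delete[] on it is invalid. It points at input, an array with automatic storage, and delete[] may only be applied to a pointer that was returned by new[].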

float input[9] = { 10, -7, 0, -3, 2, 6, 5, -1, 5 };
float *matrix = input;
//..
delete [] matrix;  // wrong
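
To make the rule concrete, here is a minimal sketch without any OpenCL in it. The function names sum and doubled_sum are made up for illustration and are not part of the original program; only input, matrix, and gpu_dst mirror the question.

```cpp
#include <cassert>
#include <cstddef>

// Reads a buffer without taking ownership of it (hypothetical helper).
float sum(const float* data, std::size_t n)
{
    assert(data != nullptr);
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Hypothetical demo: one automatic array, one dynamic array.
float doubled_sum()
{
    const std::size_t n = 9;
    float input[n] = { 10, -7, 0, -3, 2, 6, 5, -1, 5 }; // automatic storage
    float* matrix = input;           // just an alias for input, not an allocation
    float* gpu_dst = new float[n];   // dynamic storage, owned by this function

    for (std::size_t i = 0; i < n; ++i) {
        gpu_dst[i] = 2.0f * matrix[i];
    }
    float total = sum(gpu_dst, n);

    delete[] gpu_dst;  // correct: matches the new[] above
    // No delete[] for matrix: input is released automatically at scope exit.
    return total;
}
```

Only the pointer that came from new[] is passed to delete[]. The stack array is left alone, which is the one change the program in the question needs.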

Second, why not use std::vector instead of new[]?

std::vector<float> gpu_dst(matrix_size);

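Then you don't need delete [] gpu_dst; at the end. Pass gpu_dst.data() wherever the raw pointer was used, such as in the enqueueReadBuffer call.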
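
As a rough sketch of the std::vector approach: copy_to_host below is a hypothetical stand-in for the blocking queue.enqueueReadBuffer(...) call in the question, so that the example compiles without OpenCL headers. read_results and the device_data array are likewise invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a blocking device-to-host copy. Like
// enqueueReadBuffer, it takes a byte count and a raw destination pointer.
void copy_to_host(const float* device_data, std::size_t bytes, void* host_ptr)
{
    assert(host_ptr != nullptr);
    std::memcpy(host_ptr, device_data, bytes);
}

// Hypothetical demo: the vector owns the host buffer.
std::vector<float> read_results()
{
    // Pretend this is what the kernel left in the device buffer.
    const float device_data[9] = { 10, -7, 0, -3, 2, 6, 5, -1, 5 };

    std::vector<float> gpu_dst(9);  // replaces: new float[matrix_size]

    // data() exposes the contiguous storage as a float*, which is what a
    // C-style API expects; size() * sizeof(float) gives the byte count.
    copy_to_host(device_data, gpu_dst.size() * sizeof(float), gpu_dst.data());

    return gpu_dst;  // no delete[]: the vector frees its storage itself
}
```

In the original program the equivalent call would presumably be queue.enqueueReadBuffer(dest_buf, CL_TRUE, 0, gpu_dst.size()*sizeof(float), gpu_dst.data()). The vector also releases its memory if an exception is thrown, which the new[]/delete[] version inside the try block does not.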

Source: this question on Stack Overflow: https://stackoverflow.com/questions/26565030/
