c - nvcc with AVX support cannot find gcc built-in intrinsics

This is my first question ;-)

I am trying to use AVX in a CUDA application (ccminer), but nvcc reports an error:

/usr/local/cuda/bin/nvcc -Xcompiler "-Wall -mavx" -O3 -I . -Xptxas "-abi=no -v" -gencode=arch=compute_50,code=\"sm_50,compute_50\" --maxrregcount=80 --ptxas-options=-v -I./compat/jansson -o x11/x11.o -c x11/x11.cu
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/avxintrin.h(118): error: identifier "__builtin_ia32_addpd256" is undefined

[...]

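This is only the first error. Many more built-in functions are reported as "undefined" :-(

Everything works fine for C/C++ sources with a .c or .cpp extension, but .cu files produce these errors :-( What am I doing wrong? I can compile ccminer, but I cannot add AVX intrinsics to a .cu file, only to .c files. I use the Intel intrinsics rather than the gcc built-ins.

Any help is much appreciated. Thanks in advance.

Linux Mint (Ubuntu 13) 64-bit, gcc 4.8.1, CUDA 6.5.

I do not expect AVX to work on the GPU. The .cu file contains a small portion of CPU-based code that I want to vectorize.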

Here is an example that reproduces the error. I took the simplest example I could find: http://computer-graphics.se/hello-world-for-cuda.html

I added one line at the beginning:

#include <immintrin.h>

and tried to compile it with:

nvcc cudahello.cu -Xcompiler -mavx

This produces the error:

/usr/lib/gcc/x86_64-linux-gnu/4.8/include/avxintrin.h(118): error: identifier "__builtin_ia32_addpd256" is undefined

The same code without #include <immintrin.h> compiles without problems.

Here is the complete code:

#include <stdio.h>
#include <stdlib.h>   /* EXIT_SUCCESS */
#if defined(__AVX__)
#include <immintrin.h>
#endif

const int N = 16; 
const int blocksize = 16; 

__global__ 
void hello(char *a, int *b) 
{
    a[threadIdx.x] += b[threadIdx.x];
}

int main()
{
    char a[N] = "Hello \0\0\0\0\0\0";
    int b[N] = {15, 10, 6, 0, -11, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

    char *ad;
    int *bd;
    const int csize = N*sizeof(char);
    const int isize = N*sizeof(int);

    printf("%s", a);

    cudaMalloc( (void**)&ad, csize ); 
    cudaMalloc( (void**)&bd, isize ); 
    cudaMemcpy( ad, a, csize, cudaMemcpyHostToDevice ); 
    cudaMemcpy( bd, b, isize, cudaMemcpyHostToDevice ); 

    dim3 dimBlock( blocksize, 1 );
    dim3 dimGrid( 1, 1 );
    hello<<<dimGrid, dimBlock>>>(ad, bd);
    cudaMemcpy( a, ad, csize, cudaMemcpyDeviceToHost ); 
    cudaFree( ad );
    cudaFree( bd );

    printf("%s\n", a);
    return EXIT_SUCCESS;
}

Compiling with

nvcc cudahello.cu -Xcompiler -mavx

produces the error, whereas

nvcc cudahello.cu

compiles cleanly.

Best answer

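I think I have the answer. Functions such as

__builtin_ia32_addpd256

are built into gcc, and nvcc does not know about them. The AVX headers pulled in by immintrin.h (avxintrin.h in the error above) are written in terms of these built-ins, so nvcc reports errors when it compiles a .cu file that includes immintrin.h. As a result, at least with this toolchain (gcc 4.8.1, CUDA 6.5), CUDA code and code that relies on gcc built-in functions cannot be mixed in one file.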
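
One way to act on this conclusion, sketched here under assumptions rather than taken from the original answer, is to move the AVX code into a separate host-only source file that gcc compiles directly, and to let the .cu file see only a plain function declaration. The file name avx_host.cpp and the helper add_arrays are hypothetical names chosen for illustration. The scalar fallback keeps the file buildable even when -mavx is not passed.

```cpp
// avx_host.cpp (hypothetical file name): host-only code compiled by g++,
// so nvcc never has to parse immintrin.h.
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: out[i] = x[i] + y[i] for n doubles.
// The .cu file would contain only this declaration:
//   void add_arrays(const double *x, const double *y, double *out, std::size_t n);
void add_arrays(const double *x, const double *y, double *out, std::size_t n)
{
    std::size_t i = 0;
#if defined(__AVX__)
    // Process four doubles per iteration using unaligned loads and stores.
    for (; i + 4 <= n; i += 4) {
        __m256d vx = _mm256_loadu_pd(x + i);
        __m256d vy = _mm256_loadu_pd(y + i);
        _mm256_storeu_pd(out + i, _mm256_add_pd(vx, vy));
    }
#endif
    // Scalar tail (or the whole array when AVX is not enabled).
    for (; i < n; ++i)
        out[i] = x[i] + y[i];
}
```

Assuming the layout above, the build would then be split so that only gcc receives -mavx:

g++ -O3 -mavx -c avx_host.cpp -o avx_host.o
nvcc -c cudahello.cu -o cudahello.o
nvcc cudahello.o avx_host.o -o cudahello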

A similar question about nvcc with AVX support not finding gcc built-in intrinsics can be found on Stack Overflow: https://stackoverflow.com/questions/26301577/
