opencv - Segmentation fault when trying to use any CUDA functionality

Tags: opencv cuda

Environment:

  • Ubuntu 14.04 with the stock development tools (CMake 3.4.3, GCC 4.8.4, ...)
  • OpenCV 3.1.0
  • CUDA 7.5

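OpenCV was configured with the following cmake command. It builds and installs without errors, and all "normal" (non-CUDA) functionality works as expected: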

cmake -D CMAKE_BUILD_TYPE=DEBUG -D CMAKE_INSTALL_PREFIX=/usr/local -D WITH_CUDA=ON -D ENABLE_FAST_MATH=1 -D CUDA_FAST_MATH=1 -D WITH_CUBLAS=1 -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-3.1.0/modules -D PYTHON_INCLUDE_DIR=$(python-config --prefix)/include/python2.7 -D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D BUILD_DOCS=OFF -D BUILD_EXAMPLES=ON ..

OpenCV configuration as reported by cmake:

-- General configuration for OpenCV 3.1.0 =====================================
--   Version control:               d097d6d
-- 
--   Platform:
--     Host:                        Linux 4.2.0-27-generic x86_64
--     CMake:                       3.4.3
--     CMake generator:             Unix Makefiles
--     CMake build tool:            /usr/bin/make
--     Configuration:               DEBUG
-- 
--   C/C++:
--     Built as dynamic libs?:      YES
--     C++ Compiler:                /usr/bin/c++  (ver 4.8.4)
--     C++ flags (Release):         -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wno-narrowing -Wno-delete-non-virtual-dtor -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffast-math -msse -msse2 -mno-avx -msse3 -mno-ssse3 -mno-sse4.1 -mno-sse4.2 -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
--     C++ flags (Debug):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wno-narrowing -Wno-delete-non-virtual-dtor -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffast-math -msse -msse2 -mno-avx -msse3 -mno-ssse3 -mno-sse4.1 -mno-sse4.2 -ffunction-sections -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
--     C Compiler:                  /usr/bin/cc
--     C flags (Release):           -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wno-narrowing -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffast-math -msse -msse2 -mno-avx -msse3 -mno-ssse3 -mno-sse4.1 -mno-sse4.2 -ffunction-sections -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
--     C flags (Debug):             -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wno-narrowing -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffast-math -msse -msse2 -mno-avx -msse3 -mno-ssse3 -mno-sse4.1 -mno-sse4.2 -ffunction-sections -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
--     Linker flags (Release):
--     Linker flags (Debug):
--     Precompiled headers:         YES
--     Extra dependencies:          /usr/lib/x86_64-linux-gnu/libpng.so /usr/lib/x86_64-linux-gnu/libz.so /usr/lib/x86_64-linux-gnu/libtiff.so /usr/lib/x86_64-linux-gnu/libjasper.so /usr/lib/x86_64-linux-gnu/libjpeg.so gtk-x11-2.0 gdk-x11-2.0 atk-1.0 gio-2.0 pangoft2-1.0 pangocairo-1.0 gdk_pixbuf-2.0 cairo pango-1.0 fontconfig gobject-2.0 freetype gthread-2.0 glib-2.0 dc1394 v4l1 v4l2 avcodec avformat avutil swscale dl m pthread rt cudart nppc nppi npps cublas cufft -L/usr/local/cuda/lib64
--     3rdparty dependencies:       libwebp IlmImf libprotobuf
-- 
--   OpenCV modules:
--     To be built:                 cudev core cudaarithm flann imgproc ml reg surface_matching video cudabgsegm cudafilters cudaimgproc cudawarping dnn fuzzy imgcodecs photo shape videoio cudacodec highgui objdetect plot xobjdetect xphoto bgsegm bioinspired dpm face features2d line_descriptor saliency text calib3d ccalib cudafeatures2d cudalegacy cudaobjdetect cudaoptflow cudastereo datasets rgbd stereo structured_light superres tracking videostab xfeatures2d ximgproc aruco optflow stitching python2
--     Disabled:                    world contrib_world
--     Disabled by dependency:      -
--     Unavailable:                 java python3 ts viz cvv hdf matlab sfm
-- 
--   GUI: 
--     QT:                          NO
--     GTK+ 2.x:                    YES (ver 2.24.23)
--     GThread :                    YES (ver 2.40.2)
--     GtkGlExt:                    NO
--     OpenGL support:              NO
--     VTK support:                 NO
-- 
--   Media I/O: 
--     ZLib:                        /usr/lib/x86_64-linux-gnu/libz.so (ver 1.2.8)
--     JPEG:                        /usr/lib/x86_64-linux-gnu/libjpeg.so (ver )
--     WEBP:                        build (ver 0.3.1)
--     PNG:                         /usr/lib/x86_64-linux-gnu/libpng.so (ver 1.2.50)
--     TIFF:                        /usr/lib/x86_64-linux-gnu/libtiff.so (ver 42 - 4.0.3)
--     JPEG 2000:                   /usr/lib/x86_64-linux-gnu/libjasper.so (ver 1.900.1)
--     OpenEXR:                     build (ver 1.7.1)
--     GDAL:                        NO
-- 
--   Video I/O:
--     DC1394 1.x:                  NO
--     DC1394 2.x:                  YES (ver 2.2.1)
--     FFMPEG:                      YES
--       codec:                     YES (ver 54.35.0)
--       format:                    YES (ver 54.20.4)
--       util:                      YES (ver 52.3.0)
--       swscale:                   YES (ver 2.1.1)
--       resample:                  NO
--       gentoo-style:              YES
--     GStreamer:                   NO
--     OpenNI:                      NO
--     OpenNI PrimeSensor Modules:  NO
--     OpenNI2:                     NO
--     PvAPI:                       NO
--     GigEVisionSDK:               NO
--     UniCap:                      NO
--     UniCap ucil:                 NO
--     V4L/V4L2:                    Using libv4l1 (ver 1.0.1) / libv4l2 (ver 1.0.1)
--     XIMEA:                       NO
--     Xine:                        NO
--     gPhoto2:                     NO
-- 
--   Parallel framework:            pthreads
-- 
--   Other third-party libraries:
--     Use IPP:                     9.0.1 [9.0.1]
--          at:                     /home/developer/projects/opencv/3rdparty/ippicv/unpack/ippicv_lnx
--     Use IPP Async:               NO
--     Use VA:                      NO
--     Use Intel VA-API/OpenCL:     NO
--     Use Eigen:                   NO
--     Use Cuda:                    YES (ver 7.5)
--     Use OpenCL:                  YES
--     Use custom HAL:              NO
-- 
--   NVIDIA CUDA
--     Use CUFFT:                   YES
--     Use CUBLAS:                  YES
--     USE NVCUVID:                 NO
--     NVIDIA GPU arch:             20 21 30 35
--     NVIDIA PTX archs:            30
--     Use fast math:               YES
-- 
--   OpenCL:
--     Version:                     dynamic
--     Include path:                /home/developer/projects/opencv/3rdparty/include/opencl/1.2
--     Use AMDFFT:                  NO
--     Use AMDBLAS:                 NO
-- 
--   Python 2:
--     Interpreter:                 /usr/bin/python2.7 (ver 2.7.6)
--     Libraries:                   /usr/lib/x86_64-linux-gnu/libpython2.7.so (ver 2.7.6)
--     numpy:                       /usr/lib/python2.7/dist-packages/numpy/core/include (ver 1.8.2)
--     packages path:               lib/python2.7/dist-packages
-- 
--   Python 3:
--     Interpreter:                 /usr/bin/python3.4 (ver 3.4.3)
-- 
--   Python (for build):            /usr/bin/python2.7
-- 
--   Java:
--     ant:                         NO
--     JNI:                         NO
--     Java wrappers:               NO
--     Java tests:                  NO
-- 
--   Matlab:                        Matlab not found or implicitly disabled
-- 
--   Tests and samples:
--     Tests:                       NO
--     Performance tests:           NO
--     C/C++ Examples:              YES
-- 
--   Install path:                  /usr/local
-- 
--   cvconfig.h is in:              /home/developer/projects/opencv/build
-- -----------------------------------------------------------------

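However, any attempt to call OpenCV's GPU-accelerated CUDA functionality results in a segmentation fault. None of the example projects in samples/gpu run correctly. The fault seems to be triggered by the first call to a non-trivial CUDA function. A few trivial CUDA functions, such as getCudaEnabledDeviceCount() and printShortCudaDeviceInfo(), do execute normally and return plausible data, but a subsequent OpenCV call (not necessarily a CUDA function) then segfaults.

I also built and ran several CUDA library samples from /usr/local/cuda-7.5/samples (1_utilities/deviceQuery, 1_utilities/bandwidthTest), and they appear to work fine.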

A simple program that shows the problem:

Makefile

CFLAGS = `pkg-config --cflags opencv`
LIBS = `pkg-config --libs opencv`

fail : fail.cpp
	g++ $(CFLAGS) $< $(LIBS) -o $@

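fail.cpp

#include <cstdio>   // printf
#include "opencv2/cvconfig.h"
#include "opencv2/core.hpp"
#include "opencv2/cudaarithm.hpp"

using namespace std;
using namespace cv;
using namespace cv::cuda;


int main(int argc, const char* argv[])
{
    // A "trivial" CUDA call: it returns normally, but the crash only
    // happens when this line is present. The result is deliberately unused.
    int i = getCudaEnabledDeviceCount();

    Mat src(1000, 1000, CV_32F);
    Mat dst;

    printf("got to here #1\n");

    // Plain CPU transpose, not a CUDA function; this is where it segfaults.
    cv::transpose(src, dst);

    printf("got to here #2\n");

    return 0;
}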

Program output:

got to here #1
Segmentation fault (core dumped)

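If the getCudaEnabledDeviceCount() statement is commented out, the program runs to completion as expected.

It smells like a calling-convention mismatch between the OpenCV and CUDA libraries, or something similar, but I thought this was supposed to "just work"...

[EDIT]

With core dumps enabled, I get the following call stack in gdb: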

developer@odin:~/projects/temp/opencv_CUDA$ gdb -c core -e fail
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".

warning: exec file is newer than core file.
[New LWP 28690]
[New LWP 28691]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./fail'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
66  ../nptl/pthread_mutex_lock.c: No such file or directory.
(gdb) bt
#0  __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
#1  0x00007f9e85782008 in ?? () from /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
#2  0x00007f9e85836671 in ?? () from /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
#3  0x00007f9e858367e5 in ?? () from /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
#4  0x00007f9e85787cb4 in ?? () from /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
#5  0x00007f9e857894e7 in ?? () from /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
#6  0x00007f9e8575cc66 in ?? () from /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
#7  0x00007f9e8565bf3d in ?? () from /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
#8  0x00007f9e8565bed8 in ?? () from /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
#9  0x00007f9e85fe2022 in ?? () from /usr/local/cuda/lib64/libOpenCL.so
#10 0x00007f9e85fe3d42 in ?? () from /usr/local/cuda/lib64/libOpenCL.so
#11 0x00007f9e85fe34d0 in clGetPlatformIDs () from /usr/local/cuda/lib64/libOpenCL.so
#12 0x00007f9e895cea83 in (anonymous namespace)::opencl_fn3<58, int, unsigned int, _cl_platform_id**, unsigned int*>::switch_fn (p1=0, p2=0x0, 
p3=0x7ffe56410aac) at /home/developer/projects/opencv/modules/core/src/opencl/runtime/autogenerated/opencl_core_impl.hpp:127
#13 0x00007f9e8965c452 in cv::ocl::haveOpenCL () at /home/developer/projects/opencv/modules/core/src/ocl.cpp:1466
#14 0x00007f9e8965c4af in cv::ocl::useOpenCL () at /home/developer/projects/opencv/modules/core/src/ocl.cpp:1487
#15 0x00007f9e896a6962 in cv::transpose (_src=..., _dst=...) at /home/developer/projects/opencv/modules/core/src/matrix.cpp:3235
#16 0x0000000000400f3c in ?? ()
#17 0x00007ffe56410e88 in ?? ()
#18 0x0000000156410df8 in ?? ()
#19 0x00007ffe56410dd0 in ?? ()
#20 0x000000018a755d48 in ?? ()
#21 0x0000000001010000 in ?? ()
#22 0x00007ffe56410cd0 in ?? ()
#23 0x0000000000000000 in ?? ()

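Best answer

I found some matches for the call stack above on the web, and they indicate that the underlying problem is a bug in the NVIDIA driver. Upgrading from 361.93 to 367.57 (the current driver for my GeForce GTX 970 card) appears to have fixed the problem.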
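To check which side of that upgrade a machine is on, the driver version can be read from the NVRM line of /proc/driver/nvidia/version. The following is only a minimal sketch under stated assumptions: the names DriverVersion, parseDriverVersion, atLeast and driverLooksNewEnough are hypothetical helpers, and the 367.57 threshold is simply the version reported as working in this answer, not an officially documented fix boundary. Versions between 361.93 and 367.57 are untested here.

```cpp
#include <cctype>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper type: the first two components of a driver version,
// e.g. 367 and 57 for "367.57".
struct DriverVersion {
    int series;
    int revision;
};

// Scans a single line for the first whitespace-separated token that starts
// with "<digits>.<digits>" and stores those two numbers in `out`.
// Returns false if the line contains no such token.
bool parseDriverVersion(const std::string& line, DriverVersion& out)
{
    std::istringstream in(line);
    std::string tok;
    while (in >> tok) {
        std::size_t i = 0;
        int series = 0;
        while (i < tok.size() && std::isdigit(static_cast<unsigned char>(tok[i])))
            series = series * 10 + (tok[i++] - '0');
        if (i == 0 || i >= tok.size() || tok[i] != '.')
            continue;  // no leading digits, or no dot after them
        std::size_t start = ++i;
        int revision = 0;
        while (i < tok.size() && std::isdigit(static_cast<unsigned char>(tok[i])))
            revision = revision * 10 + (tok[i++] - '0');
        if (i == start)
            continue;  // nothing numeric after the dot
        out.series = series;
        out.revision = revision;
        return true;
    }
    return false;
}

// True if `v` is the same as or newer than series.revision.
bool atLeast(const DriverVersion& v, int series, int revision)
{
    return v.series > series || (v.series == series && v.revision >= revision);
}

// Reads the NVRM line of the given file and reports whether the driver is at
// least 367.57 (assumption: the version that worked for the answer's author).
bool driverLooksNewEnough(const std::string& path)
{
    std::ifstream file(path.c_str());
    std::string line;
    while (std::getline(file, line)) {
        DriverVersion v;
        if (line.find("NVRM version") != std::string::npos &&
            parseDriverVersion(line, v)) {
            std::cout << "driver " << v.series << "." << v.revision << "\n";
            return atLeast(v, 367, 57);
        }
    }
    std::cout << "could not determine driver version\n";
    return false;
}
```

Calling driverLooksNewEnough("/proc/driver/nvidia/version") from main gives a quick yes/no before rebuilding anything. Only the NVRM line is parsed because the same file also carries a GCC version line that would otherwise match the same pattern.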

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/40619532/
