c++ - Linux executable crashes on emplace_back when opening a shared library via dlopen

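I have created a shared library (a "dylib" on OSX, an "so" on Ubuntu) and an executable that loads it. If I simply link the shared library into the executable (link_libraries in cmake), everything works fine.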

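Now, instead of linking it, I open the library with dlopen/dlsym. On OSX this works and the executable runs smoothly, but on Linux it crashes at a specific point. Here is the valgrind trace: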

==7253== Jump to the invalid address stated on the next line
==7253==    at 0x0: ???
==7253==    by 0x61DB539: void __gnu_cxx::new_allocator<std::thread>::construct<std::thread, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(std::thread*, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (new_allocator.h:136)
==7253==    by 0x61D7780: void std::allocator_traits<std::allocator<std::thread> >::construct<std::thread, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(std::allocator<std::thread>&, std::thread*, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (alloc_traits.h:475)
==7253==    by 0x61D7840: void std::vector<std::thread, std::allocator<std::thread> >::_M_realloc_insert<ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(__gnu_cxx::__normal_iterator<std::thread*, std::vector<std::thread, std::allocator<std::thread> > >, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (vector.tcc:415)
==7253==    by 0x61D371D: void std::vector<std::thread, std::allocator<std::thread> >::emplace_back<ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (vector.tcc:105)
==7253==    by 0x61D19F5: ThreadPool::ThreadPool(unsigned long) (ThreadPool.h:38)
==7253==    by 0x112545: main (testexecutable.cpp:216)
==7253==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==7253== Process terminating with default action of signal 11 (SIGSEGV)
==7253==  Bad permissions for mapped region at address 0x0
==7253==    at 0x0: ???
==7253==    by 0x61DB539: void __gnu_cxx::new_allocator<std::thread>::construct<std::thread, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(std::thread*, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (new_allocator.h:136)
==7253==    by 0x61D7780: void std::allocator_traits<std::allocator<std::thread> >::construct<std::thread, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(std::allocator<std::thread>&, std::thread*, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (alloc_traits.h:475)
==7253==    by 0x61D7840: void std::vector<std::thread, std::allocator<std::thread> >::_M_realloc_insert<ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(__gnu_cxx::__normal_iterator<std::thread*, std::vector<std::thread, std::allocator<std::thread> > >, ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (vector.tcc:415)
==7253==    by 0x61D371D: void std::vector<std::thread, std::allocator<std::thread> >::emplace_back<ThreadPool::ThreadPool(unsigned long)::{lambda()#1}>(ThreadPool::ThreadPool(unsigned long)::{lambda()#1}&&) (vector.tcc:105)
==7253==    by 0x61D19F5: ThreadPool::ThreadPool(unsigned long) (ThreadPool.h:38)
==7253==    by 0x112545: main (testexecutable.cpp:216)

The code actually looks like this:

...
// need to keep track of threads so we can join them
std::vector< std::thread > workers;
// the task queue
std::queue< std::function<void()> > tasks;
...

// the constructor just launches some amount of workers
inline ThreadPool::ThreadPool(size_t threads)
: stop(false)
{
    for (size_t i = 0; i < threads; ++i)
        workers.emplace_back(
            [this]
            {
...

The crash happens exactly at the emplace_back call. Any ideas why this happens? GCC is 7.3.0, Ubuntu 18.04.
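
For reference, the dlopen/dlsym loading path described in the question can be sketched roughly as follows. This is only a minimal illustration, not the code from the repository: the helper names `open_library` and `load_symbol` are hypothetical, and the error handling via exceptions is an assumption.

```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Hypothetical helper: opens a shared library and throws if loading fails.
void* open_library(const std::string& path) {
    void* handle = dlopen(path.c_str(), RTLD_NOW);
    if (!handle) {
        const char* err = dlerror();
        throw std::runtime_error(err ? err : "dlopen failed");
    }
    return handle;
}

// Hypothetical helper: looks up a symbol and casts it to the requested
// function pointer type. The caller must supply the correct signature.
template <typename Fn>
Fn load_symbol(void* handle, const char* name) {
    dlerror();  // clear any stale error state before the lookup
    void* sym = dlsym(handle, name);
    if (const char* err = dlerror()) {
        throw std::runtime_error(err);
    }
    return reinterpret_cast<Fn>(sym);
}
```

In the setting of the question, one might call `open_library("./liblibrary.so")` and then `load_symbol<void (*)()>(handle, "initFunction")`, assuming `initFunction` is exported with C linkage and takes no arguments (the question does not show its signature). On older glibc versions the executable needs to be linked against `dl` for these calls.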

Edit 1

Link to github repo with code

Edit 2

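OK, here is part of the solution. My colleague suggested that the cause might be confusion from placing the function pointer (lambda) on different stacks in the executable and the shared library. I have not been able to verify that yet, but here is what I found: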

ldd test
linux-vdso.so.1 (0x00007ffd6bdc7000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd8766de000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fd876350000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd875f5f000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd876ae5000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd875bc1000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fd8759a9000)

pthread is not listed as a required library. The shared library, however, does reference pthread.

ldd liblibrary.so 
linux-vdso.so.1 (0x00007ffc97b74000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007efce4d30000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007efce49a2000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007efce478a000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007efce4399000)
/lib64/ld-linux-x86-64.so.2 (0x00007efce515f000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007efce3ffb000)

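Even though it is referenced, any call inside the shared library to a function that needs pthread crashes the application. It looks as if the pthread library is not loaded at all.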
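
One way to probe that suspicion at run time is to ask the dynamic loader whether a pthread symbol is visible in the process. The sketch below is an assumption about how such a check could look, not something taken from the original code; `symbol_is_visible` is a hypothetical helper name.

```cpp
#include <dlfcn.h>

// Hypothetical diagnostic: returns true if `name` resolves in the global
// symbol scope of the running process (the executable plus every library
// loaded into that scope so far).
bool symbol_is_visible(const char* name) {
    dlerror();  // clear any stale error state
    return dlsym(RTLD_DEFAULT, name) != nullptr;
}
```

Calling `symbol_is_visible("pthread_create")` before and after the dlopen of the library would show whether the symbol ever becomes reachable. Note that on glibc 2.34 and later the pthread functions live in libc itself, so this check would be expected to succeed there regardless of how the executable was linked.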

If I put a call that uses a thread into main, i.e.

void dummyfunction() {}

int main(int argc, char* argv[]) {
   std::thread dummy(&dummyfunction);
   dummy.join();
   ...
   // dlopen/dlsym here...
   ...
   initFunction();
   ...
   // dlclose
   return 0;
}

pthread is added to the list of libraries,

ldd test
linux-vdso.so.1 (0x00007ffdc7bd0000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f5d13777000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f5d13573000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f5d131e5000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f5d12fcd000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5d12bdc000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5d13b9c000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5d1283e000)

it gets loaded, and everything works in the shared library.

But why is the pthread library not loaded from the shared library?

I also tried calling dlopen on pthread from within the shared library, but without success.

Best answer

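Thanks to @o11c for pointing this out. One way to solve this is to add a flag to the linker for the executable and to add pthread explicitly to the library list. Ubuntu's toolchain links with --as-needed by default, so pthread would otherwise be dropped from the executable, which never references it directly.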

SET(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--no-as-needed")
target_link_libraries(test pthread dl)

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/52580992/
