C++11 vector argument to thread appears to be uninitialized

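I am trying to build a proof of concept for inter-thread communication by means of shared state: the main thread creates the worker threads, gives each one a separate vector by reference, lets each worker do its job and fill its vector with results, and finally collects the results.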

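However, weird things are happening that I cannot explain other than by some kind of race between the initialization of the vectors and the launch of the worker threads. Here is the code.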

#include <iostream>
#include <vector>
#include <thread>


class Case {
public:
    int val;
    Case(int i):val(i) {}
};

void
run_thread (std::vector<Case*> &case_list, int idx)
{
    std::cout << "size in thread " << idx <<": " << case_list.size() << '\n';
    for (int i=0; i<10; i++) {
        case_list.push_back(new Case(i));
    }
}

int
main(int argc, char **argv)
{
    int nthrd = 3;
    std::vector<std::thread> threads;
    std::vector<std::vector<Case*>> case_lists;

    for (int i=0; i<nthrd; i++) {
        case_lists.push_back(std::vector<Case*>());
        std::cout << "size of " << i << " in main:" << case_lists[i].size() << '\n';
        threads.push_back( std::thread( run_thread, std::ref(case_lists[i]), i) );
    }

    std::cout << "All threads lauched.\n";

    for (int i=0; i<nthrd; i++) {
        threads[i].join();
        for (const auto cp:case_lists[i]) {
            std::cout << cp->val << '\n';
        }
    }
    return 0;
}

Tested on repl.it (gcc 4.6.3), the program gives the following result:

size of 0 in main:0
size of 1 in main:0
size of 2 in main:0
All threads lauched.
size in thread 0: 18446744073705569740
size in thread 2: 0
size in thread 1: 0
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
exit status -1

On my computer, besides output similar to the above, I also get:

Segmentation fault (core dumped)

It looks like thread 0 is getting a vector that has not been initialized, even though the vector appears to be properly initialized in main.

To isolate the problem, I tried going single-threaded by changing the line:

threads.push_back( std::thread( run_thread, std::ref(case_lists[i]), i) );

to

run_thread(case_lists[i], i);

and commenting out:

threads[i].join();

Now the program runs as expected, with the "threads" running one after another before main collects the results.

My question is: what is wrong with the multithreaded version above?

Best answer

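References (and iterators) into a vector are invalidated any time the vector's capacity changes. In this program the vector in question is the outer one, case_lists: each thread holds a reference to one of its elements while main keeps calling push_back on it. The exact rules for overallocation vary by implementation, but odds are you have at least one capacity change between the first push_back and the last, and every reference taken before that final capacity increase becomes garbage the moment it occurs. Using such a reference invokes undefined behavior.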
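To see the invalidation in isolation, without any threads, here is a minimal sketch. The function name first_element_moved is made up for illustration; it records the address of the first element as an integer (so no dangling pointer is ever read), pushes until the capacity grows, and reports whether the element now lives somewhere else.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns true if the first element's address changed
// after enough push_back calls to force the vector to reallocate.
bool first_element_moved() {
    std::vector<int> v;
    v.push_back(0);

    // Keep the old address as an integer so we never use a dangling pointer.
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(&v[0]);
    const std::size_t old_capacity = v.capacity();

    // Insert until the capacity grows, i.e. until a reallocation happens.
    while (v.capacity() == old_capacity) {
        v.push_back(1);
    }
    assert(v.capacity() > old_capacity);

    const std::uintptr_t after = reinterpret_cast<std::uintptr_t>(&v[0]);
    return before != after;
}
```

Any reference or pointer taken at the old address is exactly what the worker threads in the question are left holding.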

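Either reserve the total size of the vector up front (so the push_back calls cause no capacity increase), initialize the whole vector to its final size up front (so no resizing occurs at all), or populate the vector completely in one loop and only then launch the threads (so all resizing happens before you take any references). The simplest fix here is to initialize it to the final size, changing: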

std::vector<std::vector<Case*>> case_lists;

for (int i=0; i<nthrd; i++) {
    case_lists.push_back(std::vector<Case*>());
    std::cout << "size of " << i << " in main:" << case_lists[i].size() << '\n';
    threads.push_back( std::thread( run_thread, std::ref(case_lists[i]), i) );
}

to:

std::vector<std::vector<Case*>> case_lists(nthrd);  // Default initialize nthrd elements up front

for (int i=0; i<nthrd; i++) {
    // No push_back needed
    std::cout << "size of " << i << " in main:" << case_lists[i].size() << '\n';
    threads.push_back( std::thread( run_thread, std::ref(case_lists[i]), i) );
}
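The other two options listed earlier (reserve, or fill first and launch afterwards) could look roughly like the sketch below. It is a simplification under stated assumptions: the workers store plain int results instead of Case*, and the names fill_results, run_with_reserve and run_fill_then_launch are invented for this example, not taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical worker: fills its own result vector with 0..9.
void fill_results(std::vector<int>& results) {
    for (int i = 0; i < 10; ++i) {
        results.push_back(i);
    }
}

// Reserve variant: the outer vector never reallocates during the loop,
// so references to its elements stay valid while threads are running.
std::vector<std::vector<int>> run_with_reserve(std::size_t nthrd) {
    std::vector<std::vector<int>> lists;
    lists.reserve(nthrd);
    assert(lists.capacity() >= nthrd);

    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < nthrd; ++i) {
        lists.push_back(std::vector<int>());
        threads.push_back(std::thread(fill_results, std::ref(lists[i])));
    }
    for (std::size_t i = 0; i < threads.size(); ++i) {
        threads[i].join();
    }
    return lists;
}

// Fill-then-launch variant: all growth of the outer vector happens
// before any reference to one of its elements is taken.
std::vector<std::vector<int>> run_fill_then_launch(std::size_t nthrd) {
    std::vector<std::vector<int>> lists;
    for (std::size_t i = 0; i < nthrd; ++i) {
        lists.push_back(std::vector<int>());
    }

    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < nthrd; ++i) {
        threads.push_back(std::thread(fill_results, std::ref(lists[i])));
    }
    for (std::size_t i = 0; i < threads.size(); ++i) {
        threads[i].join();
    }
    return lists;
}
```

Note that the threads vector itself may reallocate freely: std::thread is movable and nothing holds a reference to its elements. Only the inner vectors grow while the workers run, and each of those is touched by exactly one thread.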

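You might think vector would overallocate fairly aggressively, but at least on many popular compilers this is not the case. Both gcc and clang follow a strict doubling pattern, so each of the first three insertions reallocates (capacity goes to 1, then 2, then 4). The reference to the first element is invalidated by the insertion of the second, and the reference to the second element is invalidated by the insertion of the third.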
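If you want to check the growth pattern on your own toolchain, a small sketch like the following records every distinct capacity seen while inserting. The helper name observed_capacities is hypothetical, and the exact sequence is implementation-specific; the standard does not mandate doubling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: push n elements one at a time and record each
// distinct capacity the vector reports along the way.
std::vector<std::size_t> observed_capacities(std::size_t n) {
    std::vector<int> v;
    std::vector<std::size_t> caps;
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        assert(v.capacity() >= v.size());
        // A new capacity value means this push_back reallocated.
        if (caps.empty() || caps.back() != v.capacity()) {
            caps.push_back(v.capacity());
        }
    }
    return caps;
}
```

Each entry in the returned list corresponds to one reallocation, which is one point at which every outstanding reference into the vector became invalid.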

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/53200434/
