c++ - 并发写入 unordered_map (C++) 中的不同存储桶?

标签 c++ multithreading c++11 concurrency

这里是 C++ 新手。我正在尝试在 unordered_map 中同时写入不同的存储桶。从我可以通过搜索得知,我的理解是这应该是一个线程安全的操作。我(可能不正确)的理解是基于答案 herehere ,以及 C++11 标准的引用部分(特别是第 2 项——强调我的):

23.2.2 Container data races [container.requirements.dataraces]

1 For purposes of avoiding data races (17.6.5.9), implementations shall consider the following functions to be const: begin, end, rbegin, rend, front, back, data, find, lower_bound, upper_bound, equal_range, at and, except in associative or unordered associative containers, operator[].

2 Notwithstanding (17.6.5.9), implementations are required to avoid data races when the contents of the contained object in different elements in the same sequence, excepting vector<bool>, are modified concurrently.

3 [ Note: For a vector x with a size greater than one, x[1] = 5 and *x.begin() = 10 can be executed concurrently without a data race, but x[0] = 5 and *x.begin() = 10 executed concurrently may result in a data race. As an exception to the general rule, for a vector < bool > y, y[0] = true may race with y[1] = true. —end note ]

无论如何,使用标准容器写入不同的存储桶似乎不是线程安全的,如下面的代码所示。您会看到我在写入之前启用了与正在修改的存储桶相对应的锁,但有时对没有正确记录。对于它的值(value),如果我使用单个锁 - 例如,只需更改 auto bkt = mm->bucket(key);auto bkt=0; ,有效地锁定了整个 unordered_map 容器——一切都按预期工作。

#include <iostream>
#include <unordered_map>
#include <atomic>
#include <vector>
#include <thread>

#define NUM_LOCKS 409
#define N 100
#define NUM_THREADS 2

using namespace std;


class SpinLock
{
    public:
        void lock()
        {
            while(lck.test_and_set(memory_order_acquire)){}
        }
    void unlock()
        {
            lck.clear(memory_order_release);
        }

    private:
        atomic_flag lck = ATOMIC_FLAG_INIT;
};


vector<SpinLock> spinLocks(NUM_LOCKS);


void add_to_map(unordered_map<int,int> * mm, const int keyStart, const int keyEnd, const int tid){

    for(int key=keyStart;key<keyEnd;++key){
        auto bkt = mm->bucket(key);

        //lock bucket
        spinLocks[bkt].lock();

        //insert pair
        mm->insert({key,tid});

        //unlock bucket
        spinLocks[bkt].unlock();
    }

}


int main() {

    int Nbefore, Nafter;
    thread *t = new thread[NUM_THREADS];

    //create an unordered map, and reserve enough space to avoid a rehash
    unordered_map<int,int> my_map;
    my_map.reserve(2*NUM_THREADS*N);

    //count number of buckets to make sure that a rehash didn't occur
    Nbefore=my_map.bucket_count();


    // Launch NUM_THREADS threads.  Thread k adds keys k*N through (k+1)*N-1 to the hash table, all with associated value = k.

    for(int threadID=0;threadID<NUM_THREADS;++threadID){
        t[threadID]=thread(add_to_map,&my_map,threadID*N,(threadID+1)*N,threadID);
    }

    // Wait for the threads to finish
    for(int threadID=0;threadID<NUM_THREADS;++threadID){
        t[threadID].join();
    }

    //count number of buckets to make sure that a rehash didn't occur
    Nafter=my_map.bucket_count();


    cout << "Number of buckets before adding elements: " << Nbefore <<endl;
    cout << "Number of buckets after  adding elements: " << Nafter  << " <--- same as above, so rehash didn't occur" <<endl;

    //see if any keys are missing
    for(int key=0;key<NUM_THREADS*N;++key){

        if(!my_map.count(key)){

            cout << "key " << key << " not found!" << endl;

        }
    }

    return 0;
}

当错误地没有输入 key 时,程序将退出。示例输出为:

Number of buckets before adding elements: 401
Number of buckets after  adding elements: 401 <--- same as above, so rehash didn't occur
key 0 not found!
key 91 not found!
key 96 not found!
key 97 not found!
key 101 not found!
key 192 not found!
key 193 not found!
key 195 not found!

所以,我的问题有两个:

  1. 我在锁定存储桶的方式上做错了吗?
  2. 如果是这样,是否有更好的方法来逐个存储桶锁定映射以实现对不同存储桶的并发写入?

最后,我要提到我已经尝试过 TBB 的 concurrent_unordered_map,但它在我的应用程序中比简单地串行执行要慢得多。撇开杂散错误不谈,我使用 std::unordered_map 的存储桶锁定方法表现得更好。

最佳答案

容器的元素不是桶,而是 value_type 元素。

修改 std 容器中的一个元素对其他元素没有并发影响。但是修改一个 bucket 没有这样的保证。

在存储桶中添加或删除元素是对容器的非const 操作,它不属于可安全使用的非const 操作的特殊列表没有同步。

关于c++ - 并发写入 unordered_map (C++) 中的不同存储桶?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/29381355/

相关文章:

c++ - 将函数的返回值存储在元组中

c++ - 快速排序后进行二进制搜索是否比线性搜索更快?

c++ - sizers 是处理表单的好方法吗?

java - 安卓 : Thread not passing data to Handler

c++ - 如何使用正态分布生成最小值和最大值之间的整数?

python - 用 libclang 解析;无法解析某些标记(Windows 中的 Python)

C++工厂实现麻烦

java - MSVC++ 只编译/禁用链接器

.net - 处理时如何处理控件更新?

multithreading - 对 TThread 的一些帮助(Terminate、FreeOnTerminate 和线程领域的其他冒险)