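c++ - Performance decreases with a higher number of threads (no synchronization)

I have a data structure (a vector) whose elements have to be parsed by a function, and the elements can be parsed by different threads.

The parsing method is as follows: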

void ConsumerPool::parse(size_t n_threads, size_t id)
{
    for (size_t idx = id; idx < nodes.size(); idx += n_threads)
    {
        // parse node
        //parse(nodes[idx]);
        parse(idx);
    }
}

Where:

n_threads is the total number of threads
id is the (unique) index of the current thread
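
To make the partitioning concrete, here is a minimal sketch of the same strided loop that only records which indices a given thread would visit. The helper name strided_indices is hypothetical and not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the indices that thread `id` visits when
// `n_items` elements are split across `n_threads` threads with the same
// strided loop as ConsumerPool::parse (start at id, step by n_threads).
std::vector<std::size_t> strided_indices(std::size_t n_items,
                                         std::size_t n_threads,
                                         std::size_t id)
{
    std::vector<std::size_t> out;
    if (n_threads == 0)
        return out; // a zero stride would never advance, so bail out early

    for (std::size_t idx = id; idx < n_items; idx += n_threads)
        out.push_back(idx);
    return out;
}
```

With 10 elements and 3 threads, thread 1 would visit indices 1, 4 and 7, so neighbouring elements of the vector are handled by different threads.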

The threads are created as follows:

std::vector<std::thread> threads;

for (size_t i = 0; i < n_threads; i++)
    threads.emplace_back(&ConsumerPool::parse, this, n_threads, i);

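Unfortunately, even though this approach works, the performance of my application decreases if the number of threads is too high. I would like to understand why performance drops even though there is no synchronization between these threads.

Here are the elapsed times (from when the threads start until the last join() returns) for different numbers of threads:

2 threads: 500 ms
3 threads: 385 ms
4 threads: 360 ms
5 threads: 475 ms
6 threads: 580 ms
7 threads: 635 ms
8 threads: 660 ms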
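
The question does not show how these times were taken. A minimal sketch of one way to measure the same span, assuming std::chrono::steady_clock as the timer; time_workers is a hypothetical helper, not the code used in the question:

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs work(n_threads, id) on n_threads threads and
// returns the elapsed wall-clock time in milliseconds, measured from just
// before the first thread is created until the last join() returns.
double time_workers(std::size_t n_threads,
                    const std::function<void(std::size_t, std::size_t)>& work)
{
    const auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> threads;
    threads.reserve(n_threads);
    for (std::size_t i = 0; i < n_threads; i++)
        threads.emplace_back(work, n_threads, i); // each thread gets its own copy of work

    for (auto& t : threads)
        t.join();

    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling it in a loop over n_threads = 2 to 8, with a callable that forwards to the parsing method, would produce a table like the one above.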

Creating the threads always takes between 1 and 2 ms. The software was tested using its release build. Here is my configuration:

2x Intel(R) Xeon(R) CPU E5507 @ 2.27GHz

Maximum speed:  2.26 GHz
Sockets:    2
Cores:  8
Logical processors: 8
Virtualization: Enabled
L1 cache:   512 KB
L2 cache:   2.0 MB
L3 cache:   8.0 MB

Edit:

The parse() function does the following:

// data shared between threads (around 300k elements)
std::vector<std::unique_ptr<Foo>> vfoo;
std::vector<rapidxml::xml_node<>*> nodes;
std::vector<std::string> layers;

void parse(int idx)
{
    // std::unique_ptr cannot be copied, so bind a reference to the element
    auto& p = vfoo[idx];

    // p->parse() allocates memory according to the content of the XML node
    if (!p->parse(nodes[idx], layers))
        vfoo[idx].reset();
}

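Best answer

The Intel(R) Xeon(R) CPU E5507 you are using has only 4 cores per processor (see http://ark.intel.com/products/37100/Intel-Xeon-Processor-E5507-4M-Cache-2_26-GHz-4_80-GTs-Intel-QPI). So, as the timings you provided show, using more than 4 threads causes a slowdown due to context switching.

You can read more about context switching at the following link: https://en.wikipedia.org/wiki/Context_switch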
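
If the thread count is to be kept at or below the number of cores, one possible sketch is to cap the requested count before spawning the threads. The helper capped_thread_count is hypothetical. Its default limit comes from std::thread::hardware_concurrency(), which is only a hint, may return 0, and counts logical processors across all sockets, so a lower limit such as 4 would have to be passed explicitly.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper: limits the requested number of worker threads to
// core_limit and never returns less than 1. By default core_limit is the
// standard library's hint for the number of hardware threads.
std::size_t capped_thread_count(
    std::size_t requested,
    std::size_t core_limit = std::thread::hardware_concurrency())
{
    const std::size_t at_least_one = std::max<std::size_t>(requested, 1);

    // A limit of 0 means "unknown" (hardware_concurrency() may report 0),
    // so leave the request uncapped in that case.
    if (core_limit == 0)
        return at_least_one;

    return std::min(at_least_one, core_limit);
}
```

For example, capped_thread_count(8, 4) yields 4, which could then be passed as n_threads when creating the threads.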

Regarding c++ - Performance decreases with a higher number of threads (no synchronization), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38829974/
