c - How to continue working with the master while threads are executing for loop iterations?

Tags: c multithreading performance parallel-processing openmp

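I have been asked to write an OpenMP program in C in which the main thread distributes the work among the other threads. While they are doing their tasks, the main thread should periodically check whether they have finished and, if they have not, increment a shared variable.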

This is the function for the threads' task:

void work_together(int *a, int n, int number, int thread_count) {
#   pragma omp parallel for num_threads(thread_count) \
        shared(a, n, number) schedule(static, n/thread_count)
    // i is declared in the for statement, so it is already private to each thread
    for (long i=0; i<n; i++) {
        // do a task, such as:
        a[i] = a[i] * number;
    }
}

It is called from main:

int main(int argc, char *argv[]) {
    int n = atoi(argv[1]);
    int arr[n];
    initialize(arr, n);

    // this will be the shared variable
    int number = 2;
    work_together(arr, n, number, thread_count);

    //I want to write a function or an if to check whether threads are still working
    /* if (threads_still_working()) {
        number++;
        sleep(100);
    }
    */

    printf("There are %d threads\n", omp_get_num_threads());
}

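thread_count is initialized to 4, and I have tried running it with large n (>10000), but the main thread always waits for the other threads to finish executing the for loop and only continues in main once work_together() returns: printf() always reports that only one thread is running.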

Now, what would be a way to check from the master thread whether the other threads are still running, and do some incrementing if they are?

Best answer

From the OpenMP standard one can read:

When a thread encounters a parallel construct, a team of threads is created to execute the parallel region. The thread that encountered the parallel construct becomes the master thread of the new team, with a thread number of zero for the duration of the new parallel region. All threads in the new team, including the master thread, execute the region. Once the team is created, the number of threads in the team remains constant for the duration of that parallel region.

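Therefore, with `#pragma omp parallel for num_threads(...)` every thread in the team, the master included, performs the parallel work (i.e., computes iterations of the loop), which is not what you want. That construct makes the compiler divide the loop iterations automatically among all the threads in the team, including the team's master thread. To avoid this, you can implement part of its functionality by hand: open a plain parallel region and distribute the iterations only among the non-master threads. The code looks like this: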

# pragma omp parallel num_threads(thread_count) shared(a, n, number)
{
      int thread_id = omp_get_thread_num();
      int total_threads = omp_get_num_threads();
      if(thread_id != 0) // all threads but the master thread
      {
        thread_id--; // shift all the ids
        total_threads = total_threads - 1;
        for(long i = thread_id ; i < n; i += total_threads) {
            // do a task, such as:
            a[i] = a[i] * number;
        }
      }
} 

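First, we ensure that all threads except the master (`if(thread_id != 0)`) execute the loop to be parallelized, and then we divide the iterations of the loop among the remaining threads (i.e., `for(long i = thread_id ; i < n; i += total_threads)`). I have chosen a static distribution with chunk=1; you can choose a different distribution, but you will have to adapt the loop accordingly.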
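As an illustration of "a different distribution", here is a minimal sketch of a contiguous block distribution. The helper name `block_range` and its parameters are hypothetical (they are not part of OpenMP or of the original answer); it only computes which iterations a given worker should run.

```c
#include <assert.h>

/* Hypothetical helper, not an OpenMP API: computes the half-open range
   [*begin, *end) of loop iterations for worker w (0-based) out of `workers`
   workers. The n % workers leftover iterations are handed out one each to
   the first workers, so block sizes differ by at most one. */
void block_range(long n, int workers, int w, long *begin, long *end) {
    assert(workers > 0 && w >= 0 && w < workers);
    long base = n / workers;   /* iterations every worker gets */
    long extra = n % workers;  /* leftover iterations */
    *begin = w * base + (w < extra ? w : extra);
    *end = *begin + base + (w < extra ? 1 : 0);
}
```

Inside the `if(thread_id != 0)` branch, after shifting the id and decrementing `total_threads`, a worker could call `block_range(n, total_threads, thread_id, &begin, &end)` and then run `for(long i = begin; i < end; i++)` in place of the strided loop.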

Now you only need to add the logic for:

Now, what would be a way to check from the master thread whether the other threads are still running, and do some incrementing if they are?

So as not to give too much away, I will add pseudocode, which you will have to convert into real code to make it work:

// declare a shared variable, count_thread_finished, initialized to 0,
// to count the number of threads that have finished working
# pragma omp parallel num_threads(thread_count) shared(a, n, number)
{
      int thread_id = omp_get_thread_num();
      int total_threads = omp_get_num_threads();
      if(thread_id != 0) // all threads but the master thread
      {
        thread_id--; // shift all the ids
        total_threads = total_threads - 1;
        for(long i = thread_id ; i < n; i += total_threads) {
            // do a task, such as:
            a[i] = a[i] * number;
        }
        // count_thread_finished++
      }
      else{ // the master thread 
          while(count_thread_finished != total_threads -1){
              // wait for a while....
          }
     }
} 

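Bear in mind, however, that since the variable `count_thread_finished` is shared among the threads, you need to ensure mutual exclusion (e.g., by using `omp atomic` on its updates); otherwise you will have a race condition. This should be enough to get you going.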
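For reference, one way the pseudocode might be turned into real code is sketched below. Several things here are assumptions rather than part of the original answer: the function name `work_with_polling_master`, the `polls` counter (standing in for "the shared variable the master increments"), the serial fallback when the runtime grants only one thread, and the `_OPENMP` guard that lets the file build without `-fopenmp`.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Multiplies every element of a by factor on the worker threads while the
   master polls for completion. Returns how many times the master found the
   workers still busy (hypothetical return value, for illustration only). */
long work_with_polling_master(int *a, long n, int factor, int thread_count) {
    assert(thread_count >= 1);
    int finished = 0;  /* shared: number of workers that are done */
    long polls = 0;    /* written only by the master thread */

    #pragma omp parallel num_threads(thread_count) shared(a, n, factor, finished, polls)
    {
        int thread_id = omp_get_thread_num();
        int workers = omp_get_num_threads() - 1;

        if (workers == 0) {
            /* Only one thread in the team: do the work serially. */
            for (long i = 0; i < n; i++) a[i] *= factor;
        } else if (thread_id != 0) {
            /* Worker: cyclic distribution over the non-master threads. */
            for (long i = thread_id - 1; i < n; i += workers) a[i] *= factor;
            #pragma omp atomic update
            finished++;
        } else {
            /* Master: poll the counter with an atomic read until all
               workers have reported in. */
            int done;
            do {
                #pragma omp atomic read
                done = finished;
                if (done != workers) polls++;
            } while (done != workers);
        }
    }
    return polls;
}
```

The master's read is made atomic as well as the workers' update, so the polling loop cannot race with the increments. A real program would probably sleep or do useful work between polls instead of spinning.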

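By the way: `schedule(static, n/thread_count)` is mostly unnecessary, since by default most OpenMP implementations already divide the iterations of the loop among the threads in contiguous chunks.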
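A related detail from the question: the `printf` in `main` always reports one thread because `omp_get_num_threads()` is called outside any parallel region, where the team consists of the calling thread only. The sketch below contrasts the two cases; the function names `team_size_outside` and `team_size_inside` are hypothetical, and the `_OPENMP` guard is only there so the file builds without `-fopenmp`.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-in so the sketch still builds without OpenMP support. */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Called from sequential code: the "team" is just the calling thread. */
int team_size_outside(void) {
    return omp_get_num_threads();
}

/* Called inside a parallel region: reports the size of the team that was
   actually created, which may be smaller than the number requested. */
int team_size_inside(int requested) {
    assert(requested >= 1);
    int size = 0;
    #pragma omp parallel num_threads(requested)
    {
        /* One thread records the team size; the others wait at the
           implicit barrier at the end of the single construct. */
        #pragma omp single
        size = omp_get_num_threads();
    }
    return size;
}
```

To report the team size from the original program, the call would have to be made from within the parallel region, as `team_size_inside` does.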

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/65759940/
