multithreading - Trapezoidal rule integration using OpenMP and the private clause

Tags: multithreading fortran openmp gfortran

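I am adapting a code written for serial execution so that it runs in parallel (OpenMP), but I get a poor approximation of the expected result (the value of pi). Both codes are shown below, the parallel version first and the serial version second.
What is wrong?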

program trap
use omp_lib 
implicit none
double precision::suma=0.d0 ! sum is a scalar
double precision:: h,x,lima,limb
integer::n,i, istart, iend, thread_num, total_threads=4, ppt
integer(kind=8):: tic, toc, rate
double precision:: time
double precision, dimension(4):: pi= 0.d0

call system_clock(count_rate = rate)
call system_clock(tic)

lima=0.0d0; limb=1.0d0; suma=0.0d0; n=10000000
h=(limb-lima)/n

suma=h*(f(lima)+f(limb))*0.5d0 !first and last points

ppt= n/total_threads
!$ call omp_set_num_threads(total_threads)

!$omp parallel private (istart, iend, thread_num, i)
  thread_num = omp_get_thread_num()
  !$ istart = thread_num*ppt +1
  !$ iend = min(thread_num*ppt + ppt, n-1)
do i=istart,iend ! this will control the loop in different images
  x=lima+i*h
  suma=suma+f(x) 
  pi(thread_num+1)=suma
enddo
!$omp end parallel

suma=sum(pi) 
suma=suma*h

print *,"The value of pi is= ",suma ! print once from the first image
!print*, 'pi=' , pi
call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, 'Time ', time, 's'

contains

double precision function f(y)
double precision:: y
f=4.0d0/(1.0d0+y*y)
end function f

end program trap

!----------------------------------------------------------------------------------
program trap
implicit none
double precision::sum ! sum is a scalar
double precision:: h,x,lima,limb
integer::n,i
integer(kind=8):: tic, toc, rate
double precision:: time

call system_clock(count_rate = rate)
call system_clock(tic)

lima=0.0d0; limb=1.0d0; sum=0.0d0; n=10000000
h=(limb-lima)/n

sum=h*(f(lima)+f(limb))*0.5d0 !first and last points

do i=1,n-1 ! this will control the loop in different images
  x=lima+i*h
  sum=sum+f(x)
enddo

sum=sum*h

print *,"The value of pi is (serial exe)= ",sum ! print once from the first image

call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, 'Time serial execution', time, 's'

contains

double precision function f(y)
double precision:: y
f=4.0d0/(1.0d0+y*y)
end function f

end program trap
Compiled with:
$ gfortran -fopenmp -Wall -Wextra -O2 -Wall -o prog.exe test.f90 
$ ./prog.exe
$ gfortran -Wall -Wextra -O2 -Wall -o prog.exe testserial.f90 
$ ./prog.exe
With serial execution I get a good approximation of pi (3.1415), but with the parallel version I get the following (several parallel runs shown):
 The value of pi is=    3.6731101425922810     

 Time    3.3386986702680588E-002 s

-------------------------------------------------------

 The value of pi is=    3.1556004791445953     

 Time    8.3681479096412659E-002 s

------------------------------------------------------

 The value of pi is=    3.2505952856717966     

 Time    5.1473543047904968E-002 s

Best answer

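There is a problem with your OpenMP parallel region.
Every thread keeps adding to the shared variable suma, so the updates race with each other and the result changes from run to run.
You therefore need to specify a reduction clause.
In addition, you did not declare the variable x as private, so all threads overwrite the same x.
I also changed a few more parts of your code: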

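  • You explicitly tell each thread which index range it should work on. In most cases the compiler and the OpenMP runtime can work this out more efficiently by themselves. For that reason I changed parallel to parallel do.
  • It is good practice to set the data-sharing attribute in an OpenMP parallel region to default(none). You then have to set the attribute of every variable explicitly.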
    program trap
      use omp_lib
      implicit none
      double precision   :: suma,h,x,lima,limb, time
      integer            :: n, i
      integer, parameter :: total_threads=5
      integer(kind=8)    :: tic, toc, rate

      call system_clock(count_rate = rate)
      call system_clock(tic)

      lima=0.0d0; limb=1.0d0; suma=0.0d0; n=10000000
      h=(limb-lima)/n

      ! first and last points carry weight 1/2; h is applied once, after the loop
      suma=(f(lima)+f(limb))*0.5d0

      call omp_set_num_threads(total_threads)
      ! each thread accumulates a private copy of suma; the copies are added at the end
      !$omp parallel do default(none) private(i, x) shared(lima, h, n)  reduction(+: suma)
      do i = 1, n-1 ! interior points only
        x=lima+i*h
        suma=suma+f(x)
      end do
      !$omp end parallel do

      suma=suma*h

      print *,"The value of pi is= ", suma
      call system_clock(toc)
      time = real(toc-tic)/real(rate)
      print*, 'Time ', time, 's'

    contains

      double precision function f(y)
        double precision:: y
        f=4.0d0/(1.0d0+y*y)
      end function

    end program
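
If you would rather keep the manual index partitioning from the question, the same two fixes apply: x must be private, and each thread needs its own accumulator instead of the shared suma. The following is only a sketch under those assumptions, not the original author's code. The names trap_manual, local_sum, partial, nt and ppt are made up for this illustration. The endpoints are added once, outside the parallel region, so the result is h*(0.5*(f(lima)+f(limb)) + sum of f over the interior points).

```fortran
program trap_manual
  use omp_lib
  implicit none
  integer, parameter  :: dp = kind(1.0d0)
  integer, parameter  :: n = 10000000
  real(dp), parameter :: lima = 0.0_dp, limb = 1.0_dp
  real(dp) :: h, x, local_sum, total
  real(dp), allocatable :: partial(:)   ! one slot per thread (hypothetical name)
  integer  :: i, istart, iend, tid, nt, ppt

  h = (limb - lima) / n
  allocate(partial(omp_get_max_threads()))
  partial = 0.0_dp

  ! n, lima and limb are named constants, so they need no data-sharing clause
  !$omp parallel default(none) shared(partial, h) &
  !$omp& private(i, x, istart, iend, tid, nt, ppt, local_sum)
  tid = omp_get_thread_num()
  nt  = omp_get_num_threads()           ! actual team size, may be below the maximum
  ppt = (n - 1 + nt - 1) / nt           ! interior points per thread, rounded up
  istart = tid*ppt + 1
  iend   = min(istart + ppt - 1, n - 1)
  local_sum = 0.0_dp                    ! private accumulator, no race
  do i = istart, iend
    x = lima + i*h
    local_sum = local_sum + f(x)
  end do
  partial(tid + 1) = local_sum          ! each thread writes only its own slot, once
  !$omp end parallel

  total = h*(0.5_dp*(f(lima) + f(limb)) + sum(partial))
  print *, 'The value of pi is = ', total

contains

  function f(y) result(val)
    real(dp), intent(in) :: y
    real(dp) :: val
    val = 4.0_dp/(1.0_dp + y*y)
  end function f

end program trap_manual
```

Build it with gfortran -fopenmp, as in the question. The reduction version is shorter and leaves the scheduling to OpenMP, so this variant is mainly useful for seeing what the private clause has to cover when the work is split by hand.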
    

Regarding multithreading - Trapezoidal rule integration using OpenMP and the private clause, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66606053/
