-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Using OpenMP #213
Comments
OpenMP is specified using pragmas which are comments, so if the file is compiled without openmp enabled, it will just be ignored. So I think it's fine if we allow openmp in our code base, because if users do not want stdlib to be parallelized using openmp, they'll just compile the stdlib files without openmp enabled. However, some others think that stdlib should not use openmp, and rather users should parallelize themselves: https://github.com/fortran-lang/stdlib/pull/189/files#r426173077 Although it's not clear to me how to do it in the context of the @zerothi, do you want to discuss it more here? |
If OpenMP is used in |
Can you give an example? What is an orphaned procedure?
…On Sat, Jun 20, 2020, at 2:25 PM, Jeremie Vandenplas wrote:
If OpenMP is used in `stdlib`, I think it should be with orphaned
procedures. The user can then control the parallelization (e.g. to call
a `stdlib` procedure inside or outside a parallel region). Using
parallel regions inside `stdlib` procedures will limit their utility.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#213 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAFAWG6DMCQRHLS6B6Y6CDRXULKTANCNFSM4ODQTKIA>.
|
Here is an example with an orphaned subroutine (the example is stupid but it is to illustrate an OpenMP orphaned procedure): program test
!$ use omp_lib
implicit none
integer :: i
real :: a(5)
a = [(i, i=1, 5)]
print*,' Outside a parallel region'
call printa(a)
print*,' Inside a parallel region'
!$omp parallel
call printa(a)
!$omp end parallel
contains
subroutine printa(a)
real, intent(in) :: a(:)
integer :: i, rang
rang = -1
!$omp do
do i=1,size(a)
!$ rang = omp_get_thread_num()
print*,'value: ',a(i),' at thread ', rang
end do
!$omp end do
end subroutine
end program Compiled without OpenMP, the output is:
Compiled with OpenMP (and run with 3 threads):
|
Ah I see, that's what @zerothi meant in the comment at https://github.com/fortran-lang/stdlib/pull/189/files#r426173077. The idea is to use openmp, but never to use the |
That's exactly what i meant so that our stdlib subroutines has parallel features but without using parallel region in order to control parallelism by user. |
Agreed, this was my idea, let the user decide how parallelism is enabled :) |
What happens when a user calls a function (which they want to parallelize and control the parallelization of) which then calls a stdlib function?
Do you not end up with nested parallelization? which i doubt is what people expect. |
Your example works exactly as intended (only 1 level of parallelism is used): However, if you do: program test
!$ use omp_lib
implicit none
integer :: i
real :: a(5)
!$omp parallel
call expensive_sub()
!$omp end parallel
contains
subroutine expensive_sub()
integer :: i
!$omp parallel do
do i=1,100
! Other expensive calculations
call printa(a)
end do
!$omp end do
end subroutine expensive_sub
subroutine printa(a)
real, intent(in) :: a(:)
integer :: i, rang
rang = -1
!$omp do
do i=1,size(a)
!$ rang = omp_get_thread_num()
print*,'value: ',a(i),' at thread ', rang
end do
!$omp end do
end subroutine
end program you'll get nesting. Either we should implement EDIT: you can control inside |
Sorry that was my mistake i meant parrallel do. |
Ok, so this can still easily be mitigated. program test
!$ use omp_lib
implicit none
logical :: nested
integer :: imax_nested, it, il
!$omp parallel default(shared), private(it,il,nested,imax_nested)
nested = omp_get_nested()
imax_nested = omp_get_max_active_levels()
it = omp_get_thread_num()
il = omp_get_active_level()
!$omp master
print *, "Allow nested: ",nested, imax_nested
!$omp end master
!$omp barrier
call sub_a(it,il)
!$omp barrier
!$omp single
print *,''
flush(6)
!$omp end single
call sub_a_limit(it,il)
!$omp end parallel
contains
subroutine sub_a(ot, ol)
integer, intent(in) :: ot, ol
integer :: it, il
!$omp parallel default(shared), private(it,il)
it = omp_get_thread_num()
il = omp_get_active_level()
call sub_b(ot, ol, it, il)
!$omp end parallel
end subroutine sub_a
subroutine sub_a_limit(ot, ol)
integer, intent(in) :: ot, ol
integer :: it, il
!$omp parallel default(shared), private(it,il)
it = omp_get_thread_num()
il = omp_get_active_level()
call omp_set_max_active_levels(il)
call sub_b(ot, ol, it, il)
!$omp end parallel
end subroutine sub_a_limit
subroutine sub_b(ot, ol, it, il)
integer, intent(in) :: ot, ol, it, il
integer :: ct, cl
!$omp parallel default(shared), private(ct,cl)
ct = omp_get_thread_num()
cl = omp_get_active_level()
print '(a,tr2,5(tr2,i0,"/",i0))','sub_b: ', ot, ol, it, il, ct, cl
!$omp end parallel
end subroutine sub_b
end program test I.e. users can use Perhaps I should clarify runned parameters. In pre OpenMP 5, one should do: OMP_NUM_THREADS=2 OMP_MAX_ACTIVE_LEVELS=3 OMP_NESTED=true ./a.out where |
Hi there
I started learning OpenMP library couple weeks ago and would like to parallelize and speed up fortran standard library codebase.
The text was updated successfully, but these errors were encountered: