Migrate to gfortran-15 ABI for accessing data in non-coarray aware codes. #792
Conversation
This is the first step toward fixing #774. As long as programs only use get from a remote image, they should already work fine with a most recent GFortran compiler.
old_get_array_test.f90 is included in get_array_test.f90.
OpenMPI is incompatible with the way OpenCoarrays uses MPI_Wins. An MPI_Win in OpenMPI is a pointer exclusive to each image. Because tokens (aka MPI_Wins) are transferred between images, OpenMPI cannot be used: no way of mapping the handles exists. MPICH and Intel's MPI work fine. This is not a bug, but a design decision.
Increase the alloc_comp_multidim_shape test's timeout to allow it to run on (sadly slow) Windows.
Further note that the current pipelines do not test the new ABI, because the CI images do not support GFortran 15 yet.
GFortran 15 has been updated to contain both missing patches now.
Hi @zbeekman,
@vehre this is somewhat problematic, as OpenMPI is the MPI implementation used by Homebrew. If we drop OpenMPI support, then we will no longer be able to distribute OpenCoarrays through Homebrew. If I have time, I will try to look through the changes and find a way to continue supporting OpenMPI.
Can you please explain what this difference is a little bit more?
Does that mean that when building OpenCoarrays with GFortran 15 both ABIs are built and supported? Some additional context/explanation here would be helpful.
@@ -233,6 +233,9 @@ endif()
if ( gfortran_compiler AND ( NOT CMAKE_Fortran_COMPILER_VERSION VERSION_LESS 8.0.0 ) )
  add_definitions(-DGCC_GE_8) # Tell library to build against GFortran 8.x bindings w/ descriptor change
endif()
if ( gfortran_compiler AND ( NOT CMAKE_Fortran_COMPILER_VERSION VERSION_LESS 14.0.0 ) )
  add_definitions(-DGCC_GE_15) # Tell library to build against GFortran 15.x bindings
endif()
At some point I need to refactor the CMake files: we should not be using add_definitions, rather we should be specifying properties on targets (that way they can be propagated to dependencies and/or exported).
Well, that would be the modern way of using CMake, but it can also be a huge effort. A colleague had such a project for a major ERB software package and struggled for a long time to renovate the CMake build process. So don't assume this step will be easy.
I understand that not supporting OpenMPI is problematic. I would like to support it, too. But it does not stick to the convention that an MPI window is a unique and identical ID on all processes involved. OpenMPI uses a different address for the same window handle in each process. Furthermore, it does not provide any (custom) interface to identify the memory on the target process associated with a window handle. The new approach relies on being able to send a message to the remote process, identify the memory associated with the specific window there, and do the access. OpenMPI has no support for this. Please note, we are no longer doing one-sided communication here! On each process a second thread is waiting for incoming access requests and processing them.

MPICH has a Homebrew formula here: https://formulae.brew.sh/formula/mpich Can't that be used? I am on Linux, so I have no experience with proprietary Mac software management. When you find a way to resolve this OpenMPI deficiency (IMO it is a defect), please let me know.
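To make the window-handle difference concrete, here is a minimal standalone MPI program (not part of this PR; the file name and output format are invented for illustration). Each rank creates a window over a small buffer and prints the raw handle value: in MPICH-derived implementations (MPICH, Intel MPI) the handle is typically a small integer, while in OpenMPI it is a process-local pointer, which is the mismatch described above.

```c
/* win_handle_demo.c -- hypothetical illustration, not part of OpenCoarrays.
 * Creates one MPI window per rank and prints the raw handle value so the
 * per-implementation representation can be compared across ranks.
 * Build/run: mpicc win_handle_demo.c -o win_handle_demo && mpiexec -n 2 ./win_handle_demo
 */
#include <mpi.h>
#include <stdint.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int rank, buf = 0;
  MPI_Win win;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  /* Every rank contributes the same small buffer to the window. */
  MPI_Win_create(&buf, sizeof buf, (int)sizeof buf, MPI_INFO_NULL,
                 MPI_COMM_WORLD, &win);

  /* MPI_Win is opaque: an integer handle in MPICH-derived MPIs, a struct
   * pointer in OpenMPI. Casting through uintptr_t lets us print either. */
  printf("rank %d: raw MPI_Win handle value = %#lx\n",
         rank, (unsigned long)(uintptr_t)win);

  MPI_Win_free(&win);
  MPI_Finalize();
  return 0;
}
```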
GFortran pre 15 used one-sided communication: the image wanting to access memory on a remote image was doing all the heavy lifting. To get the data, the Fortran data access patterns were emulated in C code. That is, all array access patterns, component references, and hops over allocatable components or arrays were done on the image wanting the access. This involved many remote memory accesses. For each allocatable component, the remote token had to be fetched. For each remote array, the array descriptor and the data segment had to be fetched. This was done twice consecutively: first to figure out the size of the data to be fetched; then a sufficiently big memory buffer was allocated and the data was copied over, doing the access pattern again. This was the old way!

GFortran from 15 onwards creates a function (well, it's a procedure, but let's not be picky) for each coarray access. These functions get a hash value associated with them and are entered into a hashmap on program start. The hashmap is a map of hashes to function pointers. When a coarray access to a remote image is requested now, OpenCoarrays identifies the function pointer in the hashmap and retrieves its index (this is done only once per call site; the indexes are buffered). Next a message is composed, containing the window to access, the function's index and a lot of other data. This message is sent to the remote image. Each image on initialization has started a second thread that is waiting for these communication requests (side note: MPICH uses the same pattern for one-sided communication, i.e. a secondary thread on each process handles those requests). When the communication thread receives such a request, it parses the message, prepares the objects needed for the access function, and executes the access function. The result of executing the access function (e.g. a get of memory) is then transferred back to the calling image, which has been waiting for the data.

The huge benefits of this approach are: 1. only one round trip is needed per access, and 2. the data access is done in Fortran and no longer emulated in C, which allows for better optimisation and supports all Fortran access patterns (including future ones). And of course 3.: it's waaaaay faster! I hope this explains the motivation and the technique used in the change better. It should also make clear why we need the capability of identifying the memory associated with an MPI window on each image.
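For readers who want a concrete picture, below is a small self-contained C sketch of the dispatch scheme described above. It is not OpenCoarrays or GFortran code; every name in it (register_accessor, lookup_accessor, handle_request, the hash value) is invented, and the real ABI carries far more information (tokens, argument descriptors, results transferred over MPI). It only illustrates the core idea: accessor functions registered under a hash at startup, a one-time per-call-site index lookup, and execution of the accessor on the side that owns the memory.

```c
/* abi_dispatch_sketch.c -- conceptual sketch only; all identifiers are
 * hypothetical and do not reflect the actual OpenCoarrays/GFortran 15 ABI. */
#include <stdio.h>
#include <string.h>

/* Signature of a "generated" accessor: reads from the memory backing a
 * window/token and writes the result into a transfer buffer. */
typedef void (*accessor_fn)(const void *base, void *result);

typedef struct { unsigned long hash; accessor_fn fn; } accessor_entry;

static accessor_entry registry[64];
static int n_accessors = 0;

/* Called once per generated accessor at program start. */
static void register_accessor(unsigned long hash, accessor_fn fn)
{
  registry[n_accessors].hash = hash;
  registry[n_accessors].fn = fn;
  ++n_accessors;
}

/* Resolved once per call site; the caller caches the returned index. */
static int lookup_accessor(unsigned long hash)
{
  for (int i = 0; i < n_accessors; ++i)
    if (registry[i].hash == hash) return i;
  return -1;
}

/* What the requesting image would send to the remote image's
 * communication thread: the window to access, the accessor index,
 * plus (in reality) argument data, strides, etc. */
typedef struct { int win_id; int accessor_index; } access_request;

/* What the communication thread on the remote image would do on receipt:
 * find the memory behind win_id, run the accessor, return the result. */
static void handle_request(const access_request *req,
                           const void *window_memory, void *result)
{
  registry[req->accessor_index].fn(window_memory, result);
}

/* Example "generated" accessor: fetch element 3 of an integer array. */
static void get_element_3(const void *base, void *result)
{
  memcpy(result, (const int *)base + 2, sizeof(int));
}

int main(void)
{
  int coarray[4] = { 10, 20, 30, 40 };   /* memory behind some window   */
  int value = 0;

  register_accessor(0xBEEFUL, get_element_3);            /* program start */
  access_request req = { 0, lookup_accessor(0xBEEFUL) }; /* call site     */
  handle_request(&req, coarray, &value);                 /* remote thread */

  printf("fetched value: %d\n", value);                  /* prints 30     */
  return 0;
}
```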
Yes, both ABIs are still present in the current OpenCoarrays implementation, and will probably be for a long time. The ABIs do not interfere (at least to my knowledge) and can be used at the same time. GFortran 15 uses only the new ABI to get the maximum performance. But code that has been compiled with GFortran pre 15 using the old ABI can still be linked against OpenCoarrays; it just does not benefit from the higher performance or the bug fixes. I hope this answers your questions. If not, do not hesitate to ask more specifically.
I like Homebrew and hope that we can support it, but just for the sake of discussion, we could consider switching to Spack. I suspect that would give us more flexibility to specify the dependencies. @vehre Homebrew is open source and also runs on Linux. And on a separate note, @zbeekman enabled OpenCoarrays to support Windows using Intel MPI, which we need to support for one of our project sponsors. If there's any way for you to ascertain whether this PR breaks support for Intel MPI, that would be great. If you don't have easy access to a Windows platform for testing, possibly you could tell from the Intel MPI documentation. Fortunately, I think Intel MPI is based on MPICH, so hopefully we'll be ok.
@rouson Intel MPI is ok. It works in the same way as MPICH and can be used on Linux and Windows with the changes made in this PR. In fact, IMPI is a little bit faster than MPICH.
@zbeekman please let us know what you find out about Homebrew. Would it be possible to switch to a tap in order to force the use of MPICH?
Summary of changes
Migration of the ABI to gfortran-15 to allow data access for non-coarray aware codes.
Rationale for changes
GFortran pre 15 and OpenCoarrays used one-sided communication to access data on remote images. With GFortran 15 a technique has been introduced to evaluate the data access on the remote image and just pass back the result. This commit makes use of this new ABI in OpenCoarrays.
Alongside implementing the new ABI, some compile issues such as warnings have been fixed.
Note: OpenCoarrays as of now still supports both the old and the new ABI! Both ABIs are usable at the same time, i.e. an artefact that is not completely recompiled with GFortran 15 can make use of both ABIs. In other words, the ABIs are not mutually exclusive.
Note 2: OpenCoarrays with GFortran 15 cannot use OpenMPI anymore! OpenMPI uses addresses for MPI windows instead of globally unique identifiers like all other MPI implementations do. During CMake's configuration phase the use of OpenMPI with GFortran >= 15 is now prevented by an error message. This is unfortunate, but not easy to fix.
Note 3: Addressing #560, the performance increase of the coarray-icar code with this PR and GFortran 15 is significant:
mpich 4.2.2:
  gfortran 14.2      real 8m57.800s (from time) / Model run time: 534.505 seconds
  gfortran 15-trunk  real 2m22.659s / Model run time: 137.577 seconds
Intel MPI 2021.14:
  gfortran 14.2      real 14m16.045s / Model run time: 851.683 seconds
  gfortran 15-trunk  real 1m26.956s / Model run time: 81.447 seconds
Measurements have been taken on a single machine with an Intel i7-5775C @ 3.30 GHz and 24 GB RAM. I have nothing bigger to do performance measurements on.
Fixes #774.
Addresses #560.