OpenBLAS 0.3.27 (same applies to 0.3.26) compiled with Intel oneAPI (latest) fails the testing - dblas3 and zblas1 tests are problematic #4739

Closed
vessokolev opened this issue Jun 6, 2024 · 6 comments · Fixed by #4755

Comments

@vessokolev

vessokolev commented Jun 6, 2024

There is an issue with testing (at least) the last two releases of OpenBLAS whenever the Intel oneAPI compilers on Linux (RHEL 9.2 and RHEL 8.4) are used to compile the code. In short, the tests dblas3 and zblas1 fail:

  7/115 Test   #7: dblas3 ......................................................................................***Failed    0.10 sec
 11/115 Test  #11: zblas1 ......................................................................................***Failed    0.01 sec

Below are the command lines executed to configure and build the source code, and then run the tests:

cmake -B build-intel -DTARGET=SKYLAKEX -DCMAKE_C_COMPILER=icx -DCMAKE_Fortran_COMPILER=ifx
cmake --build build-intel -j 24
ctest --test-dir build-intel
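When comparing runs across compilers or flags, it can help to pull the failing test names out of the ctest summary programmatically. The following is a minimal sketch, assuming the summary lines keep the layout shown above; the helper name `failed_tests` and the passing `dblas2` line in the sample are invented for illustration.

```python
import re

# Matches ctest summary lines such as:
#   7/115 Test   #7: dblas3 ........***Failed    0.10 sec
FAILED_LINE = re.compile(r"Test\s+#\d+:\s+(\S+)\s+\.+\s*\*\*\*Failed")


def failed_tests(ctest_output):
    """Return the names of the tests that ctest reported as failed."""
    return [m.group(1) for m in FAILED_LINE.finditer(ctest_output)]


# Shortened sample. The two failing lines mirror the report above;
# the passing dblas2 line is hypothetical and only there for contrast.
sample = """\
  6/115 Test   #6: dblas2 ..........   Passed    0.05 sec
  7/115 Test   #7: dblas3 ..........***Failed    0.10 sec
 11/115 Test  #11: zblas1 ..........***Failed    0.01 sec
"""

print(failed_tests(sample))
```

The resulting names can be joined into a regular expression for ctest's -R option, and adding --output-on-failure prints the output of the failing tests directly.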

Note that the error is not related to the chosen CPU target: we were able to reproduce it on an AMD EPYC CPU, which corresponds to the ZEN target. Switching the compilers to GNU (gcc + gfortran) makes the errors go away, but the resulting OpenBLAS libraries are slower. We are not sure whether this is a runtime issue. Also, setting ulimit -s unlimited does not solve the problem.

Attached is the LastTest.log file, which may shed some light on the issue.

LastTest.log

Has anybody experienced a similar issue?

@martin-frbg
Collaborator

This is probably related to (or even a duplicate of) #4713 - newer oneAPI versions default to fp-model=fast which apparently makes some unsafe assumptions when optimizing.
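As a generic illustration of why value-unsafe floating-point optimization can change results, here is a small sketch showing that floating-point addition is not associative. It is not taken from OpenBLAS or from the compiler, and it does not claim to reproduce the failures reported here; it only shows the kind of reordering effect a fast floating-point model is allowed to introduce.

```python
# Floating-point addition is not associative: regrouping the same three
# operands changes where rounding happens, and therefore the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # evaluation order as written in the source
right = a + (b + c)  # a regrouping that a value-unsafe optimizer may pick

print(repr(left), repr(right), left == right)
```

A test suite that compares computed values against reference values with a tight tolerance can be sensitive to this kind of difference.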

@vessokolev
Author

vessokolev commented Jun 6, 2024

I managed to resolve the problem. After examining the content of the cmake folder, I realized that OpenBLAS uses a compiler-explicit configuration model and does not inspect the compiler to infer the correct set of capabilities and flags. Instead, it expects the user to pass the compiler type to cmake, which in turn expands the corresponding macros.

Therefore, the correct way to configure the build should look similar to:

cmake -B build-intel -DTARGET=SKYLAKEX -DC_COMPILER=INTEL -DCMAKE_C_COMPILER=icx -DF_COMPILER=INTEL -DCMAKE_Fortran_COMPILER=ifx

Here one should also add -DINTERFACE64 and/or -DUSE_OPENMP depending on the case.

@martin-frbg
Collaborator

I'm not entirely sure about that, C_COMPILER/F_COMPILER should normally be autodetected, and only the CMAKE_C_COMPILER and CMAKE_Fortran_COMPILER flags need to be given. Maybe there is something missing in the autodetection scripts.

@vessokolev
Author

vessokolev commented Jun 6, 2024

> I'm not entirely sure about that, C_COMPILER/F_COMPILER should normally be autodetected, and only the CMAKE_C_COMPILER and CMAKE_Fortran_COMPILER flags need to be given. Maybe there is something missing in the autodetection scripts.

After going over the macro definitions and examining the generated Makefiles, I found that no fp-model flag is being passed to ifx. For instance, strict would most likely control the FP precision properly:

-fp-model=strict

@martin-frbg
Collaborator

#4718 should be setting -fp-model=consistent, but of course that is more recent than 0.3.27

@vessokolev
Author

vessokolev commented Jun 6, 2024

> #4718 should be setting -fp-model=consistent, but of course that is more recent than 0.3.27

Passing -fp-model=consistent when compiling 0.3.27 (with the latest Intel oneAPI) produces a binary that still fails the tests:

  9/120 Test   #9: cblas1 ......................................................................................***Failed    0.01 sec
 13/120 Test  #13: zblas1 ......................................................................................***Failed    0.00 sec

Below is the output showing why zblas1 fails:

 Test of subprogram number  1            ZDOTC
                                       FAIL

 CASE  N INCX INCY MODE  I                             COMP(I)                             TRUE(I)  DIFFERENCE     SIZE(I)

    1  1    1    1 9999  1                      0.00000000D+00                      0.90000000D+00 -0.9000D+00  0.9000D+00
    1  1    1    1 9999  2                      0.49406565-323                      0.60000000D-01 -0.6000D-01  0.9000D+00
    1  2    1    1 9999  1                      0.00000000D+00                      0.91000000D+00 -0.9100D+00  0.1630D+01
    1  2    1    1 9999  2                      0.49406565-323                     -0.77000000D+00  0.7700D+00  0.1730D+01
    1  4    1    1 9999  1                      0.00000000D+00                      0.18000000D+01 -0.1800D+01  0.2900D+01
    1  4    1    1 9999  2                      0.49406565-323                     -0.10000000D+00  0.1000D+00  0.2780D+01
    1  1    2   -2 9999  1                      0.00000000D+00                      0.90000000D+00 -0.9000D+00  0.9000D+00
    1  1    2   -2 9999  2                      0.49406565-323                      0.60000000D-01 -0.6000D-01  0.9000D+00
    1  2    2   -2 9999  1                      0.00000000D+00                      0.14500000D+01 -0.1450D+01  0.1630D+01
    1  2    2   -2 9999  2                      0.49406565-323                      0.74000000D+00 -0.7400D+00  0.1730D+01
    1  4    2   -2 9999  1                      0.00000000D+00                      0.20000000D+00 -0.2000D+00  0.2900D+01
    1  4    2   -2 9999  2                      0.49406565-323                      0.90000000D+00 -0.9000D+00  0.2780D+01
    1  1   -2    1 9999  1                      0.00000000D+00                      0.90000000D+00 -0.9000D+00  0.9000D+00
    1  1   -2    1 9999  2                      0.49406565-323                      0.60000000D-01 -0.6000D-01  0.9000D+00
    1  2   -2    1 9999  1                      0.00000000D+00                     -0.55000000D+00  0.5500D+00  0.1630D+01
    1  2   -2    1 9999  2                      0.49406565-323                      0.23000000D+00 -0.2300D+00  0.1730D+01
    1  4   -2    1 9999  1                      0.00000000D+00                      0.83000000D+00 -0.8300D+00  0.2900D+01
    1  4   -2    1 9999  2                      0.49406565-323                     -0.39000000D+00  0.3900D+00  0.2780D+01
    1  1   -1   -2 9999  1                      0.00000000D+00                      0.90000000D+00 -0.9000D+00  0.9000D+00
    1  1   -1   -2 9999  2                      0.49406565-323                      0.60000000D-01 -0.6000D-01  0.9000D+00
    1  2   -1   -2 9999  1                      0.00000000D+00                      0.10400000D+01 -0.1040D+01  0.1630D+01
    1  2   -1   -2 9999  2                      0.49406565-323                      0.79000000D+00 -0.7900D+00  0.1730D+01
    1  4   -1   -2 9999  1                      0.00000000D+00                      0.19500000D+01 -0.1950D+01  0.2900D+01
    1  4   -1   -2 9999  2                      0.49406565-323                      0.12200000D+01 -0.1220D+01  0.2780D+01

 Test of subprogram number  2            ZDOTU
forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
<end of output>
Test time =   0.00 sec
----------------------------------------------------------
Test Failed.
"zblas1" end time: Jun 07 00:27 EEST
"zblas1" time elapsed: 00:00:00
----------------------------------------------------------
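One detail in the ZDOTC table above may be worth noting: every computed imaginary part is 0.49406565-323, i.e. about 4.94e-324. The sketch below shows that this matches the smallest positive subnormal double, whose bit pattern is the 64-bit integer 1. This is only an observation about the printed value. It would be consistent with the result holding leftover bits rather than a computed number, but the thread does not confirm that, and the sketch makes no claim about the cause.

```python
import struct
import sys

# Reinterpret the 64-bit integer 1 as an IEEE 754 double-precision value.
tiny = struct.unpack("<d", struct.pack("<Q", 1))[0]

# Eight significant digits, for comparison with the 0.49406565-323
# printed in the test log (the Fortran format normalizes to 0.d1d2...).
print(format(tiny, ".7e"))

# The value lies below the smallest normal double, so it is subnormal.
print(0.0 < tiny < sys.float_info.min)
```

With ordinary test inputs of magnitude around 1, a dot product would not be expected to produce a value this small, which is why the printed number looks more like a raw bit pattern than like a rounding error.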
