Question about version 0.3.5 build with gcc 9.2.0 #2579

Closed
dkonst13 opened this issue Apr 24, 2020 · 4 comments

Comments

@dkonst13

On CentOS 7 with gcc 9.2.0 built from source, two of the OpenBLAS ctest tests fail, while everything passes with gcc 8.3.0.
Also, we don't see anything strange in the configure and build logs.

Is it something known and expected?

3/25 Test #3: sblas2 ...........................***Failed 0.41 sec
     Start 4: sblas3
4/25 Test #4: sblas3 ........................... Passed 0.44 sec
     Start 5: dblas1
5/25 Test #5: dblas1 ........................... Passed 0.38 sec
     Start 6: dblas2
6/25 Test #6: dblas2 ...........................***Failed 0.41 sec
     Start 7: dblas3

Kind Regards,
Dmitri

@martin-frbg
Collaborator

You did not mention the platform, but there were several unsafe assumptions, in particular in the x86_64 assembly code, that led to miscompilation with recent, more aggressively optimizing compilers. Is there any specific reason why you prefer staying with 0.3.5 from over a year ago rather than updating to the current version? (IIRC the assembly bugs were fixed in 0.3.6; more recent releases added and improved AVX512 support, among other things.)

@dkonst13
Author

Hi Martin,
The platform is x86_64-centos7-gcc9-opt. There is no particular reason to stay with 0.3.5.
The installation is supposed to be used in GRID, and AVX512 is too optimistic for it. We could compile with AVX512, but then it would fail on the many distributed nodes without AVX512 support. Right?

@martin-frbg
Collaborator

You could compile with DYNAMIC_ARCH=1 and TARGET set to the hardware of the weakest node. That way the AVX512 kernels get used on the nodes that support them, while the others still work with whatever is appropriate for them. One possible problem is that results may become less deterministic, since jobs can experience different rounding effects from FMA depending on which collection of nodes they get scheduled on.
BTW the key issue ticket with regard to your original question is probably #2009, so indeed something fixed a year ago.
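
To make the FMA rounding remark above concrete, here is a minimal Python sketch. It is not OpenBLAS code, and the function names are hypothetical. It emulates a fused multiply-add with exact rational arithmetic (one rounding at the end) and compares it with the ordinary two-step computation (rounding after the multiply and again after the add). The inputs are chosen by hand so that the two paths disagree.

```python
from fractions import Fraction


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """a*b + c with two roundings, as on a node without FMA kernels."""
    return a * b + c


def emulated_fma(a: float, b: float, c: float) -> float:
    """a*b + c computed exactly, then rounded once (FMA-like behaviour).

    Assumption: this ignores overflow and signed-zero corner cases,
    which a hardware FMA handles but which do not matter here.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Hand-picked inputs: a*b equals 1 - 2**-60 exactly, which is not
# representable as a double and rounds to 1.0 in the two-step path.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = separate_multiply_add(a, b, c)
fused = emulated_fma(a, b, c)

print("separate multiply and add:", unfused)
print("emulated fused multiply-add:", fused)
```

With these inputs the two-step path returns exactly 0.0, while the fused path returns -2**-60 (about -8.7e-19). Differences of this size are harmless in most cases, but they mean that bit-for-bit identical results across heterogeneous nodes should not be assumed.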

@dkonst13
Author

Thank you for such quick feedback! Let's close this ticket then. Bon weekend!
