FIX: avoid overflow on overflow check in halley_step on Apple M1 #937
I'm sure I'm missing something obvious here, but if denom >= 1 there can be no overflow, and if it's < 1 then denom*numeric_limits<>::max() can't overflow either? Or is a logical branch which should never be taken being speculatively evaluated?
That's what I thought at first because we have seen that before on Apple Silicon, but splitting the expression out into two if statements (which resolved this previously) didn't resolve it locally for me and I don't know of another way to prevent the speculative execution. So if it might be executed, the change in this PR was the only thing I could find that avoided the
include/boost/math/tools/roots.hpp
Outdated
// |denom| >= |num| * max_value
// RHS may overflow on Apple M1, so rearrange:
// |denom| * 1/max_value >= |num|
constexpr T inv_max_value = 1.0 / tools::max_value<T>();
- constexpr T inv_max_value = 1.0 / tools::max_value<T>();
+ const T inv_max_value = 1.0 / tools::max_value<T>();
Even better, make that static const, and then it's evaluated just once.
I wanted static constexpr, but wasn't sure that is supported everywhere we want, and indeed, it appears even constexpr is problematic. Will implement this right away.
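For readers following the thread, the rearrangement under review, with the static const suggestion applied, can be sketched roughly as below. This is an illustration only and not the code in roots.hpp: the function names are hypothetical, std::numeric_limits<T>::max() stands in for tools::max_value<T>(), and the |num| < 1 guard that the first comment alludes to is left out so the overflow path is visible.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not Boost's API): true when denom / num could overflow,
// written in the original form |denom| >= |num| * max_value.
// The product on the right overflows to +inf whenever |num| > 1.
template <class T>
bool quotient_may_overflow_naive(T num, T denom)
{
    return std::fabs(denom) >= std::fabs(num) * std::numeric_limits<T>::max();
}

// Hypothetical helper (not Boost's API): the same test rearranged as
// |denom| * (1 / max_value) >= |num|, so nothing is ever multiplied by max_value.
template <class T>
bool quotient_may_overflow_rearranged(T num, T denom)
{
    // static const, as suggested in review: initialised once per instantiation.
    static const T inv_max_value = T(1) / std::numeric_limits<T>::max();
    return std::fabs(denom) * inv_max_value >= std::fabs(num);
}

// Small usage check: both forms agree on inputs away from the boundary.
inline void self_check()
{
    assert(quotient_may_overflow_rearranged(1e-300, 1e10));
    assert(!quotient_may_overflow_rearranged(2.0, 1.0));
    assert(quotient_may_overflow_naive(1e-300, 1e10)
           == quotient_may_overflow_rearranged(1e-300, 1e10));
}
```

One trade-off worth keeping in mind: for double, 1 / max_value is a subnormal number, so the rearranged product can lose precision or underflow for small |denom|, and the two forms may disagree for inputs very close to the boundary.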
99773d8 to 4a8903e
Superseded by #945
[number]*tools::max_value<T>() to prevent overflows on Apple M1 processors