Correct bias in random float sampling. #33251
Conversation
Fixes JuliaLang#33222. Also changed some hardcoded test values accordingly.
I think this should have a news item.
Modify test comparisons accordingly. This reverts part of commit 12419e4.
I added a new sampler. I tried to keep things consistent, but since …
@nanosoldier
```julia
# NOTE: bias correction for OpenOpen01 samplers, see #33222
function rand(r::AbstractRNG, ::SamplerTrivial{OpenOpen01{Float16}})
    z = reinterpret(Float32, (rand(r, UInt10(UInt32)) << 13) | 0x3f800000)
    Float16(z - prevfloat(Float16(1)))
end
```
This can make `z` equal `prevfloat(Float32(2))`, so that `Float16(z - prevfloat(Float16(1)))` rounds to 1.
Hm, I am surprised; in #33222 (comment) I checked the extrema of the new generator. Or we should really use a hard-coded constant `Float16(reinterpret(Float32, (UInt32(i) << 13) | 0x3f800000) - 0.9995117f0)` as I proposed.
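For reference, a minimal sketch of such an extrema check, enumerating all 2^10 payloads the `Float16` sampler can draw (the loop index `i` plays the role of the random bits):

```julia
# Enumerate every possible 10-bit payload and compute the sampled value.
vals = [Float16(reinterpret(Float32, (UInt32(i) << 13) | 0x3f800000) - 0.9995117f0)
        for i in 0:1023]
extrema(vals)                                        # (Float16(0.0004883), Float16(0.9995))
@assert all(v -> Float16(0) < v < Float16(1), vals)  # strictly inside (0, 1)
```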
Sorry. I didn't think this fully through. You're right, it can't become 1.
@perrutquist, thanks for bringing this up anyway; I am always uneasy operating around `eps()`-level precision. And @mschauer, thanks for checking this.
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan
It would be great if someone could provide guidance about whether to address anything from the benchmarks. I don't have the perspective to say what's just acceptable noise and what I should investigate.
Sure, I will, but shouldn't a decision be made first? In the discourse thread it seemed that the preference was for keeping …
That's not how I read that discussion. People (including me) argued that …. This issue is kind of orthogonal: it is about fixing a bias. Incidentally, it makes random numbers in …
I think it's fine to provide a …
IIRC, one mentioned reason to favor …
I am not yet sure that this implies any slowdown (that is not fixable), so maybe it would be better to treat the issue of correctness separately from that. As I said in the other thread, if there is a bias we can fix, we should just fix it, even if it is "small". Regarding breaking changes: is there a commitment to the same random stream from a specific seed with the same RNG? I was under the impression that there wasn't.
I was referring to the breaking behavior that …
From the documentation of … There's some wiggle room of course, but I would at least call this a "minor change" that should not be in a patch release. As for code relying on `rand(Float16)` returning zero, I doubt that there's any in the wild, but if there is, code like

```julia
rand(Float16) == 0  # this will happen with probability 0.000977 (approximately)
```

would have to be rewritten as

```julia
rand(Float16) < eps(Float16)  # ≈ 0.000977
```

which gives the same result with or without the bias correction. (And I would argue that this code is also easier to read, as the probability is stated more explicitly.)
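A quick empirical sanity check of that probability (a sketch, assuming the current `[0, 1)` sampler for `Float16`):

```julia
using Random, Statistics
Random.seed!(0)
# Fraction of draws that are exactly zero; should be close to eps(Float16).
p = mean(rand(Float16) == 0 for _ in 1:10^7)  # ≈ 0.000977 == Float64(eps(Float16))
```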
It appears that, for example, … is slower than …. As I proposed hardcoding the constants two times before, let me just propose this once again.
Thanks for pointing this out again, I will experiment and modify the PR accordingly.
Let's hope this is actually the problem.
With an explicit conversion …
Another thing that might be a (very slight) performance issue is that dSFMT seems to be producing one more bit of entropy for `OpenOpen01` than for `CloseOpen01`. (Haven't verified this, but it looks that way from the fact that only half of the rands changed when filling an array.) Mixing in this bit probably costs a couple of clock cycles per SIMD vector.
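One hedged way to probe that from the Julia side: for `x` in `[0.5, 1)` the float spacing is `eps()/2`, so a 52-bit sampler always leaves the mantissa LSB zero there, while a sampler with one extra entropy bit would set it about half the time (a sketch, not a verified diagnosis):

```julia
using Random
# Keep only draws in [0.5, 1), where the representable spacing is eps()/2.
xs = filter(x -> 0.5 <= x < 1, rand(MersenneTwister(0), 10^5))
count(x -> isodd(reinterpret(UInt64, x)), xs)  # 0 for a 52-bit CloseOpen01 sampler
```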
@perrutquist: are you sure about this? I thought that was just rounding. For the "native" scalar samplers, here is a script that benchmarks everything, comparing the proposed samplers for …:

```julia
### NOTE run on https://github.com/JuliaLang/julia/pull/33251
using Random: AbstractRNG, SamplerTrivial, Sampler, GLOBAL_RNG, OpenOpen01, UInt10, UInt23,
CloseOpen12, OpenOpen01_64, CloseOpen01
using BenchmarkTools
RNG = GLOBAL_RNG
S16 = Sampler(RNG, Float16, Val(1))
S32 = Sampler(RNG, Float32, Val(1))
S64 = Sampler(RNG, Float64, Val(1))
C16 = SamplerTrivial(CloseOpen01{Float16}())
C32 = SamplerTrivial(CloseOpen01{Float32}())
C64 = SamplerTrivial(CloseOpen01{Float64}())
###
### Float16
###
function rand_const(r::AbstractRNG, ::SamplerTrivial{OpenOpen01{Float16}})
z = reinterpret(Float32, (rand(r, UInt10(UInt32)) << 13) | 0x3f800000)
Float16(z - 0.9995117f0)
end
function rand_conv(r::AbstractRNG, ::SamplerTrivial{OpenOpen01{Float16}})
z = reinterpret(Float32, (rand(r, UInt10(UInt32)) << 13) | 0x3f800000)
Float16(z - Float32(prevfloat(Float16(1))))
end
###
### Float32
###
function rand_const(r::AbstractRNG, ::SamplerTrivial{OpenOpen01{Float32}})
reinterpret(Float32, rand(r, UInt23()) | 0x3f800000) - 0.99999994f0
end
###
### Float64
###
function rand_const(r::AbstractRNG, ::SamplerTrivial{OpenOpen01_64})
rand(r, CloseOpen12()) - 0.9999999999999999
end
@btime rand($RNG, $S16);
@btime rand_const($RNG, $S16);
@btime rand_conv($RNG, $S16);
@btime rand($RNG, $C16);
@btime rand($RNG, $S32);
@btime rand_const($RNG, $S32);
@btime rand($RNG, $C32);
@btime rand($RNG, $S64);
@btime rand_const($RNG, $S64);
@btime rand($RNG, $C64);
```

With these results: …
The only regression is for the …; `@code_warntype` shows:

```julia
Body::Float16
1 ─ %1 = Random.UInt10(Random.UInt32)::Core.Compiler.Const(UInt10{UInt32}(), false)
│ %2 = Random.rand(r, %1)::UInt32
│ %3 = (%2 << 13)::UInt32
│ %4 = (%3 | 0x3f800000)::UInt32
│ (z = Random.reinterpret(Random.Float32, %4))
│ %6 = z::Float32
│ %7 = Random.Float16(1)::Core.Compiler.Const(Float16(1.0), false)
│ %8 = Random.prevfloat(%7)::Core.Compiler.Const(Float16(0.9995), false)
│ %9 = (%6 - %8)::Float32
│ %10 = Random.Float16(%9)::Float16
└── return %10
julia> @code_warntype rand(RNG, S32)
Variables
#self#::Core.Compiler.Const(rand, false)
r::Core.Compiler.Const(Random._GLOBAL_RNG(), false)
#unused#::Core.Compiler.Const(SamplerTrivial{OpenOpen01{Float32},Float32}(OpenOpen01{Float32}()), false)
Body::Float32
1 ─ %1 = Random.UInt23()::Core.Compiler.Const(UInt23{UInt32}(), false)
│ %2 = Random.rand(r, %1)::UInt32
│ %3 = (%2 | 0x3f800000)::UInt32
│ %4 = Random.reinterpret(Random.Float32, %3)::Float32
│ %5 = Random.prevfloat(1.0f0)::Core.Compiler.Const(0.99999994f0, false)
│ %6 = (%4 - %5)::Float32
└── return %6
```

which shows that constant folding is not happening for …. In any case, I will fix the ….
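A possible way to sidestep the failed constant folding, reusing the imports from the script above (a sketch; `FLOAT16_CORRECTION` and `rand_hoisted` are illustrative names, not part of this PR):

```julia
# Hoist the correction into a global constant so no runtime prevfloat call remains.
const FLOAT16_CORRECTION = Float32(prevfloat(Float16(1)))  # == 0.9995117f0

function rand_hoisted(r::AbstractRNG, ::SamplerTrivial{OpenOpen01{Float16}})
    z = reinterpret(Float32, (rand(r, UInt10(UInt32)) << 13) | 0x3f800000)
    Float16(z - FLOAT16_CORRECTION)  # subtracts a compile-time constant
end
```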
fix suggested by @mschauer, thanks!
No. I'm not sure. Looking at the dSFMT code, it seems it sets the least significant bit to 1 before subtracting 1.0, rather than directly subtracting `prevfloat(1.0)` as we do. This would waste one bit of entropy and one clock cycle. However, looking at the random number vector in the test code, there seems to be randomness in the `eps()/2` bit, so maybe I'm looking at the wrong piece of code. I think this is the relevant part of the dSFMT code:

```c
#if defined(HAVE_SSE2)
...
inline static void convert_o0o1(w128_t *w) {
    w->si = _mm_or_si128(w->si, sse2_int_one);
    w->sd = _mm_add_pd(w->sd, sse2_double_m_one);
}
...
#else
...
inline static void convert_o0o1(w128_t *w) {
    w->u[0] |= 1;
    w->u[1] |= 1;
    w->d[0] -= 1.0;
    w->d[1] -= 1.0;
}
...
#endif
```
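To make the two constructions concrete, a hedged Julia sketch of both, starting from the same 52 random mantissa bits (variable names are illustrative):

```julia
using Random
u = 0x3ff0000000000000 | (rand(UInt64) >>> 12)      # bits of a double in [1, 2)
x_dsfmt = reinterpret(Float64, u | 0x01) - 1.0      # dSFMT: force the LSB, subtract 1.0
x_prev  = reinterpret(Float64, u) - prevfloat(1.0)  # this PR: subtract prevfloat(1.0)
# Both land in (0, 1); the first discards the mantissa LSB, the second keeps it.
```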
@nanosoldier
Triggering Nanosoldier requires commit access to Julia. @nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan
I am wondering if the benchmark improvements/regressions are just noise: in some cases, nothing was changed, in other cases I compared directly and there should be no performance hit. @rfourquet raised an important point above: it is still not decided whether a change like this would be accepted. It would be great if triage could discuss this.
I'm skeptical of this. Random floats are supposed to produce samples uniformly in [0,1), but of course that is only up to roundoff error, and any practical application will surely have many other accumulated roundoff errors that swamp a bias of …. It doesn't seem reasonable to even slightly slow down all random-number generation—or just change the statistical distribution, for that matter—to correct an effectively invisible bias. Especially since virtually any nontrivial calculation using …
FWIW I don't think there is a slowdown; those numbers are just benchmarking noise. This of course should be investigated if triage is open to the idea, but there should be literally no cost to fixing this, as we are just subtracting a different constant (and dSFMT also supports this natively). Also, as @mschauer demonstrated in the discussion of #33222, the bias is actually detectable for …. Incidentally, a possible benefit is that algorithms that need to branch for …
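For a sense of scale, a sketch of the mean bias, assuming the old `Float16` sampler returns `k/1024` for `k = 0:1023` and the corrected one `(2k + 1)/2048`:

```julia
# Exact means of the two discrete distributions over all 1024 outcomes.
old_mean = sum(k / 1024 for k in 0:1023) / 1024         # 0.49951171875 == 0.5 - 2^-11
new_mean = sum((2k + 1) / 2048 for k in 0:1023) / 1024  # exactly 0.5
```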
This makes triage nervous. It's ok to add the open-open sampler and make it available, but we don't want to change the default without a very compelling argument. We also discussed that closed-closed is very defensible, since it's what you'd get if you sampled ideal real numbers and then rounded (some numbers would round to exactly 0 or 1). Anyway, we should not rush to change this.
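A tiny illustration of that closed-closed argument (a sketch; `rand()` stands in for sampling an ideal real):

```julia
x = rand()   # stand-in for an ideal uniform real on [0, 1]
Float16(x)   # round-to-nearest can yield exactly Float16(0) or Float16(1)
```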
I understand. I will revert the changes that make the new sampler default, and then this PR will just add …
Closing based on this comment: #33222 (comment)