make rand unbiased for floats
#33222
The current implementation of samplers for e.g. Float32 and Float64 effectively generates numbers as n*eps(T), where n is an integer between 0 and 1/eps(T)-1 (inclusive). E.g. for Float64 this range is 0:(2^52-1).

This makes the random draws (slightly) biased, by -eps(T)/2. The proposal is to add this term back to the result in the implementation.

Cf this discussion, especially the proposal of @perrutquist.

Comments
Just as a data point: to observe this bias, you would need a ballpark of 10^31 trials.
Possibly, but since we can do it right with a very simple correction, why shouldn't we?
cc @rfourquet
Some more checks (right column after adding the correction). This doesn't look too bad at first sight.
@mschauer: I am sorry, I don't understand the purpose of your exercise. The bias exists and can be described exactly. There is no need to simulate anything. The bias is correctable, rather cheaply. Whether it "looks bad" in some statistical tests that may or may not have power for detecting it (it is a rather small bias, after all) is not the issue here. The key question is whether there are any objections to correcting this bias on the table. Otherwise, I would just make a PR. @rfourquet, I would appreciate your opinion on this.
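For reference, a quick sketch of that exact description, using the 52-bit model stated in the issue (exact rational arithmetic, nothing simulated):

```julia
# Deterministic model of the 52-bit sampler: n*eps() for n in 0:(2^52 - 1),
# each with equal probability. The exact mean of this grid is (N-1)/(2N).
N = big(2)^52
bias = (N - 1) // (2N) - 1 // 2    # == -1 // 2^53
Float64(bias) == -eps() / 2        # true: the draws are biased by -eps()/2
```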
A good implementation of the correction should be possible.*

*) Using the information "rand() effectively generates numbers as n*eps(T), where n is an integer between 0 and 1/eps(T)-1 (inclusive)" you gave above.
Sorry if I came across as defensive, that was not the intention. The performance consequences of this change are negligible:

```julia
julia> using Random, BenchmarkTools

julia> rand_corrected(T) = rand(T) + eps(T)/2
rand_corrected (generic function with 1 method)

julia> Random.seed!(1);

julia> @btime foreach(_ -> rand(Float64), 1:1024)
  4.382 μs (0 allocations: 0 bytes)

julia> Random.seed!(1);

julia> @btime foreach(_ -> rand_corrected(Float64), 1:1024)
  4.368 μs (0 allocations: 0 bytes)
```
The performance loss wouldn't just be negligible, it would be exactly zero. The current implementation generates a float in [1, 2) from the random significand bits and subtracts a constant; i.e. the only change is in a compile-time constant.
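A sketch of that idea (the constants here illustrate the standard trick, and are not copied from Julia's internals): fix the exponent bits so that 52 random significand bits give a float in [1, 2), then subtract a constant. The correction only changes that constant, since 1.0 - eps()/2 == prevfloat(1.0) is itself exactly representable:

```julia
# 52 random bits in the significand, exponent set to that of 1.0:
# reinterpreting gives a uniform float in [1, 2).
u12(u::UInt64) = reinterpret(Float64, 0x3ff0000000000000 | (u & 0x000fffffffffffff))

rand_old(u::UInt64) = u12(u) - 1.0               # values n*eps(), biased by -eps()/2
rand_new(u::UInt64) = u12(u) - (1.0 - eps()/2)   # same cost, just a new constant
```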
My understanding is that this rearrangement is valid because the arithmetic involved is exact in this range.
I can verify the bias, using the grid n*eps(T) as a deterministic model for rand.
Although floating-point addition/subtraction is not associative in general, it's quite easy to show that equality holds exactly in this case.
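Assuming the rearrangement in question is (u - 1.0) + eps()/2 == u - (1.0 - eps()/2) for u in [1, 2) (consistent with the constant-folding described above), this is easy to spot-check: u - 1.0 is exact by Sterbenz's lemma, and the remaining quantities are representable, so both sides agree bit-for-bit:

```julia
# Spot check over a million floats u in [1, 2): both evaluation orders
# produce identical results, so the rearrangement introduces no rounding error.
ok = all(1:10^6) do _
    u = 1.0 + rand()
    (u - 1.0) + eps()/2 == u - (1.0 - eps()/2)
end
@assert ok
```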
We do not need to add anything at runtime; the correction can be folded into the existing compile-time constant.
The problem discussed here appears to be of very small significance in practice. See #8958 (comment) and #8958 (comment) |
Can you please elaborate on this? Which of the TestU01 tests do you think would pick this up? My understanding is that detecting an effect this small would require an astronomical number of draws.
If an issue with a pseudo RNG isn't picked up by the best random number test suite around, then I don't think there is an issue. To my understanding, TestU01 is the only meaningful way of evaluating pseudo RNGs for statistical applications.
I think that this is a pretty narrow view: in this case the bias is too small to be picked up by a black-box test, but it can be demonstrated analytically, and fixed very easily. TestU01 is a very nice suite, but I am not sure that even its authors would claim that it was designed to pick up everything, or conversely that whatever it doesn't show is by construction unimportant.
Thinking about it, having rand() return exactly 0 can cause real problems in practice.
Short explanation: a bivariate Normal(0,1) conditioned on having distance larger than 3 from the origin, sampled with symmetric random-walk Metropolis-Hastings. It should "never" produce values as large as 11, because P(norm(x) > 11)/P(norm(x) > 3) = exp(-(11^2 - 3^2)/2) ≈ 5e-25.
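A minimal sketch of that kind of experiment (a reconstruction, not the original code; the function names, starting point, and step size are made up). With a log-space acceptance test, a draw of rand() == 0 gives log(rand()) == -Inf, which accepts any proposal no matter how improbable:

```julia
using LinearAlgebra

# Log-density of a bivariate standard normal conditioned on norm(x) > 3.
logp(x) = norm(x) > 3 ? -sum(abs2, x) / 2 : -Inf

function mh_extreme(nsteps; scale = 0.5)
    x = [3.5, 0.0]                      # start inside the conditioned region
    lp = logp(x)
    extreme = 0.0
    for _ in 1:nsteps
        y = x .+ scale .* randn(2)      # symmetric random-walk proposal
        lpy = logp(y)
        # rand() == 0 makes the left side -Inf, so the step is accepted
        # even when lpy - lp is astronomically negative.
        if log(rand()) < lpy - lp
            x, lp = y, lpy
        end
        extreme = max(extreme, norm(x))
    end
    return extreme
end

mh_extreme(10^7)   # should "never" get anywhere near 11
```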
That's a nice demo.
Fixes JuliaLang#33222. Also changed some hardcoded test values accordingly.
There's nothing special about the tests in TestU01. Testing that the mean of values sampled from rand() matches the theoretical mean is just as valid a statistical test.
Would we fail that test suite because of the bias test? It would take a huge number of draws for the bias to be detectable. However, I am a fan of this issue and the associated PR.
No, detecting such a small bias using black-box statistical techniques is prohibitively expensive. From the CLT, you would need on the order of

```julia
julia> abs2(2*0.75/eps())
4.563542160821626e31
```

draws to detect this reliably. Even detecting a much cruder bias would be expensive.
That's correct; I got numbers of the same order (see my first post, which used a slightly simpler approach of just checking after how many samples the bias exceeds the standard error of the mean).
Check again after #40546? |
Ah, I guess the correction is eps/4 now instead of eps/2 due to the extra bit? |
Yes, exactly. And it would make the mean exactly 1/2.
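For reference, the same exact computation as before, now for the 53-bit grid (a sketch):

```julia
# Post-#40546 model: values k/2^53 for k in 0:(2^53 - 1), equally likely.
N = big(2)^53
bias = (N - 1) // (2N) - 1 // 2    # == -1 // 2^54
Float64(bias) == -eps() / 4        # true: half the old bias
```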
Adding eps/4 to a number in the range [0.5, 1) is problematic. It will increment it by either 0 or eps/2, since the exact result cannot be represented. (A "round to nearest, ties to even" policy will effectively remove one bit of entropy and make the output range include 1.)
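Both claims are easy to check at the REPL; whether eps()/4 is absorbed or rounded up by eps()/2 depends on the parity of the significand, since the exact sum is always a tie:

```julia
julia> 0.5 + eps()/4 == 0.5              # even significand: tie rounds down
true

julia> prevfloat(1.0) + eps()/4 == 1.0   # odd significand: tie rounds up to 1.0
true
```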
The formula rand(UInt64) * 2.0^-64 (using all 64 random bits, with the product rounded to nearest) would have a bias of eps/8192, and would also yield more entropy in the small random numbers. The only downside that I can think of is that it would return 1 with a probability of eps/4, which is mathematically correct, but may break some people's assumptions.
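A quick check of the endpoint behavior of that full 64-bit product (the stated eps/8192 bias and eps/4 probability of returning 1 both correspond to this formula):

```julia
# The largest UInt64 first rounds to 2.0^64 on conversion to Float64,
# so the product is exactly 1.0.
julia> typemax(UInt64) * 2.0^-64
1.0
```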
I get exactly 1.0 from prevfloat(1.0) + eps()/4.
Yes. (That's what I meant when I said that adding eps/4 would make the output range include 1.) If rand() should never return 1.0, then adding a correction term of eps()/4 is not an option.
Personally it seems cool to me if 0 and 1 are possible values, but that's really just subjective, and in practice I don't think it matters. So I like the idea of using all 64 bits.
The current implementation, from what I see, provides only 52 significant bits. In my experience, the shift-by-11-and-multiply method is practically equivalent in speed on modern CPUs and gives 53 significant bits, so I prefer it. But you can try to benchmark it in Julia and see what happens.

The "real" method to generate floats is linked at https://prng.di.unimi.it/random_real.c (I didn't invent it): it consists in finding a float with all available precision (something like 1000 bits) and then approximating to the closest representable float. In practice, you use one or two 64-bit values.

Note that you do want the [0..1) interval, or inversion won't work for discrete distributions. E.g., if you're generating the uniform distribution on the first n integers by multiplying a uniform value in [0..1) by n and taking the floor (inversion), this will work provided that 1 is never generated. This is just one example, but there are a number of reasons why [0..1) is the natural interval for a uniform float.

The formula that uses all 64 bits introduces some bias in the lower bits due to rounding (I read this in the Java documentation a loooong time ago). This can be bad or not, depending on what you intend by "uniform in [0..1)", given that the ideal distribution is continuous and whatever you do on a computer is, in the end, discrete.
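A sketch of the two techniques described above (function names are illustrative):

```julia
# Shift-by-11-and-multiply: keep the top 53 bits, scale into [0, 1).
rand53(u::UInt64) = (u >> 11) * 2.0^-53

# Inversion for a discrete uniform on 0:(n-1). This relies on
# rand53 never returning 1.0, so that the floor is always < n.
randbelow(u::UInt64, n::Int) = floor(Int, rand53(u) * n)
```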
Interesting, thanks! We still use the 52-bit version for our older RNGs so that they give the same values as before, but for the new xoshiro generator we're currently using shift-by-11-and-multiply. So it sounds like your recommendation is just to keep things as they are, IIUC.
OK, I just read the comment from @perrutquist about returning 1 when using all 64 bits. I never realized that. Yes, I think that's one of the reasons everybody uses 53 bits of precision. You really do not want to return 1. The classical C idiom for uniform generation in the unit interval is indeed (x >> 11) * 0x1.0p-53.

And yes, I would keep things as they are. If you want more precision for smaller numbers, the multiple-value generation used by the code I pointed to in my previous message is the way to go, but it's a slower execution path that contains a test.
Technically, (0, 1) is a subset of [0, 1), so I would not consider changing this breaking. That said, if it's too much of a hassle to change this, then we should just document the bias.
One thing that comes to mind is: do you know of other implementations trying to fix this bias? All the recent code I've seen uses the standard 53-bit technique, so it is possible that there is some unintended consequence in the change you're proposing. On the other hand, there are generators, such as MRG32k3a, which by design never return zero. This is why MRG32k3a fails PractRand bit-pattern tests.
It is worth noting that in the 52-bit case (where this issue started) the negative bias in the expected value was 1 ULP, but in the 53-bit case it is only 0.5 ULP, which is a commonly used threshold for when errors in floating-point functions are acceptable.
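Reading "ULP" here as the spacing at the mean value 0.5, i.e. eps(0.5), a quick check of these numbers:

```julia
julia> (eps()/2) / eps(0.5)    # 52-bit bias, in ULPs of 0.5
1.0

julia> (eps()/4) / eps(0.5)    # 53-bit bias
0.5
```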
In VectorizedRNG I mask random bits directly into the float's significand. I haven't seriously considered the consequences; I just didn't want the RNG to be able to return 0.
That's my question, but by "other" I mean "really other". Like, what do Fortran implementations do? Also, does that shift leave the uniform generation in ranges [0..2^k) perfectly uniform (I guess so)?
Just to make sure: there are representable numbers between the points of the 2^-53 grid that rand() can never return, right?
So what is the right number then? IIUC, eps()/4 is it. |
If rand() should never return 1.0, then eps()/4 cannot be used. With 53 bits of randomness, the best course is probably to add nothing, since the rounding following the addition would introduce unevenness in the distribution. On the other hand, I think eps()/2 is the best number to add with 52 bits of randomness, since no rounding would take place.
It seems to me that these three properties cannot all hold at the same time for a floating-point RNG: (1) every draw lies in [0, 1); (2) the mean is exactly 1/2; (3) the output values form an evenly spaced grid with equal probabilities.
For real numbers, yes, one can do unbiased, uniform sampling from [0, 1).
Triage feels that we should do what everyone else is doing, which seems to be generating a 53-bit sample, i.e. multiples of 2^-53 in [0, 1).
@oscardssmith noted that we could allow passing a rounding mode to rand.
I'll close this since the issue has been mitigated at least somewhat, and we've more or less decided that the current algorithm is ok. |
Apologies for adding to this old, long, and closed thread, but I encountered an example that I think is relevant context for the discussion: cuRAND, the RNG library bundled with CUDA, samples uniform floats in (0, 1], i.e. it excludes 0 and includes 1.
More elaboration here: JuliaGPU/CUDA.jl#1464 (comment) (note that I hadn't read this thread yet, so I was a bit ignorant about the subtleties of rounding).