Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rmath broken #8874

Closed
lindahua opened this issue Nov 2, 2014 · 24 comments
Closed

Rmath broken #8874

lindahua opened this issue Nov 2, 2014 · 24 comments
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@lindahua
Copy link
Contributor

lindahua commented Nov 2, 2014

With recent changes in Julia 0.4, Rmath is now broken:

For example, using Rmath to generate random numbers, we get:

julia> const Rmath = "libRmath-julia"
"libRmath-julia"

julia> [ccall((:rbinom, Rmath), Float64, (Float64, Float64), 10, 0.5) for _ = 1:100]'
1x100 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0    0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

julia> [ccall((:rchisq, Rmath), Float64, (Float64, Float64), 10, 0.5) for _ = 1:100]'
1x100 Array{Float64,2}:
 9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0    9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0  9.0

julia> [ccall((:rnorm, Rmath), Float64, (Float64, Float64), 10, 0.5) for _ = 1:100]'
1x100 Array{Float64,2}:
 10.0  10.0  10.0  10.0  10.0  10.0  10.0  10.0    10.0  10.0  10.0  10.0  10.0  10.0  10.0

# runif just hang there
julia> [ccall((:runif, Rmath), Float64, (Float64, Float64), 1.0, 10.0) for _ = 1:100]'
^CERROR: interrupt
 in anonymous at no file   

Some important distributions in Distributions.jl are still relying on R-math for random number generations.

This is related: JuliaStats/Distributions.jl#293

cc: @simonbyrne, @ViralBShah

@lindahua lindahua added the bug Indicates an unexpected problem or unintended behavior label Nov 2, 2014
@lindahua
Copy link
Contributor Author

lindahua commented Nov 2, 2014

cc: @rfourquet

@ViralBShah
Copy link
Member

I believe this is because Rmath is using the dSFMT's global RNG and that is no longer being initialized. Initializing the global RNG will solve the immediate problem, but we should try to get rid of using Rmath's RNGs during the course of the 0.4 release.

Will get to doing this in the next few hours unless @rfourquet or someone else beats me to it.

@davidavdav
Copy link
Contributor

is this because Rmath has an evil license?

@rfourquet
Copy link
Member

I'm going to not have time to look at this for the next few days.

@andreasnoack
Copy link
Member

Can't we just use the quantile function of all the distributions where we haven't implemented our own rand yet? Maybe some will be slightly slower, but we are free of the patched Rmath.

@lindahua
Copy link
Contributor Author

lindahua commented Nov 2, 2014

@andreasnoack Generating a random number can be much more efficient than computing quantile. Using quantile is just a fallback when you have no other ways.

@lindahua
Copy link
Contributor Author

lindahua commented Nov 2, 2014

We are developing our own samplers, but at this point some important distributions still rely on Rmath.

@andreasnoack
Copy link
Member

I know. It was only meant as a temporary fix until we have our own implementations. I don't think that it is likely that Rmath get synced with the global RNG again.

@simonbyrne
Copy link
Contributor

Why don't we take this as an opportunity to get rid of Rmath from Base julia? From what I can tell, we would need:

  1. A "wrapper" package for Rmath library, which does all the BinDeps stuff: this can then be used as a dependency for Distributions.jl until we've completely replaced everything. We can just use the vanilla package without the RNG patch.
  2. New rand routines in Distributions: I think this is feasible by 0.4 (a lot of the routines are written, they need to be tested and tuned). Until then we can emit a warning for anyone using 0.4-dev that the random numbers are not synced with the julia global RNG.

@andreasnoack
Copy link
Member

I think that is a good idea. Du you have a place where the missing RNGs are tracked?

@dmbates
Copy link
Member

dmbates commented Nov 3, 2014

Is there a replacement for the incomplete beta function in Rmath? Getting a version of the incomplete beta that is comparable in coverage and accuracy to the one in Rmath has always been a roadblock to removal of Rmath.

@dmbates
Copy link
Member

dmbates commented Nov 3, 2014

Grepping for _jl_dist in src/univariate produces

grep -nH -e _jl_dist *.jl
beta.jl:13:@_jl_dist_2p Beta beta
binomial.jl:17:@_jl_dist_2p Binomial binom
chisq.jl:9:@_jl_dist_1p Chisq chisq
fdist.jl:11:@_jl_dist_2p FDist f
gamma.jl:18:@_jl_dist_2p Gamma gamma
hypergeometric.jl:14:@_jl_dist_3p Hypergeometric hyper
negativebinomial.jl:20:@_jl_dist_2p NegativeBinomial nbinom
noncentralbeta.jl:12:@_jl_dist_3p NoncentralBeta nbeta
noncentralchisq.jl:22:@_jl_dist_2p NoncentralChisq nchisq
noncentralf.jl:12:@_jl_dist_3p NoncentralF nf
noncentralt.jl:10:@_jl_dist_2p NoncentralT nt
poisson.jl:10:@_jl_dist_1p Poisson pois
tdist.jl:9:@_jl_dist_1p TDist t

Those all need to be replaced before Rmath can be removed.

@andreasnoack
Copy link
Member

I think the plan is to keep Rmath as a dependency because of the special functions. The important step now is to get rid of the dependence of the RNGs as they cannot be synced with the global state anymore. That should be much easier than matching the special functions.

@ViralBShah
Copy link
Member

@davidavdav We want to move away from Rmath to have a pure julia implementation that is more flexible to work with, and also avoids using GPL for a foundational library.

@ViralBShah
Copy link
Member

The reason Rmath is included in Base is just for convenience, so that Distributions works out of the box for people coming to Julia from R, sort of a good first date expereince. We can make the BinDeps process smooth, but it will still end up turning people away.

We certainly don't want to ship Rmath in 0.4. If we don't use the Rmath RNGs, then we no longer need the forked version either, and can use the stock version that ships on linux distros, etc.

@tkelman
Copy link
Contributor

tkelman commented Nov 3, 2014

@ViralBShah seems like the RNG issue is the only reason for privileging Distributions over any other package with binary dependencies. (If github stars are still our best proxy for usage, there are several other packages more popular than Distributions that have to rely on BinDeps right now.) If the RNG's are going to be decoupled so Rmath can be used with fewer modifications relative to upstream, seems it might be time to revisit its bundling with base.

@ViralBShah
Copy link
Member

Originally, we started with some of this stuff in Base. Then we got packages and distributions moved there. Rmath stayed mainly because it linked with our dSFMT and its RNGs used the same stream of random numbers as Julia. So, there is a little bit of history in having started out in Base. Other packages did not have this legacy.

Now that the tight linkages are no longer there, it is perfectly reasonable to stop bundling it in base. I am happy to stop doing that as soon as we have a decent BinDeps version and binaries wherever possible.

I do think we should have a julia distribution that includes lots of packages with their binary dependencies with documentation and tests, but that is a separate project altogether.

@tkelman
Copy link
Contributor

tkelman commented Nov 5, 2014

I am happy to stop doing that as soon as we have a decent BinDeps version and binaries wherever possible.

I can help with BinDeps, writing a WinRPM spec file, etc. Is Distributions.jl ready to use unpatched rmath? That would be a bit easier as far as using distribution binaries goes, and getting the opensuse maintainers to agree to the package request for WinRPM once it's ready.

I do think we should have a julia distribution that includes lots of packages with their binary dependencies with documentation and tests, but that is a separate project altogether.

Agreed.

@ViralBShah
Copy link
Member

The thing is that unpatched Rmath has slow RNGs. But it can definitely be used as is.

@ViralBShah
Copy link
Member

We can go with the unlatched one since Distributions.jl will soon stop using the Rmath RNGs.

@lindahua
Copy link
Contributor Author

lindahua commented Nov 5, 2014

We are working on gradually removing the use of Rmath in Distributions.jl, one function/distribution at a time. It is going to take quite a while. Before that we wish Rmath stays as it is.

@StefanKarpinski
Copy link
Member

We are working on gradually removing the use of Rmath in Distributions.jl, one function/distribution at a time. It is going to take quite a while. Before that we wish Rmath stays as it is.

Isn't moving moving Rmath to a package-specific binary dependency a natural step in that direction?

@nalimilan
Copy link
Member

IIUC, the problem is that, from a new Rmath.jl Julia package, it would be nicer to use a pre-built Rmath using BinDeps so that people don't need to compile it when they installed Julia from binaries (so that they don't need a C compiler, etc.). But for that it shouldn't use a custom Rmath, which isn't available as a package in distributions, and wouldn't be easily accepted by OpenSuSE/WinRPM maintainers (what @tkelman said). This is probably what @lindahua is worried about, since not using a custom Rmath would mean losing performance for random numbers generation.

@tkelman
Copy link
Contributor

tkelman commented Nov 5, 2014

Yes. Though we could just put up a dll of our custom version somewhere if we need to, so it's not a hard requirement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests

10 participants