WIP: Cholesky factorization of dual matrices #11
Conversation
Allow convenient creation of dual arrays from two parallel arrays RE and DU, and extraction of the epsilon part as an array.
Cholesky factorization using chol() for the real part and solving the matrix equation B = M'*U + U'*M for the epsilon part. This version is fairly fast, as it calls through to the BLAS for the heavy lifting. Requires additional work to play nicely with the Base.cholfact!() infrastructure.
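A minimal sketch of the idea described above, assuming current Julia's LinearAlgebra (cholesky rather than chol); dual_chol and its return convention are hypothetical, not the PR's actual code. If A = U'U and the dual matrix is A + εB, writing the factor as U + εM and dropping the ε² term gives B = M'U + U'M, which two triangular solves can untangle:

```julia
using LinearAlgebra

# Hypothetical sketch: Cholesky of A + ε*B.
# With A = U'U and the perturbed factor U + ε*M, dropping the ε² term
# in (U + εM)'(U + εM) gives B = M'U + U'M.
function dual_chol(A::Matrix{Float64}, B::Matrix{Float64})
    U = cholesky(Symmetric(A)).U        # real part: ordinary Cholesky
    # With Ψ = M / U (upper triangular), (U' \ B) / U = Ψ' + Ψ, so Ψ is the
    # upper triangle of that matrix with its diagonal halved.
    C = (U' \ B) / U
    Ψ = triu(C, 1) + Diagonal(diag(C)) / 2
    M = Ψ * U                           # epsilon part of the factor
    return U, M
end
```

Whether to pack U and M back into a single Matrix{Dual{Float64}} or keep them as the pair above is exactly the question discussed further down the thread.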
Great, Chris!
Ok, I will let you work on the tests and improve it, @mlubin; happy if you can commit to it.
Yes, it's incomplete; I should have been clearer that I didn't expect it to be merged quite yet (that was what the "WIP" was meant to indicate). Never mind though - I'll just create a new pull request for additional changes. I'm happy to do those; I just wanted to open the discussion.

@mlubin The current chol() interface for duals was more of a proof of concept than anything. It correctly gets called for chol(A) where A is a dual matrix, but nothing more. The cholfact!() and cholfact() docs say …
Seems like there's quite a bit more work to be done to support all the options. Lower or upper is easy, I guess - just transpose after the fact. Pivoting might require a little more work, though I guess it's just a matter of passing the pivot options along to cholfact!(real(A)) and applying the permutation correctly before computing the decomposition of epsilon(A). I'm not yet sure whether any of this can usefully be done in place. If we could reinterpret Matrix{Dual{Float64}} into a Matrix{Float64} with twice the number of rows we might have a chance. Do either of you know whether Julia gives any guarantees on memory layout which would let us do that safely?
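A hypothetical snippet (not part of the PR) of the layout being asked about, assuming current Julia and that Dual{Float64} is an isbits struct of two Float64 fields; reinterpreting along the first dimension gives exactly the doubled-row view:

```julia
using DualNumbers

A = Dual.(rand(3, 3), rand(3, 3))   # Matrix{Dual{Float64}}
# Reinterpreting along the first dimension yields a 6×3 Float64 view with
# real and epsilon parts interleaved, i.e. stride 2 within each column.
F = reinterpret(Float64, A)
size(F)                              # (6, 3)
```

The catch, as comes up later in the thread, is that BLAS has no element stride within a column, so it cannot consume this interleaved layout directly.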
I don't know the answer to this question, @c42f. Thanks for being willing to work on improving your Cholesky factorization code; I didn't realize this was not ready (I merged without having noticed the WIP). If you could amend your code, possibly with the help of @mlubin, that would be great, as I can't give it my immediate attention due to other ongoing coding work.
@c42f, I've reverted the commit so that you can make a clean PR. What I was thinking about was keeping the real and epsilon parts of the factorization as two separate real matrices rather than a single dual matrix.
Ok, thanks Miles. I can't reopen this PR since GitHub knows it's merged (unless one of you guys can?), but we can keep discussing it here until I've got some new code to talk about.

It's a good question whether we want the factorization to be a pair of Matrix{Float64} or a single Matrix{Dual{Float64}}. I was assuming we'd want the latter as output (if you look at my test chol() implementation you'll see that's what I produce at the moment). You're probably right that it's better to keep them split apart, though: as soon as you want to solve an equation using the factorization, you'll likely want to call the BLAS again for efficiency.

Looking at the code in base/linalg/factorization.jl, I see the Cholesky stuff is rather tied to LAPACK at the moment, and testing shows it doesn't even work with a BigFloat array yet. That probably means we're a bit ahead of ourselves trying to get it all working for duals.
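A rough illustration of the split representation (the type and function names are hypothetical, not code from this PR): keeping U and M as plain Float64 matrices lets a subsequent solve stay entirely in triangular BLAS/LAPACK kernels.

```julia
using LinearAlgebra

# Hypothetical container: real factor U (A = U'U) and the epsilon part M
# of the factor, stored as separate Float64 matrices.
struct DualCholesky
    U::UpperTriangular{Float64, Matrix{Float64}}
    M::Matrix{Float64}
end

# Solve (A + εB)(x + εy) = b + εc to first order in ε, where B = M'U + U'M.
function dual_solve(F::DualCholesky, b::Vector{Float64}, c::Vector{Float64})
    x  = F.U \ (F.U' \ b)                       # real part: A x = b
    Bx = F.M' * (F.U * x) + F.U' * (F.M * x)    # B*x without ever forming B
    y  = F.U \ (F.U' \ (c - Bx))                # epsilon part: A y = c - B x
    return x, y
end
```

Fed the U and M from the earlier sketch, everything here reduces to triangular solves and matrix-vector products on Float64 data.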
I wonder if it would work if the …
Yes, allowing …
@andreasnoackjensen, that doesn't quite resolve the issue because … The code needed to implement this will also surely double the complexity of the DualNumbers package, and I'm not sure it's worth it just to be able to call BLAS.
@mlubin At first, I thought the same, but instead of … it almost works for a … However, it is still not obvious that we want this. I am a little worried about the number of temporaries in these operations.
Calling BLAS can be a big deal performance-wise, at least vs. the generic implementation of matrix multiplication that dual matrices currently go through:

```julia
function dmul(A, B)
    rA = real(A)
    rB = real(B)
    dual(rA*rB, rA*epsilon(B) + epsilon(A)*rB)
end

A = dual(rand(1000,1000), rand(1000,1000))
B = dual(rand(1000,1000), rand(1000,1000))
```

Now comparing dmul(A, B) against the generic A*B: it's a lot of extra allocation, but I'd take that for a 10x performance boost! The question of whether DualNumbers should take on the complexity of making …
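For reference, the comparison being described could be reproduced with something like the following (hypothetical; the thread's actual timing output is not preserved here):

```julia
# Assumes dmul and the 1000×1000 dual matrices A, B defined above.
@time A * B;        # generic matrix multiply over Dual elements
@time dmul(A, B);   # three Float64 BLAS gemm calls plus temporaries
```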
I think the best solution would be to relax …
It's too bad that BLAS/LAPACK don't accept strides for matrices, since then we could just reinterpret a Matrix{Dual{Float64}} and hand the interleaved real and epsilon parts to BLAS directly instead of copying them out.
To put it more clearly, the primary use of …
That is right, but it is somewhat similar to the requirement that you cannot restrict your function arguments to …
I don't get this part. Can you elaborate?
@andreasnoackjensen There's no loss of exact typing when you write your function to take a number type …
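A tiny hypothetical example of the point under discussion (none of this code is from the thread): restricting an argument to Float64 blocks dual-number differentiation, while a Number-typed signature costs nothing in practice.

```julia
using DualNumbers

f_strict(x::Float64) = x*x + sin(x)   # cannot accept a Dual
f_generic(x::Number) = x*x + sin(x)   # works for Float64 and Dual alike

f_generic(Dual(1.0, 1.0))    # value and derivative of f at x = 1.0
# f_strict(Dual(1.0, 1.0))   # would throw a MethodError
```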
@mlubin I think that some of your message is missing.
Should be edited now.
The idea was to have … I don't know if it is feasible to differentiate through a …
Interesting conversation, guys, though it leaves me unsure about which way is best to proceed. I'm not entirely sure I want to take on the work of rearranging the whole package around allowing …
I agree. I also like to think about problems like this. I tried to compare …
@andreasnoackjensen - by "triangular Lyapunov solver" do you mean that the function …
I am at the sea intensively marking exams without a decent internet connection.
Sure, that could certainly be the case - I took choldn!() from the mailing list without looking much at it. A generic solution for Cholesky factorization will certainly be great to have. I think I found the pull request you're referring to... JuliaLang/julia#7236
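For context, a generic (unblocked, non-BLAS) Cholesky is only a few lines; this is a hypothetical sketch of the shape of such code, not the implementation in JuliaLang/julia#7236:

```julia
using LinearAlgebra

# Lower-triangular Cholesky for any element type supporting +, -, *, / and
# sqrt (e.g. Dual), unlike the LAPACK path, which requires BLAS floats.
# Only the lower triangle of A is read and overwritten.
function generic_chol!(A::AbstractMatrix)
    n = LinearAlgebra.checksquare(A)
    for k in 1:n
        A[k, k] = sqrt(A[k, k])
        for i in k+1:n
            A[i, k] /= A[k, k]
        end
        for j in k+1:n, i in j:n
            A[i, j] -= A[i, k] * A[j, k]
        end
    end
    return LowerTriangular(A)
end
```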
I just did some timings on 1000x1000 matrices. Unless the generic cholesky can be easily improved, it still seems valuable to have this specialized code. I'd vote for squeezing the specialized factorization result into a …
Ok cool, thanks for the extra testing, @mlubin. Along the same lines, it's probably worth having a specialized version of matrix multiplication for BLAS types, since that's so much faster. It's really too bad that gemm doesn't support striding, since the BLAS versions imply a lot of extra memory allocation.
As discussed on the mailing list, here's some code implementing Cholesky factorization of dual matrices in a reasonably efficient manner.
The current implementation doesn't play nicely with the cholfact!() infrastructure, so some additional changes will be required. I can do some digging to figure that out, but suggestions are also very welcome. This is close to my first attempt at production-quality Julia code, so there could well be some strange things I've done in places.