Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chold! doesn't act as expected #1374

Closed
simonbyrne opened this issue Oct 13, 2012 · 23 comments
Closed

chold! doesn't act as expected #1374

simonbyrne opened this issue Oct 13, 2012 · 23 comments

Comments

@simonbyrne
Copy link
Contributor

As I understand it, chold! is supposed to return an in-place CholeskyDense: it outputs such a type, but its value stays as Array. Also, only the upper diagonal elements are changed, the lower diagonal elements are those of the original matrix.

julia> X = randn(5,5); A = X'*X
5x5 Float64 Array:
  3.6818     1.76559   1.03745  -2.91073  -0.814601
  1.76559    6.79412   1.04646  -1.67308  -6.84107 
  1.03745    1.04646   2.79321  -1.98242  -2.5977  
 -2.91073   -1.67308  -1.98242   4.88272   4.21862 
 -0.814601  -6.84107  -2.5977    4.21862  13.5007  

julia> chold!(A,true)
CholeskyDense{Float64}(5x5 Float64 Array:
 1.9188  0.920151  0.540677  -1.51695   -0.424536
 0.0     2.43874   0.2251    -0.113688  -2.64499 
 0.0     0.0       1.56531   -0.726144  -1.13253 
 0.0     0.0       0.0        1.42876    1.71584 
 0.0     0.0       0.0        0.0        1.44836 ,true)

julia> A
5x5 Float64 Array:
  1.9188     0.920151   0.540677  -1.51695   -0.424536
  1.76559    2.43874    0.2251    -0.113688  -2.64499 
  1.03745    1.04646    1.56531   -0.726144  -1.13253 
 -2.91073   -1.67308   -1.98242    1.42876    1.71584 
 -0.814601  -6.84107   -2.5977     4.21862    1.44836 
@StefanKarpinski
Copy link
Member

It's not possible to change the type of something in-place, so the behavior expecting isn't actually feasible. Perhaps this is an indication that chold! simply shouldn't exist since its behavior can't be made to match that of its non-in-place counterpart.

@StefanKarpinski
Copy link
Member

cc: @timholy

@timholy
Copy link
Member

timholy commented Oct 13, 2012

"its value stays as an array": CholeskyDense is a composite of two objects, and Array and a Bool. In that output, you're seeing those two components (note the "true" at the end).

What's possibly confusing is that that CholeskyDense is designed to act as a decomposition of A, in other words be interchangeable for A when used in linear algebra. For example:

julia> Ac = copy(A)
5x5 Float64 Array:
  6.4692     2.17151   -3.92343   -3.83961   -0.393393
  2.17151    4.54112   -3.09598   -1.09211   -0.745708
 -3.92343   -3.09598   12.3281     0.280427  -0.148782
 -3.83961   -1.09211    0.280427   6.62379   -1.92005 
 -0.393393  -0.745708  -0.148782  -1.92005    4.46919 

julia> C = chold!(A, true)
CholeskyDense{Float64}(5x5 Float64 Array:
 2.54346  0.853761  -1.54255  -1.5096    -0.154668
 0.0      1.95249   -0.91115   0.100757  -0.314295
 0.0      0.0        3.01968  -0.647885  -0.223115
 0.0      0.0        0.0       1.97863   -1.14545 
 0.0      0.0        0.0       0.0        1.72762 ,true)

julia> C\b
5-element Float64 Array:
 -0.189731
 -0.419205
 -0.250393
 -0.222415
 -0.419997

julia> Ac\b
5-element Float64 Array:
 -0.189731
 -0.419205
 -0.250393
 -0.222415
 -0.419997

In other words, this test shows that it's working as intended.

If you want the Cholesky factor, used factors(C). Or just call chol(C).

@timholy timholy closed this as completed Oct 13, 2012
@timholy timholy reopened this Oct 13, 2012
@timholy timholy closed this as completed Oct 13, 2012
@StefanKarpinski
Copy link
Member

Ah, clever. So the ! version basically uses the input array as a computational scratchpad but returns the decomposition still.

@simonbyrne
Copy link
Contributor Author

I get the decomposition thing---that's really quite useful---I guess I just expected chold!(A) to have the same result as A=chold(A).

@timholy
Copy link
Member

timholy commented Oct 13, 2012

I see your point, and it's a good one. (It took me a little to understand that was probably your thinking, sorry about that.) But as Stefan said, it's not possible. I guess ! also means "caution!" in this context :-). ! is actually pretty ambiguous: some functions have "dummy" inputs that really serve as outputs, for others (like sort!) they are genuine inputs and the output is returned in them, and yet others just use their inputs as scratch space. I have wondered about using chold!! on occasion, but in the end any naming convention will break down for functions that have multiple inputs and do different things to them.

I just noticed that your comments also point out that chold! is not quite working in place. The call to Lapack is in-place, but then it's calling triu or tril which creates a new matrix. So we have the worst of all worlds, it destroys the input but it also allocates a new matrix, which therefore doesn't really represent a savings. I just pushed a fix for this issue.

@simonbyrne
Copy link
Contributor Author

So does that mean A = chold!(A,true) won't need to create a new array in memory? If so, that's very handy.

@timholy
Copy link
Member

timholy commented Oct 13, 2012

Correct.

@timholy
Copy link
Member

timholy commented Oct 13, 2012

Indeed, I should say that C = chold!(A, true) also doesn't create a new array in memory. I would be tempted to rename it. I'm not sure (others could tell you with certainty), but I wonder if having variables change type in the middle of a function is less-than-ideal programming practice from the standpoint of making life easy on the compiler.

@pao
Copy link
Member

pao commented Oct 13, 2012

It depends. If A is just an identifier, in A = chold!(A, true) you're binding something new to it so it's as if you've really called it A1 and you call it that for the rest of its scope (which if I understand correctly is conversion to a static single assignment (SSA) representation.) Compilers figure that out. This is different than mutating the value of A into something with a different type, which is what the hypothetical side-effectful chold!(A, true) would have to do. As Stefan mentions, that can't be done, since the type of a value can't be changed.

Note that I was very careful to completely avoid the word "variable."

@daviddelaat
Copy link
Contributor

An alternative would be to use a type to indicate that an object is allowed to be destroyed:

type Destroy{T}
    val::T
end

chold{T<:LapackType}(A::Matrix{T}) = CholeskyDense{T}(copy(A), true)
chold{T<:LapackType}(A::Destroy{Matrix{T}}) = CholeskyDense{T}(A.val, true)

The exclamation mark could then be reserved for functions that do in-place modifications such as sort!.

@timholy
Copy link
Member

timholy commented Oct 16, 2012

That's not a bad idea. I'm a bit used to knowing that ! means I'd better read the documentation on a function, so I don't think this is essential, but it might mean fewer trips to the docs. That certainly has merit. What do others think?

@dmbates
Copy link
Member

dmbates commented Oct 16, 2012

My attitude was much like yours, @timholy. The ! in the function name is a sign of danger in that the function can modify its arguments. That's why almost all the functions in the Blas and Lapack modules have names ending in !.

To me it is a sign to the casual user that such functions should be avoided. The expert user who has read the docs can employ such functions to overwrite arrays thereby saving storage and enhancing performance but they should not complain about subtle bugs in their code introduced by the contents of arrays being changed.

So I think that the ! means, "you better read the docs carefully before using this function".

@daviddelaat
Copy link
Contributor

Yes, I think you're right about the ! indicating danger, and hence requiring the user to read the documentation. But not all functions that destroy their arguments do (and can) have exclamation marks:

A = [10. 2; 3 4]
CholeskyDense{Float64}(A, true)
println(A)

(Although maybe this example is a bit far-fetched.)

@StefanKarpinski
Copy link
Member

Right. That's a different but related convention, which is that when you give constructors an array, they "own" it. Of course, this is to avoid having to make copies of everything, but it can definitely be a usability issue.

@dmbates
Copy link
Member

dmbates commented Oct 17, 2012

As an inner constructor the function CholeskyDense{Float64} must have the same name as the type so the only way to insert a ! in the name is to change the name of the type as well. Although we expect that it will be very rare to call the inner constructor for a templated class directly, all the other constructors do end up calling the inner constructor eventually. Thus we are left with the option of mangling the name of the type, potentially confusing some users, or forcing a copy of the argument in the inner constructor. The latter choice would mean that all such objects would cause a copy of the matrix, which is a situation I would like to avoid. I would prefer to provide the ability to modify the array in place if the programmer so chooses.

All the Factorization types eventually call Lapack subroutines to generate the decomposition. These subroutines always overwrite some of their arguments, hence the option for overwriting is available. Saving memory allocation and garbage collection by overwriting is not important when dealing with arrays with tens or hundreds of elements. It is important when dealing with arrays having tens or hundreds of millions of elements. I write code that performs iterative optimization of parameters in models that can have matrices of this size in their representation and these arrays need to be updated at each evaluation of the objective function. I really want to do that updating in place. In fact, in the R code for these models that I currently have available, I go to a lot of trouble and complication linking compiled code with R code and passing objects that should not be modified but are, exactly to allow for in-place updating. In early versions of that code (the lme4 package for R) users had the experience of getting models fit and then having R fail because it is unable to allocate memory to construct the object to be returned from the model-fitting function. In other words, the results were calculated but could not be returned because of a copy operation. They weren't happy.

@timholy
Copy link
Member

timholy commented Oct 17, 2012

I think David's point is that you could, if you chose, have the following behavior: CholeskyDense{Float64}(A, true) makes a copy of A (so doesn't destroy it). In contrast, CholeskyDense{Float64}(destroy(A), true) does not copy of A, because you've marked A as being something you don't care about preserving, and uses A for its storage of the decomposition.

However, I agree with Stefan it's probably a good convention that when you pass things to a constructor, you're passing control of them to the new object. I think I follow that pretty naturally in my coding, but I suppose this is yet another chance to ask whether we need more documentation.

@daviddelaat
Copy link
Contributor

@timholy, that's indeed what I meant. The convention of passing control of the arguments in a type constructor to the type does seem to make a lot of sense, however. Adding Destroy methods everywhere would probably be too verbose.

@simonbyrne
Copy link
Contributor Author

If I have a triangular matrix, is there a way to construct a CholeskyDense instance? As I understand it, the constructor only accepts the full matrix, and then computes the Cholesky decomposition. But what if I already have the Cholesky decomposition computed?

This could be useful for sampling from Wishart (and inverse Wishart) distributions, which can be done via a Bartlett decomposition.

@dmbates
Copy link
Member

dmbates commented Oct 19, 2012

@simonbyrne At present the CholeskyDense{T} constructors perform the decomposition. You could always create a CholeskyDense{T} object of the appropriate size as, say, cld = chold!(eye(n), false) then replace the matrix as cld.LR = myBartlettMatrix

Actually, now that I think of it there is no need to match the size of the matrix used to create ``cld` with the size of the matrix you replace the factor by.

However, it is probably much easier to just call Lapack.potrs! directly if you want to solve systems of the form LL'x=b

@dmbates
Copy link
Member

dmbates commented Oct 19, 2012

@daviddelaat Regarding adding Destroy methods everywhere, I think it make much more sense to call fun!(copy(A)) than to try to do a lot of gymnastics within a function to have fun(Destroy(A)) work.

@simonbyrne
Copy link
Contributor Author

@dmbates Thanks. You can actually call cld = chold!(eye(1), false), and then replace it with a larger matrix to save having to compute the Cholesky of the identity.

However, perhaps it would be better if the constructor for CholeskyDense didn't compute the Cholesky at all: i.e. calls to CholeskyDense would take the triangular matrix as the argument. You could transfer the computation to the chold! function, or use convert?

This would have the benefit of being consistent with some of the other AbstractMatrix types, such as Tridiagonal and Woodbury.

@simonbyrne
Copy link
Contributor Author

Actually, I just realised CholeskyDense isn't a subtype of AbstractMatrix: why is that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants