Iron out ProjectTo
#467
I will do this, in a follow-up PR.
@sethaxen may know.
Nah, we do not need to
I think that PR makes sure that the result is right for complex numbers.
Yes, this is correct. That PR stopped short of projecting, since we had not yet agreed to do that (though a few other rules sneakily project already).
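For concreteness, the kind of projection that PR stopped short of could look roughly like this. This is a hand-rolled illustration only, not code from the PR or from ChainRulesCore; the name project_real is made up.

```julia
# Minimal illustration: for a primal that is real, the pullback's cotangent should be
# projected back onto the reals, i.e. any imaginary part picked up along the way is
# discarded. (Hypothetical helper, not the ChainRulesCore implementation.)
project_real(dx::Real) = dx
project_real(dx::Complex) = real(dx)

project_real(2.0 + 3.0im)  # 2.0
project_real(4.0)          # 4.0 (already real, unchanged)
```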
Things I find surprising [Edit -- added 8-12]:

julia> using SparseArrays, FillArrays, ChainRulesCore
julia> ProjectTo([1,2.0]')(ones(1,2)) # 1. fails to preserve adjoint vector
1×2 adjoint(::Matrix{Float64}) with eltype Float64:
1.0 1.0
julia> ProjectTo(Symmetric(rand(2,2)))([1 2; 100 0]) # 2. not a projection onto symmetric part
2×2 Symmetric{Float64, Matrix{Float64}}:
1.0 2.0
2.0 0.0
julia> ProjectTo(UpperTriangular(rand(2,2)))(Diagonal([1,2])) # 3. wasn't there a plan to allow smaller spaces?
2×2 UpperTriangular{Float64, Matrix{Float64}}:
1.0 0.0
⋅ 2.0
julia> ProjectTo(rand(2,2)')(ones(2,2)) # 4. seems like an unnecessary matrix copy, why is this a good idea?
2×2 adjoint(::Matrix{Float64}) with eltype Float64:
1.0 1.0
1.0 1.0
julia> ProjectTo(rand(1, 3))(Fill(2.0, 1, 3)) # 5. seems like an unnecessary materialization, why?
1×3 Matrix{Float64}:
2.0 2.0 2.0
julia> ProjectTo(true)(2.3) # 6. if Bool is categorical, this should be some Zero
ERROR: InexactError: Bool(2.3)
julia> ProjectTo(1)(2.3) # 7. but integers aren't, so this should work?
ERROR: InexactError: Int64(2.3)
julia> ProjectTo(ones(2,5))(ones(10)) == ones(10) # 8. isn't reshaping precisely why you save size, not just type?
true
julia> ProjectTo(ones(2,5))(ones(5,2)) == ones(5,2) # 9. but perhaps this should be an error, axes too different?
true
julia> s = sprand(3,10,0.2); p = ProjectTo(s); p(ones(3,10)) # 10. wasn't this example most of the reason not to store only the type?
ERROR: MethodError:
julia> ProjectTo([1 2; 3 4])((1,2,3,4)) # 11. would repair Zygote.gradient(x -> +(x...), [1 2; 3 4])
ERROR: MethodError ...
julia> ProjectTo([1 2 3])(ZeroTangent()) # 12. why materialise, instead of propagating Zero?
1×3 Matrix{Int64}:
 0  0  0

The merged PR JuliaDiff/ChainRulesCore.jl#385 seems like a more elaborate cousin of https://gist.github.com/mcabbott/8a84086cc604d34b5e8dff2eb3839f3a (which is a too-minimal example intended only to sketch the idea compactly). That has one more stage of indirection, in that (One could of course instead make
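To make the comparison easier to follow, here is a heavily simplified paraphrase of the shape both designs share, as far as I can tell: a projector is a closure built from the primal value, and applying it to a candidate tangent restores eltype and structure. This is an illustrative sketch only; the name project_like is made up, and it is not code from the PR or from the gist.

```julia
# Sketch only, not ChainRulesCore.jl#385 and not the gist: build a projector from
# the primal, then apply it to tangents to restore eltype and structure.
using LinearAlgebra

project_like(x::AbstractArray{<:Real}) = dx -> real.(reshape(dx, size(x)))  # drop imaginary parts, keep shape
project_like(x::Diagonal{<:Real})      = dx -> Diagonal(diag(dx))           # keep only the diagonal

project_like(rand(2, 2))([1+2im 3; 4 5im])    # 2×2 real matrix [1 3; 4 0]
project_like(Diagonal(rand(2)))([1 2; 3 4])   # Diagonal([1, 4])
```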
That is what we have, except it is spelt differently. The thing that is missing is that we don't have the overloads (yet) to allow subspaces of your subspace not to be projected.
This seems right, what is not being preserved?
What do you mean? What output is expected?
yeah, I think we should go and define a partial ordering on matrix types and allow that.
similar for
This should be a noncopying operation.
Yeah, these are probably bugs.
That's why I thought it clearer that the
But if T is concrete, it has no subspaces. So what exactly describes the subspace in question, is it just something we should remember, is there a comment, is there a definition in a test suite which will nag us later? This is why I like the abstract type idea, then it's in code. But perhaps it has other flaws I haven't seen yet.
The result ought to be
I don't think that
But it can't be, the layout in physical memory is different. Which is why the appropriate projector for
It is.
Subspace, not subtype (https://en.wikipedia.org/wiki/Linear_subspace). Diagonal is a subspace of symmetric, which is a subspace of dense matrix. The point of a subspace is that it matches the structural requirements of the "super"space and (if strict) adds some more. It captures what you were saying about "smaller", with Diagonal < Symmetric.
Subspaces in Julia are represented with wrapper types, not with a type hierarchy. It will end up in code when we go and define the identity projections. Though not all subspaces are captured in the type, SparseArrays being the example of ones that are not.
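For readers following along, one way the "partial ordering" and "identity projections" ideas above could be expressed with wrapper types is sketched below. The names Projector and project are hypothetical and this is not the ChainRulesCore API; it only illustrates passing a smaller-subspace tangent through untouched (cf. examples 2 and 3).

```julia
# Hypothetical sketch: a tangent already living in a smaller subspace (Diagonal ⊂ Symmetric)
# is returned as-is (an "identity projection"); anything larger is projected down.
using LinearAlgebra

struct Projector{T} end
Projector(x) = Projector{typeof(x)}()

project(::Projector{<:Symmetric}, dx::Diagonal)       = dx                        # Diagonal is already symmetric
project(::Projector{<:Symmetric}, dx::AbstractMatrix) = Symmetric((dx + dx') / 2) # project onto the symmetric part

project(Projector(Symmetric(rand(2, 2))), Diagonal([1.0, 2.0]))  # stays Diagonal
project(Projector(Symmetric(rand(2, 2))), [1 2; 100 0])          # Symmetric([1.0 51.0; 51.0 0.0])
```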
Right. But we can not call say
OK, yeah, that sounds bad. That should be fixed if that is happening.
I am aware! My proposal is to see whether the former could usefully be encoded in the latter. If it could, this might be a pretty scheme. As I said, maybe this has holes in it, but give me credit for having thought for more than a minute here.
Precisely. With the one exception of adjoint vectors, my example 1 above. And maybe they preserve

Under the abstract proposal, the projection operator for

The projection operator for
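As a concrete reading of example 1, a minimal hand-written version of "preserve the adjoint vector" might look like this. The helper project_rowvector is hypothetical and not the ChainRulesCore API; it only shows the intended structural type of the result.

```julia
# Hypothetical helper: project the cotangent of a row vector x' back to an
# adjoint-wrapped Vector rather than a 1×N Matrix, so the primal's structure is kept.
using LinearAlgebra

project_rowvector(dx::AbstractMatrix) = conj.(vec(dx))'  # adjoint(::Vector), entries equal to dx

project_rowvector(ones(1, 2))  # 1×2 adjoint(::Vector{Float64}), unlike example 1 above
```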
Sorry, communicating in writing is hard.
I think we are talking about the same thing then.
Sure. What I'm trying to reverse-engineer is sort of what the conceptual idea of the current design actually is. I was hoping there was a list of potentially awkward examples somewhere which guided this. It seems to always store the concrete type of

It seems to also store, but never use, a bunch of other stuff; I haven't understood the scope of what that's for (nor how to invent awkward edge cases for it).
I think this can be closed. Open questions are tracked at https://github.com/JuliaDiff/ChainRulesCore.jl/labels/ProjectTo
Following JuliaDiff/ChainRulesCore.jl#385 and #459, there are still some things to iron out, namely:

- Adding ProjectTo when writing rules, e.g. @scalar_rule x \ y (-(Ω / x), one(y) / x). Do we change the @scalar_rule macro, or do we just write out the rules that need to be projected? (A hand-written sketch of the latter option follows below.) Is there a performance impact? (I've tried for * and adding project appears to have no effect.)
- Projecting in ::typeof(sum), f, xs::AbstractArray breaks inference when an array of arrays is passed. Can we solve this?

Perhaps we want to solve some of these before merging the two PRs?
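On the first point, a written-out version of that scalar rule with explicit projection might look roughly like the sketch below. It uses a stand-alone, made-up name (rrule_backslash) so as not to suggest this is how it should land in ChainRules; the partials are the same ones given to @scalar_rule above, and ProjectTo/NoTangent are assumed to be the usual ChainRulesCore pieces.

```julia
# Illustrative sketch: write the rule out by hand and project each cotangent with a
# projector built from the corresponding primal.
using ChainRulesCore

function rrule_backslash(x::Number, y::Number)
    Ω = x \ y
    project_x = ProjectTo(x)
    project_y = ProjectTo(y)
    function backslash_pullback(ΔΩ)
        x̄ = project_x(conj(-(Ω / x)) * ΔΩ)    # partial wrt x: -(Ω / x)
        ȳ = project_y(conj(one(y) / x) * ΔΩ)  # partial wrt y: one(y) / x
        return NoTangent(), x̄, ȳ
    end
    return Ω, backslash_pullback
end

Ω, back = rrule_backslash(2.0, 3.0)
back(1.0)  # (NoTangent(), -0.75, 0.5)
```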