Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broadcasting + better CuArrays support #21

Merged
merged 1 commit into from
May 4, 2020
Merged

Broadcasting + better CuArrays support #21

merged 1 commit into from
May 4, 2020

Conversation

colinxs
Copy link
Contributor

@colinxs colinxs commented Apr 17, 2020

  1. Defines Base.similar(A::ElasticArray, ::Type{T}, dims::Dims{N}) to return an ElasticArray.
  2. Changes broadcasting to return an ElasticArray
  3. Removes Base.similar(::Type{ElasticArray{T}}, dims::Dims{N}); the definition for AbstractArray in Base is sufficient.
  4. Removes parent(A::ElasticArray). Several functions in LinearAlgebra depend on size(A) == size(parent). Given that ElasticArray <: DenseArray and we define Base.pointer this doesn't affect the ability to hit BLAS. Not sure what the other ramifications would be. The other alternative that works but feels wrong is parent(A::ElasticArray) = reshape(A.data, size(A)) but that seems worse?
  5. Adds Adapt.adapt_structure(to, A::ElasticArray) so the CuArrays.cu(A::ElasticArray) returns an ElasticArray wrapping a CuArray.

For A::ElasticArray, A * A, 2 * A, and inv(A) now return an ElasticArray. The goal is for matrix operations to always return an ElasticArray, but without more test coverage I'm no sure if I'm missing something.

This depends on the changes in #20 so we can wait until that's merged. I also need to add a test for adapt_structure. I still hit scalar indexing with CuArrays, so need to figure that out as well.

@oschulz
Copy link
Collaborator

oschulz commented Apr 18, 2020

Could you rebase this to the current master? Will make review easier.

@oschulz
Copy link
Collaborator

oschulz commented Apr 21, 2020

Bump @colinxs :-)

@codecov-io
Copy link

codecov-io commented Apr 21, 2020

Codecov Report

Merging #21 into master will decrease coverage by 3.77%.
The diff coverage is 58.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #21      +/-   ##
==========================================
- Coverage   89.85%   86.07%   -3.78%     
==========================================
  Files           3        3              
  Lines          69       79      +10     
==========================================
+ Hits           62       68       +6     
- Misses          7       11       +4     
Impacted Files Coverage Δ
src/elasticarray.jl 85.52% <58.33%> (-3.87%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48d3ec5...db6cbb8. Read the comment docs.

@colinxs
Copy link
Contributor Author

colinxs commented Apr 21, 2020

Thanks for the reminder @oschulz! I rebased it.

I'll try to make that first FlattenedArray PR in the next few days.


Base.parent(A::ElasticArray) = A.data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove Base.parent? Or did it just move somewhere else (don't see it right now)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because at least one stdlib function depends on size(A::AbstractArray) == size(parent(similar(A, ...))). Here's a snippet from LinearAlgebra.inv:

Ai = inv!(lu(AA))
Ai = convert(typeof(parent(Ai)), Ai)

Before adding Base.similar(A::ElasticArray, ...), inv!(lu(AA)) returned an Matrix, but now it returns an ElasticArray. So when we do convert(typeof(parent(Ai)), Ai) we try to convert a matrix into a vector (because typeof(parent(Ai)) <: AbstractVector), which fails.

If we change Base.similar to Base.similar(A::ElasticArray, T::Type, dims::Dims) = similar(A.data, T, dims) as mentioned above then we can add back in Base.parent.

But from the docs for Base.parent

Returns the "parent array" of an array view type (e.g., SubArray), or the array
itself if it is not a view.

Is ElasticArray really a "view type"? Do we need Base.parent? I'm unsure :)

@oschulz
Copy link
Collaborator

oschulz commented Apr 22, 2020

Ok, done with the review - the only tricky thing is Base.similar(A::ElasticArray, ...), see my comments.

@oschulz oschulz merged commit f468be5 into JuliaArrays:master May 4, 2020
@oschulz
Copy link
Collaborator

oschulz commented May 4, 2020

Thanks again for this, @colinxs!

Can you battle test it in your application(s) for a few day, before we make a release? Or do have have further changes that you'd like to make first?

@garrison
Copy link
Contributor

garrison commented May 6, 2020

Just saw the discussion (sorry I did not reply sooner). I will test master with all my existing code and let you know if anything breaks.

@colinxs
Copy link
Contributor Author

colinxs commented May 6, 2020

@oschulz yup! No worries. My only remaining concerns are:

  1. Add a test for adapt_structure.
  2. I was hitting scalar indexing when the parent was a CuArray, so we'll need to look into how to properly wrap a CuArray. Perhaps @aterenin could take a look at this as well since he's earlier PR suggested he was ElasticArrays + CuArrays.
  3. We test that for A::ElasticArray, A * A, 2 * A, and inv(A) return an ElasticArray, but we might be missing something.
  4. We should probably do a benchmark comparison between this PR and an Array for array math and broadcasting to make sure this doesn't slow anything down.

@aterenin
Copy link
Contributor

aterenin commented May 6, 2020

My PR was very tiny and made in order to allow compatibility, and make things like this feasible. To fix scalar getindex, I think one would have to implement something that tells the broadcasting machinery that it should promote broadcasting to the child array. I don't know the internals for broadcasting well enough to know how to do this.

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

@garrison , thanks, I'll wait for your feedback before making a release.

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

@aterenin, sorry, which PR?

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

@colinxs: My only remaining concerns are [...]

Sure, we don't need to rush a release, please do let me know how the benchmarks turn out. I have a feeling we're all aiming for top performance here.

@garrison
Copy link
Contributor

garrison commented May 6, 2020

Happy to report that all of my code works as expected with master.

One thing I did notice a few weeks ago is that with #20, it is now possible to append! an array whose size is incompatible with the ElasticArray, as long as the length of the iterable agrees. I'm a bit hesitant about this, but also welcome that ElasticArrays will now work with appending arbitrary iterables rather than arrays. Perhaps an additional check should be done, though, if it is something that has a definite size?

@garrison
Copy link
Contributor

garrison commented May 6, 2020

Also, if there is to be any hope of recovering after an exception, shouldn't _check_size be called before actually modifying any data?

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

One thing I did notice a few weeks ago is that with #20, it is now possible to append! an array whose size is incompatible with the ElasticArray, as long as the length of the iterable agrees.

You mean if the shape is wrong, but the length compatible? If so, I'd say that since

append!(rand(6), rand(3,4))

is allowed

append!(ElasticArray{Float64}(undef, 3, 4), rand(6))

should be allowed too.

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

Also, if there is to be any hope of recovering after an exception, shouldn't _check_size be called before actually modifying any data

Oh, yes, that would probably make sense! :-)

@colinxs ?

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

Thanks for helping out with testing, @garrison !

@garrison
Copy link
Contributor

garrison commented May 6, 2020

If so, I'd say that since

This is a good point. But one could also make the case that append!(rand(6), rand(3,4)) is different conceptually from append!(rand(3,4), rand(6)), and the latter is more like the example you gave. My concern here is that somebody will call something like ElasticArray{Float64}(undef, 3, 4), rand(4, 3)) and not get the expected result/data alignment.

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

Hm, yes ... maybe always allow to append a vector or iterator, but higher-dimensional arrays only when compatible? @colinxs , what do you think?

@colinxs
Copy link
Contributor Author

colinxs commented May 6, 2020

Also, if there is to be any hope of recovering after an exception, shouldn't _check_size be called before actually modifying any data

I thought about doing that, but that requires either:

  1. Iterating through iter twice, first the check the sizes, the second to append the data (2N iterations + 1 resize! + 1 copyto!)
  2. Appending each element one-by-one (N iterations + N resize!s + N copyto!s)
    It may not be possible to iterate over iter twice.

If you look at e.g. Base.copyto! for an AbstractArray it may fail partway through, but I agree this is a bit different since the ElasticArray may end up in a corrupted state. So perhaps the best middle ground is append iter in batch if iter::AbstractArray (as we can check the size beforehand), otherwise do (2).

This is a good point. But one could also make the case that append!(rand(6), rand(3,4)) is different ...

Maybe, but that starts to treat ElasticArray as an array-of-arrays, which its not. As for user's making errors, god forbid they do append!(ElasticArray{Float64}(undef, 3, 3), rand(3,3). No size checking helps you there haha. Both sides have good points, but I'm partial to allowing append!(ElasticArray{Float64}(undef, 3, 4), rand(4,3).

cc : @garrison @oschulz

@colinxs
Copy link
Contributor Author

colinxs commented May 6, 2020

Appending each element one-by-one (N iterations + N resize!s + N copyto!s)
It may not be possible to iterate over iter twice.

I just realized that you could optimize this as well by using HasLength/HasEltype and re-creating the machinery in Base._append!

@oschulz
Copy link
Collaborator

oschulz commented May 6, 2020

Hm, what about a try/catch, and we truncate/revert and rethrow if necessary?

@colinxs
Copy link
Contributor Author

colinxs commented May 7, 2020

Seems kind of hacky (and try/catch has it's own performance issues). I think our best option is to just recreate the machinery in Base.

@colinxs
Copy link
Contributor Author

colinxs commented May 7, 2020

@oschulz @garrison

Boom done: #23

@colinxs
Copy link
Contributor Author

colinxs commented May 7, 2020

Also, in reference to the discussion whether append!(ElasticArray{Float64}(undef, 3, 4), rand(4,3)) should be allowed or not, it's always been possible with ElasticArrays, #20 didn't change that. Here's the previous definition:

function Base.append!(dest::ElasticArray, src::AbstractArray)
    rem(length(eachindex(src)), dest.kernel_length) != 0 && throw(DimensionMismatch("Can't append, length of source array is incompatible"))
    append!(dest.data, src)
    dest
end

So it only checked that length(src)==dest.kernel_length and not front(size(src)) == dest.kernel_size.

With that in mind I think we should go ahead and tag a release that includes #20 and #23. I don't have time right now to address the points in #21 (comment), so if we don't feel comfortable with that right now you can go ahead and revert f468be5 and re-open this PR for a later merge.

@oschulz
Copy link
Collaborator

oschulz commented May 7, 2020

it's always been possible with ElasticArrays, #20 didn't change that.

Yes, I think we should discuss this in a separate issue, it shouldn't block the release. @garrison, since you brought it up, do you want to create an issue on append and shape of the "appendee"?

@oschulz
Copy link
Collaborator

oschulz commented May 7, 2020

Ok, all merged - I would call the result v1.1, I'd say the changes we not really breaking. Or would you prefer a v2, @colinxs and @garrison?

@garrison
Copy link
Contributor

garrison commented May 7, 2020

@garrison, since you brought it up, do you want to create an issue on append and shape of the "appendee"?

Sure.

I would call the result v1.1, I'd say the changes we not really breaking.

Sounds good to me.

@oschulz
Copy link
Collaborator

oschulz commented May 7, 2020

Ok, then I'll do a few more application tests myself over the weekend, and then tag a release early next week.

@colinxs
Copy link
Contributor Author

colinxs commented May 7, 2020

Will this next release include f468be5? Or do you plan to revert it temporarily? I think either option is fine, just curious.

As for the next version, technically it should be v2.0.0 because the addition of the type parameter V is a breaking change.

@oschulz
Copy link
Collaborator

oschulz commented May 7, 2020

Will this next release include f468be5?

My plan is to base the release on the current master, no reverts.

@colinxs
Copy link
Contributor Author

colinxs commented May 7, 2020

Okay, sounds good! Any chance you'd be able to tag a release today? Haha I'm waiting to make a few releases that depend on #20 and #23. For what's it's worth, the fork of ElasticArrays that I've been using for a few weeks includes those changes, so it's fairly well tested.

@colinxs
Copy link
Contributor Author

colinxs commented May 7, 2020

I'll be on hand to patch any bugs that may come up :)

@oschulz
Copy link
Collaborator

oschulz commented May 7, 2020

I'll be on hand to patch any bugs that may come up :)

Now that's an offer! :-)

Here we go: JuliaRegistries/General#14361

@colinxs
Copy link
Contributor Author

colinxs commented May 7, 2020

Awesome! Thank you :)

@oschulz
Copy link
Collaborator

oschulz commented May 7, 2020

You're very welcome - thanks for contributing!

@oschulz oschulz mentioned this pull request May 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants