give `sum(T::Type, xs) = zero(T)` when xs is empy #20561

felixrehren · 2017-02-10T17:10:38Z

From #5311, we allow notation like sum(T::Type, xs) to cast the return type of the sum. Currently

julia> sum(Float64, [])
ERROR: ArgumentError: reducing over an empty collection is not allowed

With this PR:

julia> sum(Float64, [])
0.0

This lets you avoid writing boilerplate if isempty(xs) ... else ... code around what should be a simple sum(T, xs), by handling the case when xs is empty in the natural way.

This is achieved by extending mr_empty, which dispatches on f::Callable, op, T::Type. Here Callable = Union{Type, Function}. When f::Function, there is currently nothing we can say in general (some special cases are defined, like identity and abs2). When f::Type, however, the user has indicated what resulting type he would like: therefore, when op = +, return zero(f), and when op = *, return one(f). This also enhances the behaviour of sum(T::Type, xs) and prod(T::Type, xs).

Note this introduces a new error possibility:

julia> sum(String, [])
ERROR: MethodError: no method matching zero(::Type{String})

but I feel like anyone who tries to sum strings or any type T without a zero(T) has only himself to blame. And of course

julia> prod(String, [])
""

We allow users to call `sum(T::Type, xs)`. Currently, e.g. `sum(Float64, [])` gives an error because the `eltype` of `[]` is `Any` and so inference fails. But the fact that the user supplied a type means that we know the intended `eltype`, so we can use it to set e.g. `sum(Float64, []) = zero(Float64)`.

ararslan · 2017-02-10T20:53:44Z

I don't think the behavior sum(T, Any[]) == zero(T) is sound. If the element type of the input can be anything, it's not safe to assume that one could convert any given element to T if the input collection was non-empty. So I don't think it makes sense to make that assumption in the empty case.

pabloferz · 2017-02-10T20:57:49Z

Also, if you're going to write sum(Float64, []) you might just write sum(Float64[]) instead, which already works.

oscardssmith · 2017-02-10T21:52:14Z

The place where this is better is stuff like sum(Float64, x) where x is an array of Float32's. This would allow the removal of current sum promotion magic

felixrehren · 2017-02-10T22:10:22Z

@ararslan The caller of sum(T, xs) is expecting eltype(xs) to be convertible to T. It's their responsibility to make sure that expectation is right, otherwise they'll be hit by errors whenever xs is nonempty. So the assumption doesn't hold in the empty case in general -- but, conditional on the caller having decided this is what they need, I think it's reasonable. [edit: callee -> caller, got it backwards before ...]

@pabloferz Agreed, of course. And someone who was going to write sum(Float64[]) in the first place might then just write zero(Float64), or just go straight for 0.0... It's useful for the cases when you write sum(Float64, xs), and sometimes xs may be empty. It lets you avoid writing an additional boilerplate if isempty(xs) ... else ... case (edited to suggest that above, too).

@oscardssmith I'm not familiar with that, please could you explain?

@tkelman will add tests if some consensus on usefulness emerges 👍

ararslan · 2017-02-10T22:19:51Z

conditional on the callee having decided this is what they need, I think it's reasonable.

That can't be assumed in general, especially when writing generic code. As careful as people try to be with types, type inference can still give you an element type of Any in some cases, e.g. with generators. It's better to deal with those cases explicitly using isempty than to get out a potentially incorrect return type.

This would allow the removal of current sum promotion magic

I don't think that's true at all? This PR concerns only the empty case, plus I'd actually argue that this behavior results in more magic.

JeffBezanson · 2017-02-11T21:01:34Z

type inference can still give you an element type of Any in some cases, e.g. with generators

Which I wish we didn't do! But since we have that behavior, we need to mitigate it by reducing the impact of getting a wider-typed array: Any[1,2] and Int[1,2] should behave similarly wherever possible.

So I'd be fine with this, except I believe it makes some cases of sum type-unstable: for example sum(Integer, Int8[1,2]) gives an Int32, but zero(Integer) gives an Int. So the type would depend on whether the argument is empty. Also, if you pass Any as the type you'll get no method matching zero(::Type{Any}) instead of the usual empty-reduce error message.

nalimilan · 2017-02-12T15:10:54Z

This behavior makes sense to me too. The fact that the input can contain elements of an incorrect type is irrelevant: sum(Int, ...) will either return an Int or fail if conversion isn't possible. So why not extend this to the case where the input array is empty? This is particularly helpful because we rely on inference for empty arrays: specifying the desired return type to sum will protect your from an unexpected eltype of an empty array.

felixrehren · 2017-02-12T18:07:36Z

@JeffBezanson That's interesting, I hadn't thought of that. Where does the Int promotion happen? (I can't follow it here :/) It looks like mapreduce consistently converts its Int8 arguments to Int32, even when the system is 64 bit (here: on juliabox.com / Julia 0.5). I don't know why typeof(zero(Integer)) = Int64, typeof(Integer(zero(Int8))) = Int8, but typeof(sum(Int8[])) == Int32. Could this PR access the same promotion rules, to be compatible with this behaviour?

Of course this type-instability can be avoided by writing if isempty(xs) ... else .... Still, would like to get this type-stable on its own 👍

I agree, the error message is less appropriate when someone writes sum(T,xs) for empty xs and a type T without zero. We would want to add sum(T, itr) to the docstring -- would this be sufficiently informative?

sum(T, itr)

Casts the elements of itr to type T and returns the sum. If itr is empty, returns zero(T).

martinholters · 2017-02-13T08:29:07Z

This needs to be considered together with #20560, as presently e.g. typeof(sum(Int8, [1,2,3])) == Int32. So typeof(sum(Int8, [])) == Int8 would be an unfortunate introduction of type-instability. Furthermore, solving #18423 would allow us to forbid summing an empty collection without need for the isempty boilerplate, and the given neutral element could decide the return type. So I don't think this PR makes much sense at the moment, but could be reconsidered after #20560 is settled.

JeffreySarnoff · 2017-03-23T21:25:23Z

just wondering -- does this suggest prod(T, []) == one(T)?

felixrehren · 2017-03-23T22:11:08Z

Yup!

tpapp · 2019-01-04T09:47:47Z

It would be great to revive this PR, or find a solution of some form. I understand that writing type inferrable code is ideal, but sometimes it is not possible or worth it, then one has to special-case sum anyway. I frequently find doing something like

robustzero(_) = 0.0
robustzero(::Type{T}) where {T <: Real} = zero(T)
isempty(v) ? robustzero(eltype(T)) : sum(v)

or similar, eg with nested AD applications where getting inference to work can be tricky.

JeffreySarnoff · 2019-01-04T10:22:25Z

maybe

Base.sum(::Type{R}, x::AbstractArray) where {R<:Real} = isempty(x) ? sum(zero(R)) : sum(x)

tpapp · 2019-01-04T10:36:48Z

Possibly, but I would prefer to go to the fallback only when x has no known element type that has a zero. Eg

sum(Int, Float64[]) === 0.0
sum(Int, Any[]) === 0

JeffreySarnoff · 2019-01-04T11:00:12Z

Base.sum(::Type{R1},x::Array{R2,1}) where {R1<:Real, R2<:Real} = isempty(x) ? sum(zero(R2)) : sum(x)

Base.sum(::Type{R1},x::Array{R2,1}) where {R1<:Real, R2<:Any} = isempty(x) ? sum(zero(R1)) : sum(x)

julia> sum(Int, [1.0, 2.0])
3.0

julia> sum(Int, Float32[])
0.0f0

julia> sum(Int, [])
0

julia> sum(Float32, [])
0.0f0

tpapp · 2019-01-04T11:05:04Z

Neat. Now if only we could replace Real with "all types for which zero is defined"; but that may be tedious to maintain in the long run.

JeffreySarnoff · 2019-01-04T11:26:44Z

Does this approach meet the desired outcomes mentioned early in this thread?
(particularly #20561 (comment))

Base.sum(::Type{R1},x::Array{R2,1}) where {R1<:Number, R2<:Number} =
    isempty(x) ? sum(zero(R2)) : sum(x)

Base.sum(::Type{R1},x::Array{R2,1}) where {R1<:Number, R2<:Any} =
    isempty(x) ? sum(zero(R1)) : sum(x)

Base.prod(::Type{R1},x::Array{R2,1}) where {R1<:Number, R2<:Number} =
    isempty(x) ? prod(one(R2)) : prod(x)

Base.prod(::Type{R1},x::Array{R2,1}) where {R1<:Number, R2<:Any} =
    isempty(x) ? prod(one(R1)) : prod(x)


julia> typeof(sum(Int8, [])), typeof(sum(Complex{Float32}, []))
(Int64, Complex{Float32})

julia> typeof(sum(Int8, Rational{Int16}[])), typeof(sum(Complex{Float32}, Complex{Float16}[]))
(Rational{Int16}, Complex{Float16})

julia> typeof(sum(Int8, [0.0])), typeof(sum(Complex{Float32}, [0//1]))
(Float64, Rational{Int64})

tkf · 2020-06-12T06:10:49Z

Now that we have init in sum etc. #36188, maybe we can close this PR? In general, it's better to deal with values than types in Julia API.

musm · 2020-12-15T18:41:02Z

closed since we now have #36188

felixrehren added 2 commits February 10, 2017 16:58

typo

078586e

felixrehren mentioned this pull request Feb 10, 2017

Provide method for sum that accepts a neutral element #18423

Closed

tkelman added the needs tests Unit tests are required for this change label Feb 10, 2017

felixrehren mentioned this pull request Feb 13, 2017

stop being clever with reduction result types #20560

Closed

felixrehren mentioned this pull request Feb 14, 2017

RFC reduce promotion: default to system default #20607

Closed

5 tasks

benninkrs mentioned this pull request Jun 5, 2020

Change one to onemult #36103

Open

tkf added the fold sum, maximum, reduce, foldl, etc. label Jun 13, 2020

musm closed this Dec 15, 2020

stevengj mentioned this pull request Jun 29, 2023

fix conversion of empty Dates.CompoundPeriod() to zero units #50259

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

give `sum(T::Type, xs) = zero(T)` when xs is empy #20561

give `sum(T::Type, xs) = zero(T)` when xs is empy #20561

felixrehren commented Feb 10, 2017 •

edited

Loading

ararslan commented Feb 10, 2017

pabloferz commented Feb 10, 2017

oscardssmith commented Feb 10, 2017

felixrehren commented Feb 10, 2017 •

edited

Loading

ararslan commented Feb 10, 2017

JeffBezanson commented Feb 11, 2017

nalimilan commented Feb 12, 2017

felixrehren commented Feb 12, 2017 •

edited

Loading

martinholters commented Feb 13, 2017

JeffreySarnoff commented Mar 23, 2017

felixrehren commented Mar 23, 2017

tpapp commented Jan 4, 2019

JeffreySarnoff commented Jan 4, 2019

tpapp commented Jan 4, 2019

JeffreySarnoff commented Jan 4, 2019 •

edited

Loading

tpapp commented Jan 4, 2019

JeffreySarnoff commented Jan 4, 2019 •

edited

Loading

tkf commented Jun 12, 2020

musm commented Dec 15, 2020

give sum(T::Type, xs) = zero(T) when xs is empy #20561

give sum(T::Type, xs) = zero(T) when xs is empy #20561

Conversation

felixrehren commented Feb 10, 2017 • edited Loading

ararslan commented Feb 10, 2017

pabloferz commented Feb 10, 2017

oscardssmith commented Feb 10, 2017

felixrehren commented Feb 10, 2017 • edited Loading

ararslan commented Feb 10, 2017

JeffBezanson commented Feb 11, 2017

nalimilan commented Feb 12, 2017

felixrehren commented Feb 12, 2017 • edited Loading

martinholters commented Feb 13, 2017

JeffreySarnoff commented Mar 23, 2017

felixrehren commented Mar 23, 2017

tpapp commented Jan 4, 2019

JeffreySarnoff commented Jan 4, 2019

tpapp commented Jan 4, 2019

JeffreySarnoff commented Jan 4, 2019 • edited Loading

tpapp commented Jan 4, 2019

JeffreySarnoff commented Jan 4, 2019 • edited Loading

tkf commented Jun 12, 2020

musm commented Dec 15, 2020

give `sum(T::Type, xs) = zero(T)` when xs is empy #20561

give `sum(T::Type, xs) = zero(T)` when xs is empy #20561

felixrehren commented Feb 10, 2017 •

edited

Loading

felixrehren commented Feb 10, 2017 •

edited

Loading

felixrehren commented Feb 12, 2017 •

edited

Loading

JeffreySarnoff commented Jan 4, 2019 •

edited

Loading

JeffreySarnoff commented Jan 4, 2019 •

edited

Loading