Skip to content

Made WeightVec a subtype of RealVector #248

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 24, 2017
Merged
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 23 additions & 6 deletions src/weights.jl
Original file line number Diff line number Diff line change
@@ -1,19 +1,32 @@

###### Weight vector #####

immutable WeightVec{W,Vec<:RealVector}
values::Vec
sum::W
if VERSION < v"0.6.0-dev.2123"
immutable WeightVec{S<:Real, T<:Real, V<:RealVector} <: RealVector{T}
values::V
sum::S
end
else
immutable WeightVec{S<:Real, T<:Real, V<:AbstractVector{T}} <: AbstractVector{T}
values::V
sum::S
end
end

"""
WeightVec(vs, [wsum])
function WeightVec{S<:Real, V<:RealVector}(vs::V, s::S)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is incorrect, the signature should just reflect how the user can call the function, not how it's implemented.


Construct a `WeightVec` with weight values `vs` and sum of weights `wsum`.
If omitted, `wsum` is computed.
"""
WeightVec{Vec<:RealVector,W<:Real}(vs::Vec,wsum::W) = WeightVec{W,Vec}(vs, wsum)
WeightVec(vs::RealVector) = WeightVec(vs, sum(vs))
function WeightVec{S<:Real, V<:RealVector}(vs::V, s::S)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can actually use s::S=sum(vs) to get rid of the next method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I didn't realize that was valid syntax in 0.5 (ie: referencing an earlier argument when setting the default value).

return WeightVec{S, eltype(vs), V}(vs, s)
end

function WeightVec{V<:RealVector}(vs::V)
s = sum(vs)
return WeightVec{typeof(s), eltype(vs), V}(vs, s)
end

"""
weights(vs)
@@ -30,6 +43,7 @@ sum(wv::WeightVec) = wv.sum
isempty(wv::WeightVec) = isempty(wv.values)

Base.getindex(wv::WeightVec, i) = getindex(wv.values, i)
Base.size(wv::WeightVec) = size(wv.values)


##### Weighted sum #####
@@ -281,6 +295,9 @@ Base.mean{T<:Number,W<:Real}(A::AbstractArray{T}, w::WeightVec{W}, dim::Int) =


###### Weighted median #####
function Base.median(v::AbstractArray, w::WeightVec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this method be unnecessary since all you're doing is throwing a MethodError? Presumably without the method, it would also be a MethodError.

Copy link
Member Author

@rofinn rofinn Apr 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to added this because this test was failing with a BoundsError instead. It appears that making WeightVec a RealVector allowed it to dispatch to a method in base.

julia> median([4 3 2 1 0], weights(wt))
ERROR: BoundsError: attempt to access 2-element Array{Any,1} at index [3]
 in mapslices(::Base.#median!, ::Array{Int64,2}, ::StatsBase.WeightVec{Int64,Array{Int64,1},Int64}) at ./abstractarray.jl:1619
 in median(::Array{Int64,2}, ::StatsBase.WeightVec{Int64,Array{Int64,1},Int64}) at ./statistics.jl:579```

I figured it was better to manually throw a `MethodError` rather than letting it hit the `BoundsError` in that case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. This is fine then. Thanks for clarifying.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, the current behavior isn't that different:

julia> median([4 3 2 1 0], weights([1,2,3,4,5]))
ERROR: MethodError: no method matching start(::StatsBase.WeightVec{Int64,Array{Int64,1}})
Closest candidates are:
  start(::SimpleVector) at essentials.jl:259
  start(::Base.MethodList) at reflection.jl:560
  start(::ExponentialBackOff) at error.jl:107
  ...
Stacktrace:
 [1] append_any(::StatsBase.WeightVec{Int64,Array{Int64,1}}, ::Vararg{StatsBase.WeightVec{Int64,Array{Int64,1}},N} where N) at ./essentials.jl:170
 [2] median(::Array{Int64,2}, ::StatsBase.WeightVec{Int64,Array{Int64,1}}) at ./statistics.jl:619

So I'm not sure this MethodError is really needed. We typically don't throw them for all possible cases that fail.

throw(MethodError(median, (typeof(v), typeof(w))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hadn't noticed that one. Should pass (v, w) instead of their types. Otherwise, looks good to me.

end

function Base.median{W<:Real}(v::RealVector, w::WeightVec{W})
isempty(v) && error("median of an empty array is undefined")
9 changes: 8 additions & 1 deletion test/weights.jl
Original file line number Diff line number Diff line change
@@ -5,10 +5,12 @@ import Compat: view

@test isa(weights([1, 2, 3]), WeightVec{Int})
@test isa(weights([1., 2., 3.]), WeightVec{Float64})

@test isa(weights([1 2 3; 4 5 6]), WeightVec{Int})

@test isa(WeightVec([1, 2, 3], 6), WeightVec{Int})

@test isempty(weights(Float64[]))
@test size(weights([1, 2, 3])) == (3,)

w = [1., 2., 3.]
wv = weights(w)
@@ -26,6 +28,11 @@ bv = weights(b)
@test sum(bv) === 3
@test !isempty(bv)

ba = BitArray([true, false, true])
sa = sparsevec([1., 0., 2.])

@test sum(ba, wv) === 4.0
@test sum(sa, wv) === 7.0

## wsum