broadcasting not type stable #140

Open
dlfivefifty opened this issue Nov 24, 2020 · 1 comment

@dlfivefifty
Member

The broadcast code is not type stable; see below. The code seems quite complicated, with generated functions, so I think the best approach would be to write special-case code for the vector and matrix cases.

julia> u = BlockArray(randn(5), [2,3]);

julia> @code_warntype copyto!(similar(u), Base.broadcasted(exp, u))
Variables
  #self#::Core.Compiler.Const(copyto!, false)
  dest::BlockArray{Float64,1,Array{Array{Float64,1},1},Tuple{BlockedUnitRange{Array{Int64,1}}}}
  bc::Base.Broadcast.Broadcasted{BlockArrays.BlockStyle{1},Nothing,typeof(exp),Tuple{BlockArray{Float64,1,Array{Array{Float64,1},1},Tuple{BlockedUnitRange{Array{Int64,1}}}}}}
  bs::Tuple{BlockedUnitRange{Array{Int64,1}}}
  @_5::Union{Nothing, Tuple{Union{Tuple{}, Tuple{Any,Vararg{Any,N} where N}},Union{Tuple{}, Tuple{Any,Vararg{Any,N} where N}}}}
  blockindexrange_0_1::Any
  blockindexrange_1_1::Any
  @_8::Int64

Body::BlockArray{Float64,1,Array{Array{Float64,1},1},Tuple{BlockedUnitRange{Array{Int64,1}}}}
1 ─       Core.NewvarNode(:(@_5))
│         (bs = BlockArrays.axes(bc))
│   %3  = BlockArrays.axes(dest)::Tuple{BlockedUnitRange{Array{Int64,1}}}
│   %4  = BlockArrays.blockisequal(%3, bs)::Bool
│   %5  = !%4::Bool
└──       goto #3 if not %5
2 ─ %7  = BlockArrays.PseudoBlockArray(dest, bs)::PseudoBlockArray{Float64,1,BlockArray{Float64,1,Array{Array{Float64,1},1},Tuple{BlockedUnitRange{Array{Int64,1}}}},Tuple{BlockedUnitRange{Array{Int64,1}}}}
│         BlockArrays.copyto!(%7, bc)
└──       return dest
3 ─ %10 = BlockArrays.subblocks(dest, bs, 1)::Union{Base.Generator{BlockRange{1,Tuple{Base.OneTo{Int64}}},BlockArrays.var"#42#43"}, BlockArrays.SubBlockIterator}
│   %11 = Core.tuple(%10)::Tuple{Union{Base.Generator{BlockRange{1,Tuple{Base.OneTo{Int64}}},BlockArrays.var"#42#43"}, BlockArrays.SubBlockIterator}}
│   %12 = Base.getproperty(bc, :args)::Tuple{BlockArray{Float64,1,Array{Array{Float64,1},1},Tuple{BlockedUnitRange{Array{Int64,1}}}}}
│   %13 = BlockArrays.Ref(bs)::Base.RefValue{Tuple{BlockedUnitRange{Array{Int64,1}}}}
│   %14 = BlockArrays.Ref(1)::Base.RefValue{Int64}
│   %15 = Base.broadcasted(BlockArrays.subblocks, %12, %13, %14)::Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple},Nothing,typeof(BlockArrays.subblocks),Tuple{Tuple{BlockArray{Float64,1,Array{Array{Float64,1},1},Tuple{BlockedUnitRange{Array{Int64,1}}}}},Base.RefValue{Tuple{BlockedUnitRange{Array{Int64,1}}}},Base.RefValue{Int64}}}
│   %16 = Base.materialize(%15)::Tuple{Union{Base.Generator{BlockRange{1,Tuple{Base.OneTo{Int64}}},BlockArrays.var"#42#43"}, BlockArrays.SubBlockIterator}}
│   %17 = Core._apply_iterate(Base.iterate, BlockArrays.zip, %11, %16)::Base.Iterators.Zip
│         (@_5 = Base.iterate(%17))
│   %19 = (@_5 === nothing)::Bool
│   %20 = Base.not_int(%19)::Bool
└──       goto #6 if not %20
4 ─ %22 = @_5::Tuple{Union{Tuple{}, Tuple{Any,Vararg{Any,N} where N}},Union{Tuple{}, Tuple{Any,Vararg{Any,N} where N}}}::Tuple{Union{Tuple{}, Tuple{Any,Vararg{Any,N} where N}},Union{Tuple{}, Tuple{Any,Vararg{Any,N} where N}}}
│   %23 = Core.getfield(%22, 1)::Union{Tuple{}, Tuple{Any,Vararg{Any,N} where N}}
│   %24 = Base.indexed_iterate(%23, 1)::Core.Compiler.PartialStruct(Tuple{Any,Int64}, Any[Any, Core.Compiler.Const(2, false)])
│         (blockindexrange_0_1 = Core.getfield(%24, 1))
│         (@_8 = Core.getfield(%24, 2))
│   %27 = Base.indexed_iterate(%23, 2, @_8::Core.Compiler.Const(2, false))::Core.Compiler.PartialStruct(Tuple{Any,Int64}, Any[Any, Core.Compiler.Const(3, false)])
│         (blockindexrange_1_1 = Core.getfield(%27, 1))
│   %29 = Core.getfield(%22, 2)::Union{Tuple{}, Tuple{Any,Vararg{Any,N} where N}}
│   %30 = Base.getproperty(bc, :f)::Core.Compiler.Const(exp, false)
│   %31 = BlockArrays._bview(dest, blockindexrange_0_1)::Any
│   %32 = Base.getproperty(bc, :args)::Tuple{BlockArray{Float64,1,Array{Array{Float64,1},1},Tuple{BlockedUnitRange{Array{Int64,1}}}}}
│   %33 = Base.getindex(%32, 1)::BlockArray{Float64,1,Array{Array{Float64,1},1},Tuple{BlockedUnitRange{Array{Int64,1}}}}
│   %34 = BlockArrays._bview(%33, blockindexrange_1_1)::Any
│         BlockArrays.broadcast!(%30, %31, %34)
│         (@_5 = Base.iterate(%17, %29))
│   %37 = (@_5 === nothing)::Bool
│   %38 = Base.not_int(%37)::Bool
└──       goto #6 if not %38
5 ─       goto #4
6 ─       return dest

@dlfivefifty
Member Author

I made a quick-and-dirty type-stable vector version that is roughly 4x faster:

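        function fastbroadcast!(dest, bc)
            # Vector case only: loop over the blocks along the first axis.
            @inbounds for K in blockaxes(bc)[1]
                # Index range within block K; see the note below on why this is used instead of K.
                KI = K[1:Int(K)]
                broadcast!(bc.f, view(dest,KI), __bview(bc.args, KI)...)
            end
            dest
        end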

Note that KI = K[1:Int(K)] should in theory be completely unnecessary. However, in my case, without it (that is, just using view(.., K)) it's 20x slower. This is because in my test problem view(bc.args[j], KI) returns a Range, whereas view(bc.args[j], K) returns a view, which adds extra computation. #138 provides an alternative solution to simplify this.
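
For readers who want to try the block-by-block idea on the example from the top of this issue, here is a minimal self-contained sketch. It is not the code used for the timings above: blockwise_map! is a hypothetical helper name, it handles only a single source vector, and it assumes dest and src have identical block structure. It uses view(A, K) with a plain Block rather than the K[1:Int(K)] workaround.

```julia
using BlockArrays

# Hypothetical helper (not part of BlockArrays): apply `f` to a blocked vector
# one block at a time, assuming `dest` and `src` share the same block structure.
function blockwise_map!(f, dest::AbstractVector, src::AbstractVector)
    BlockArrays.blockisequal(axes(dest), axes(src)) ||
        throw(ArgumentError("dest and src must have the same block structure"))
    for K in blockaxes(src, 1)   # K is a Block{1}
        # view(A, K) selects block K of a blocked vector
        broadcast!(f, view(dest, K), view(src, K))
    end
    return dest
end

u = BlockArray(randn(5), [2, 3])
v = blockwise_map!(exp, similar(u), u)
```

Running @code_warntype on blockwise_map!(exp, similar(u), u) is one way to check whether the loop body infers concretely on a given Julia and BlockArrays version.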
