
Commit b8a6b10

improve performance issue of @nospecialize-d keyword func call
This commit improves performance when calling keyword functions whose argument types are not fully known but are `@nospecialize`-d.

The final result looks like this (this particular example is taken from our Julia-level compiler implementation):

```julia
abstract type CallInfo end
struct NoCallInfo <: CallInfo end
struct NewInstruction
    stmt::Any
    type::Any
    info::CallInfo
    line::Union{Int32,Nothing} # if nothing, copy the line from previous statement in the insertion location
    flag::Union{UInt8,Nothing} # if nothing, IR flags will be recomputed on insertion
    function NewInstruction(@nospecialize(stmt), @nospecialize(type), @nospecialize(info::CallInfo),
                            line::Union{Int32,Nothing}, flag::Union{UInt8,Nothing})
        return new(stmt, type, info, line, flag)
    end
end
@nospecialize
function NewInstruction(newinst::NewInstruction;
                        stmt=newinst.stmt,
                        type=newinst.type,
                        info::CallInfo=newinst.info,
                        line::Union{Int32,Nothing}=newinst.line,
                        flag::Union{UInt8,Nothing}=newinst.flag)
    return NewInstruction(stmt, type, info, line, flag)
end
@specialize

using BenchmarkTools
struct VirtualKwargs
    stmt::Any
    type::Any
    info::CallInfo
end
vkws = VirtualKwargs(nothing, Any, NoCallInfo())
newinst = NewInstruction(nothing, Any, NoCallInfo(), nothing, nothing)
runner(newinst, vkws) = NewInstruction(newinst; vkws.stmt, vkws.type, vkws.info)
@benchmark runner($newinst, $vkws)
```

> on master

```
BenchmarkTools.Trial: 10000 samples with 186 evaluations.
 Range (min … max):  559.898 ns … 4.173 μs  ┊ GC (min … max): 0.00% … 85.29%
 Time  (median):     605.608 ns             ┊ GC (median):    0.00%
 Time  (mean ± σ):   638.170 ns ± 125.080 ns  ┊ GC (mean ± σ):  0.06% ± 0.85%

  █▇▂▆▄ ▁█▇▄▂ ▂
  ██████▅██████▇▇▇██████▇▇▇▆▆▅▄▅▄▂▄▄▅▇▆▆▆▆▆▅▆▆▄▄▅▅▄▃▄▄▄▅▃▅▅▆▅▆▆ █
  560 ns        Histogram: log(frequency) by time        1.23 μs <

 Memory estimate: 32 bytes, allocs estimate: 2.
```

> on this commit

```julia
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  3.080 ns … 83.177 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     3.098 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.118 ns ± 0.885 ns   ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▅▇█▆▅▄▂
  ▂▄▆▆▇████████▆▃▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▁▁▂▂▂▁▂▂▂▂▂▂▁▁▂▁▂▂▂▂▂▂▂▂▂ ▃
  3.08 ns        Histogram: frequency by time        3.19 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
```

So for this particular case it achieves a roughly 200x speedup. This is because this commit allows the call to the keyword sorter to be inlined and the `NamedTuple` call to be removed.

Specifically, this commit consists of the following improvements:

- Add an early-return case to `structdiff`: this improves return type inference when the compared `NamedTuple`s are type-unstable but their names do not differ, e.g. given two `NamedTuple{(:a,:b),T} where T<:Tuple{Any,Any}`s. In that case the optimizer removes the `structdiff` call and the subsequent `pairs` call, which lets the keyword sorter be inlined.
- Tweak the core `NamedTuple{names}(args::Tuple)` constructor so that it directly forms a `:splatnew` allocation rather than redirecting to the general `NamedTuple` constructor, which could be confused by an abstract input tuple type.
- Improve the accuracy of `nfields_tfunc` for abstract `NamedTuple` types. This lets `inline_splatnew` handle more abstract `NamedTuple`s, especially those whose names are fully known but whose field tuple type is abstract.

Combined, these improvements allow our SROA pass to optimize away the `NamedTuple` and `tuple` calls generated for keyword argument handling. E.g. the IR for the example `NewInstruction` constructor is now fairly well optimized:

```julia
julia> Base.code_ircode((NewInstruction,Any,Any,CallInfo)) do newinst, stmt, type, info
           NewInstruction(newinst; stmt, type, info)
       end |> only
2 1 ── %1  = Base.getfield(_2, :line)::Union{Nothing, Int32}                             │╻╷ Type##kw
  │    %2  = Base.getfield(_2, :flag)::Union{Nothing, UInt8}                             ││┃ getproperty
  │    %3  = (isa)(%1, Nothing)::Bool                                                    ││
  │    %4  = (isa)(%2, Nothing)::Bool                                                    ││
  │    %5  = (Core.Intrinsics.and_int)(%3, %4)::Bool                                     ││
  └───       goto #3 if not %5                                                           ││
  2 ── %7  = %new(Main.NewInstruction, _3, _4, _5, nothing, nothing)::NewInstruction     NewInstruction
  └───       goto #10                                                                    ││
  3 ── %9  = (isa)(%1, Int32)::Bool                                                      ││
  │    %10 = (isa)(%2, Nothing)::Bool                                                    ││
  │    %11 = (Core.Intrinsics.and_int)(%9, %10)::Bool                                    ││
  └───       goto #5 if not %11                                                          ││
  4 ── %13 = π (%1, Int32)                                                               ││
  │    %14 = %new(Main.NewInstruction, _3, _4, _5, %13, nothing)::NewInstruction         │││╻ NewInstruction
  └───       goto #10                                                                    ││
  5 ── %16 = (isa)(%1, Nothing)::Bool                                                    ││
  │    %17 = (isa)(%2, UInt8)::Bool                                                      ││
  │    %18 = (Core.Intrinsics.and_int)(%16, %17)::Bool                                   ││
  └───       goto #7 if not %18                                                          ││
  6 ── %20 = π (%2, UInt8)                                                               ││
  │    %21 = %new(Main.NewInstruction, _3, _4, _5, nothing, %20)::NewInstruction         │││╻ NewInstruction
  └───       goto #10                                                                    ││
  7 ── %23 = (isa)(%1, Int32)::Bool                                                      ││
  │    %24 = (isa)(%2, UInt8)::Bool                                                      ││
  │    %25 = (Core.Intrinsics.and_int)(%23, %24)::Bool                                   ││
  └───       goto #9 if not %25                                                          ││
  8 ── %27 = π (%1, Int32)                                                               ││
  │    %28 = π (%2, UInt8)                                                               ││
  │    %29 = %new(Main.NewInstruction, _3, _4, _5, %27, %28)::NewInstruction             │││╻ NewInstruction
  └───       goto #10                                                                    ││
  9 ──       Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
  └───       unreachable                                                                 ││
  10 ┄ %33 = φ (#2 => %7, #4 => %14, #6 => %21, #8 => %29)::NewInstruction               ││
  └───       goto #11                                                                    ││
  11 ─       return %33                                                                  │
   => NewInstruction
```
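To make the role of `structdiff` in keyword handling more concrete, here is a minimal hand-written sketch of the leftover-keyword check that a keyword sorter performs. `sorter_sketch` is a hypothetical name used only for illustration; it is not the code that lowering actually generates, and it assumes only that the unexported `Base.structdiff` accepts a `NamedTuple{names}` type as its second argument, as its docstring states.

```julia
# Hypothetical, simplified stand-in for a lowered keyword sorter.
# It supports a single keyword, `line`, and rejects anything else.
function sorter_sketch(kw::NamedTuple, default_line)
    # Use the passed value if present, otherwise fall back to the default.
    line = haskey(kw, :line) ? kw.line : default_line
    # Remove the supported names; anything left over is an unsupported keyword.
    rest = Base.structdiff(kw, NamedTuple{(:line,)})
    isempty(rest) || throw(ArgumentError("unsupported keywords: $(keys(rest))"))
    return line
end

sorter_sketch((line = Int32(3),), nothing)  # every keyword is consumed
sorter_sketch(NamedTuple(), nothing)        # falls back to the default
```

When all passed names are supported, `rest` is the empty named tuple, which is the situation the early return added to `structdiff` is meant to make cheap to infer.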
1 parent df5b081 commit b8a6b10

8 files changed: +97 -11 lines changed

base/boot.jl

+2 -1
@@ -615,7 +615,8 @@ end

 NamedTuple() = NamedTuple{(),Tuple{}}(())

-NamedTuple{names}(args::Tuple) where {names} = NamedTuple{names,typeof(args)}(args)
+eval(Core, :(NamedTuple{names}(args::Tuple) where {names} =
+    $(Expr(:splatnew, :(NamedTuple{names,typeof(args)}), :args))))

 using .Intrinsics: sle_int, add_int
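
As a small illustration of the constructor touched by this hunk, the sketch below shows its user-visible behavior: given only the names, the field types are taken from the runtime type of the argument tuple. The variable names are illustrative only, and the example says nothing about how the allocation is lowered internally.

```julia
# `NamedTuple{names}(args::Tuple)` derives the field types from `typeof(args)`.
args = (1, 2.0)
nt = NamedTuple{(:a, :b)}(args)

# The values may come from an untyped container; the resulting type is still
# concrete because it is computed from the runtime tuple.
vals = Any[1, "two"]
nt2 = NamedTuple{(:a, :b)}((vals[1], vals[2]))

println(typeof(nt))
println(typeof(nt2))
```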

base/compiler/abstractinterpretation.jl

+3 -3
@@ -2109,16 +2109,16 @@ function abstract_eval_statement_expr(interp::AbstractInterpreter, e::Expr, vtyp
     elseif ehead === :splatnew
         t, isexact = instanceof_tfunc(abstract_eval_value(interp, e.args[1], vtypes, sv))
         nothrow = false # TODO: More precision
-        if length(e.args) == 2 && isconcretetype(t) && !ismutabletype(t)
+        if length(e.args) == 2 && isconcretedispatch(t) && !ismutabletype(t)
             at = abstract_eval_value(interp, e.args[2], vtypes, sv)
             n = fieldcount(t)
             if isa(at, Const) && isa(at.val, Tuple) && n == length(at.val::Tuple) &&
                 let t = t, at = at; _all(i->getfield(at.val::Tuple, i) isa fieldtype(t, i), 1:n); end
-                nothrow = isexact && isconcretedispatch(t)
+                nothrow = isexact
                 t = Const(ccall(:jl_new_structt, Any, (Any, Any), t, at.val))
             elseif isa(at, PartialStruct) && at ⊑ᵢ Tuple && n == length(at.fields::Vector{Any}) &&
                 let t = t, at = at; _all(i->(at.fields::Vector{Any})[i] ⊑ᵢ fieldtype(t, i), 1:n); end
-                nothrow = isexact && isconcretedispatch(t)
+                nothrow = isexact
                 t = PartialStruct(t, at.fields::Vector{Any})
             end
         end

base/compiler/ssair/passes.jl

+11 -1
@@ -401,6 +401,16 @@ function lift_leaves(compact::IncrementalCompact,
                 end
                 lift_arg!(compact, leaf, cache_key, def, 1+field, lifted_leaves)
                 continue
+            # NOTE we can enable this, but most `:splatnew` expressions are transformed into
+            #      `:new` expressions by the inlinear
+            # elseif isexpr(def, :splatnew) && length(def.args) == 2 && isa(def.args[2], AnySSAValue)
+            #     tplssa = def.args[2]::AnySSAValue
+            #     tplexpr = compact[tplssa][:inst]
+            #     if is_known_call(tplexpr, tuple, compact) && 1 ≤ field < length(tplexpr.args)
+            #         lift_arg!(compact, tplssa, cache_key, tplexpr, 1+field, lifted_leaves)
+            #         continue
+            #     end
+            #     return nothing
             elseif is_getfield_captures(def, compact)
                 # Walk to new_opaque_closure
                 ocleaf = def.args[2]
@@ -469,7 +479,7 @@ function lift_arg!(
         end
     end
     lifted_leaves[cache_key] = LiftedValue(lifted)
-    nothing
+    return nothing
 end

 function walk_to_def(compact::IncrementalCompact, @nospecialize(leaf))

base/compiler/tfuncs.jl

+17 -3
@@ -403,11 +403,19 @@ add_tfunc(Core.sizeof, 1, 1, sizeof_tfunc, 1)
 function nfields_tfunc(@nospecialize(x))
     isa(x, Const) && return Const(nfields(x.val))
     isa(x, Conditional) && return Const(0)
-    x = unwrap_unionall(widenconst(x))
+    xt = widenconst(x)
+    x = unwrap_unionall(xt)
     isconstType(x) && return Const(nfields(x.parameters[1]))
     if isa(x, DataType) && !isabstracttype(x)
-        if !(x.name === Tuple.name && isvatuple(x)) &&
-           !(x.name === _NAMEDTUPLE_NAME && !isconcretetype(x))
+        if x.name === Tuple.name
+            isvatuple(x) && return Int
+            return Const(length(x.types))
+        elseif x.name === _NAMEDTUPLE_NAME
+            length(x.parameters) == 2 || return Int
+            names = x.parameters[1]
+            isa(names, Tuple{Vararg{Symbol}}) || return nfields_tfunc(rewrap_unionall(x.parameters[2], xt))
+            return Const(length(names))
+        else
             return Const(isdefined(x, :types) ? length(x.types) : length(x.name.names))
         end
     end
@@ -1594,6 +1602,12 @@ function apply_type_tfunc(@nospecialize(headtypetype), @nospecialize args...)
     end
     if istuple
         return Type{<:appl}
+    elseif isa(appl, DataType) && appl.name === _NAMEDTUPLE_NAME && length(appl.parameters) == 2 &&
+           (appl.parameters[1] === () || appl.parameters[2] === Tuple{})
+        # if the first/second parameter of `NamedTuple` is known to be empty,
+        # the second/first argument should also be empty tuple type,
+        # so refine it here
+        return Const(NamedTuple{(),Tuple{}})
     end
     ans = Type{appl}
     for i = length(outervars):-1:1

base/namedtuple.jl

+8 -3
@@ -335,22 +335,27 @@ reverse(nt::NamedTuple) = NamedTuple{reverse(keys(nt))}(reverse(values(nt)))
 end

 """
-    structdiff(a::NamedTuple{an}, b::Union{NamedTuple{bn},Type{NamedTuple{bn}}}) where {an,bn}
+    structdiff(a::NamedTuple, b::Union{NamedTuple,Type{NamedTuple}})

 Construct a copy of named tuple `a`, except with fields that exist in `b` removed.
 `b` can be a named tuple, or a type of the form `NamedTuple{field_names}`.
 """
 function structdiff(a::NamedTuple{an}, b::Union{NamedTuple{bn}, Type{NamedTuple{bn}}}) where {an, bn}
     if @generated
         names = diff_names(an, bn)
+        isempty(names) && return (;) # just a fast pass
         idx = Int[ fieldindex(a, names[n]) for n in 1:length(names) ]
         types = Tuple{Any[ fieldtype(a, idx[n]) for n in 1:length(idx) ]...}
         vals = Any[ :(getfield(a, $(idx[n]))) for n in 1:length(idx) ]
-        :( NamedTuple{$names,$types}(($(vals...),)) )
+        return :( NamedTuple{$names,$types}(($(vals...),)) )
     else
         names = diff_names(an, bn)
+        # N.B this early return is necessary to get a better type stability,
+        # and also allows us to cut off the cost from constructing
+        # potentially type unstable closure passed to the `map` below
+        isempty(names) && return (;)
         types = Tuple{Any[ fieldtype(typeof(a), names[n]) for n in 1:length(names) ]...}
-        NamedTuple{names,types}(map(Fix1(getfield, a), names))
+        return NamedTuple{names,types}(map(n::Symbol->getfield(a, n), names))
     end
 end
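
The behavior described in the `structdiff` docstring can be checked with a short sketch. The named tuple `a` and its field names are made up for illustration; `Base.structdiff` is unexported, so it is called with its module prefix.

```julia
a = (x = 1, y = 2.0, z = "three")

# Remove the fields of `a` whose names also appear in the second argument.
without_y = Base.structdiff(a, (y = 0,))

# The second argument may also be a type of the form `NamedTuple{field_names}`.
# When every name of `a` is listed, nothing is left: this is the case that
# the early return in the hunk above handles.
nothing_left = Base.structdiff(a, NamedTuple{(:x, :y, :z)})

println(without_y)
println(isempty(nothing_left))
```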

test/compiler/inference.jl

+12
@@ -1526,6 +1526,11 @@ end
 @test nfields_tfunc(Tuple{Int, Vararg{Int}}) === Int
 @test nfields_tfunc(Tuple{Int, Integer}) === Const(2)
 @test nfields_tfunc(Union{Tuple{Int, Float64}, Tuple{Int, Int}}) === Const(2)
+@test nfields_tfunc(@NamedTuple{a::Int,b::Integer}) === Const(2)
+@test nfields_tfunc(NamedTuple{(:a,:b),T} where T<:Tuple{Int,Integer}) === Const(2)
+@test nfields_tfunc(NamedTuple{(:a,:b)}) === Const(2)
+@test nfields_tfunc(NamedTuple{names,Tuple{Any,Any}} where names) === Const(2)
+@test nfields_tfunc(Union{NamedTuple{(:a,:b)},NamedTuple{(:c,:d)}}) === Const(2)

 using Core.Compiler: typeof_tfunc
 @test typeof_tfunc(Tuple{Vararg{Int}}) == Type{Tuple{Vararg{Int,N}}} where N
@@ -2336,6 +2341,13 @@ end
 # Equivalence of Const(T.instance) and T for singleton types
 @test Const(nothing) ⊑ Nothing && Nothing ⊑ Const(nothing)

+# `apply_type_tfunc` should always return accurate result for empty NamedTuple case
+import Core: Const
+import Core.Compiler: apply_type_tfunc
+@test apply_type_tfunc(Const(NamedTuple), Const(()), Type{T} where T<:Tuple{}) === Const(typeof((;)))
+@test apply_type_tfunc(Const(NamedTuple), Const(()), Type{T} where T<:Tuple) === Const(typeof((;)))
+@test apply_type_tfunc(Const(NamedTuple), Tuple{Vararg{Symbol}}, Type{Tuple{}}) === Const(typeof((;)))
+
 # Don't pessimize apply_type to anything worse than Type and yield Bottom for invalid Unions
 @test Core.Compiler.return_type(Core.apply_type, Tuple{Type{Union}}) == Type{Union{}}
 @test Core.Compiler.return_type(Core.apply_type, Tuple{Type{Union},Any}) == Type

test/compiler/inline.jl

+43
@@ -1770,3 +1770,46 @@ f_ifelse_3(a, b) = Core.ifelse(a, true, b)
 @test fully_eliminated(f_ifelse_1, Tuple{Any, Any}; retval=Core.Argument(2))
 @test fully_eliminated(f_ifelse_2, Tuple{Any, Any}; retval=Core.Argument(3))
 @test !fully_eliminated(f_ifelse_3, Tuple{Any, Any})
+
+# inline_splatnew for abstract `NamedTuple`
+@eval construct_splatnew(T, fields) = $(Expr(:splatnew, :T, :fields))
+for tt = Any[(Int,Int), (Integer,Integer), (Any,Any)]
+    let src = code_typed1(tt) do a, b
+            construct_splatnew(NamedTuple{(:a,:b),typeof((a,b))}, (a,b))
+        end
+        @test count(issplatnew, src.code) == 0
+        @test count(isnew, src.code) == 1
+    end
+end
+
+# optimize away `NamedTuple`s used for handling `@nospecialize`d keyword-argument
+# https://github.com/JuliaLang/julia/pull/47059
+abstract type CallInfo end
+struct NewInstruction
+    stmt::Any
+    type::Any
+    info::CallInfo
+    line::Int32
+    flag::UInt8
+    function NewInstruction(@nospecialize(stmt), @nospecialize(type), @nospecialize(info::CallInfo),
+                            line::Int32, flag::UInt8)
+        return new(stmt, type, info, line, flag)
+    end
+end
+@nospecialize
+function NewInstruction(newinst::NewInstruction;
+    stmt=newinst.stmt,
+    type=newinst.type,
+    info::CallInfo=newinst.info,
+    line::Int32=newinst.line,
+    flag::UInt8=newinst.flag)
+    return NewInstruction(stmt, type, info, line, flag)
+end
+@specialize
+let src = code_typed1((NewInstruction,Any,Any,CallInfo)) do newinst, stmt, type, info
+        NewInstruction(newinst; stmt, type, info)
+    end
+    @test count(issplatnew, src.code) == 0
+    @test count(iscall((src,NamedTuple)), src.code) == 0
+    @test count(isnew, src.code) == 1
+end

test/compiler/irutils.jl

+1
@@ -8,6 +8,7 @@ get_code(args...; kwargs...) = code_typed1(args...; kwargs...).code

 # check if `x` is a statement with a given `head`
 isnew(@nospecialize x) = isexpr(x, :new)
+issplatnew(@nospecialize x) = isexpr(x, :splatnew)
 isreturn(@nospecialize x) = isa(x, ReturnNode)

 # check if `x` is a dynamic call of a given function
