skip inferring calls that lead to throw #35982

Merged Jul 10, 2020 (1 commit)
Conversation

@JeffBezanson (Member) commented May 22, 2020

This adds a simple linear pass that finds all IR statements obviously followed by a call to throw, and then skips inferring generic calls in those slots. Some numbers:

| test | master | PR | PR (fixed) |
|---|---|---|---|
| Base | 38.4 | 35.4 | 35.8 |
| Stdlibs | 53.8 | 52.8 | 53.3 |
| precompile | 148.3 | 145.9 | 146.3 |
| sys.so | 149047528 | 143333400 | 145332832 |
| TTFP | 13.4 | 12.7 | 12.8 |
| plot after SIMD | 10.9 | 9.9 | 10.1 |
| plot after DataFrames | 13.6 | 11.5 | 11.9 |
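As a rough illustration of the pass (a toy model, not the actual compiler code), a single reverse scan can mark every statement that falls straight through into a `throw`:

```julia
# Toy model of the linear pass (not the actual compiler code): statements
# are plain symbols here; :throw raises, and :branch stands in for a
# control-flow edge that breaks straight-line fallthrough into the throw.
function throw_block_mask(stmts::Vector{Symbol})
    mask = falses(length(stmts))
    for i in length(stmts):-1:1
        if stmts[i] === :throw
            mask[i] = true
        elseif i < length(stmts) && mask[i + 1] && stmts[i] !== :branch
            # straight-line predecessor of a throw-bound statement,
            # so it is also on the error path
            mask[i] = true
        end
    end
    return mask
end

throw_block_mask([:call, :branch, :string, :call, :throw])
# → Bool[0, 0, 1, 1, 1]: the error-message construction and the final
#   call are on the error path, so inference can skip generic calls there.
```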

@JeffBezanson added the `latency` label on May 22, 2020
@vchuravy (Member)

cc: @maleadt for possible GPU consequences.

Is this only triggered if one of the types in the signature is not concrete? E.g., if we would have added an `apply_generic` call anyway?

@JeffBezanson (Member, Author)

No, it skips them unconditionally. I can try that, but I suspect anything less will not give much benefit (since we need to remove backedges to printing code). This is somewhat experimental, because we haven't done many things like this before (flagrantly not inferring some code), but I believe it will be necessary to make a dent in the latency situation.

The best way forward for GPUs etc. might be to run inference with this change disabled. @Keno et al. have been working on the compiler refactoring needed for that.

@JeffBezanson (Member, Author)

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson (Member, Author)

Ah, of course, the un-inferred callsites in many cases were blocking inlining. I decided to give those statements a flat cost. Added a column to the table above for the new numbers.

@JeffBezanson (Member, Author)

@nanosoldier runbenchmarks(ALL, vs = ":master")

@timholy (Member) commented May 23, 2020

This looks like it would replace #30222, which I never got around to finishing. I think a fixed cost of 20 is pretty reasonable; it will still affect the inlining decision to some extent, but it will put an end to the current disincentive for crafting a careful error message that reports helpful state information back to the user. That's a win even without the very nice improvements in latency and system image size.

#30222 looks like it had a couple of other changes to the cost model; these may have long since been rendered unnecessary, but it might be worth a glance before you close it.
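The effect of a flat cost on the inlining decision can be illustrated with a toy budget model (the threshold, penalty value, and helper names here are hypothetical, not the compiler's actual parameters):

```julia
# Hypothetical budget model: a method body is inlined when its summed
# statement costs stay under a budget. An un-inferred call on the error
# path used to incur the full nonleaf penalty; a flat cost of 20 keeps a
# carefully crafted error message from disqualifying the whole body.
const INLINE_BUDGET = 100        # hypothetical inlining threshold
const NONLEAF_PENALTY = 1000     # hypothetical penalty for un-inferred calls

should_inline(costs) = sum(costs) <= INLINE_BUDGET

hot_path = [1, 4, 1]                         # cheap, concrete statements
old_cost = vcat(hot_path, NONLEAF_PENALTY)   # error-path call, old model
new_cost = vcat(hot_path, 20)                # error-path call, flat cost

should_inline(old_cost)  # false: the error branch blocks inlining
should_inline(new_cost)  # true: the hot path still inlines
```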

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson (Member, Author)

Thanks for reminding me about that PR. You were right that this is really more of a reverse-flow analysis; I could use that to add some extra precision for branches. Adding more accurate costs for certain known C functions is also a good idea.

@timholy (Member) commented May 23, 2020

Feel free to steal wantonly from that PR, I got stymied and didn't know what to do.

@JeffBezanson (Member, Author)

Ok I think this is in good shape now. The analysis should definitely be scrutinized first; it's simple but subtle.

```diff
@@ -319,7 +319,7 @@ function statement_cost(ex::Expr, line::Int, src::CodeInfo, sptypes::Vector{Any}
         return 0
     elseif (f === Main.Core.arrayref || f === Main.Core.const_arrayref) && length(ex.args) >= 3
         atyp = argextype(ex.args[3], src, sptypes, slottypes)
-        return isknowntype(atyp) ? 4 : params.inline_nonleaf_penalty
+        return isknowntype(atyp) ? 4 : error_path ? 20 : params.inline_nonleaf_penalty
```
Review comment (Member):

Maybe this 20 could be a `params.error_path_penalty`?

@timholy (Member) left a review comment:

Looks great!

@JeffBezanson force-pushed the jb/throwinfer branch 2 times, most recently from fbcbaec to 2c33c18 on June 12, 2020 20:44
@JeffBezanson (Member, Author)

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson (Member, Author)

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson JeffBezanson merged commit 8320fcc into master Jul 10, 2020
@JeffBezanson JeffBezanson deleted the jb/throwinfer branch July 10, 2020 23:12
simeonschaub pushed a commit to simeonschaub/julia that referenced this pull request Aug 11, 2020
aviatesk added a commit that referenced this pull request Apr 3, 2023
The deoptimization can sometimes destroy the effects analysis and
disable [semi-]concrete evaluation that is otherwise possible.
This is because the deoptimization was designed with the type domain
profitability in mind (#35982), and it has not been aware of the effects
domain very well.

This commit makes the deoptimization more aware of the effects domain
and enables the `throw` block deoptimization only when the effects
are already known to be ineligible for concrete evaluation.

In our current effect system, `ALWAYS_FALSE`/`false` means that the
effect cannot be refined to `ALWAYS_TRUE`/`true` anymore (unless a
user annotation is given later). Therefore we can enable the `throw` block
deoptimization without hindering the chance of concrete evaluation when
any of the following conditions are met:
- `effects.consistent === ALWAYS_FALSE`
- `effects.effect_free === ALWAYS_FALSE`
- `effects.terminates`
- `effects.nonoverlayed`

Here are some numbers:

| Metric                  | master    | this commit | #35982 reverted (set `unoptimize_throw_blocks=false`) |
|-------------------------|-----------|-------------|--------------------------------------------|
| Base (seconds)          | 15.579300 | 15.206645   | 15.296319                                  |
| Stdlibs (seconds)       | 17.919013 | 17.667094   | 17.738128                                  |
| Total (seconds)         | 33.499279 | 32.874737   | 33.035448                                  |
| Precompilation (seconds) | 49.967516 | 49.421121  | 49.999998                                  |
| First time `plot(rand(10,3))` [^1] | `2.476678 seconds (11.74 M allocations)` | `2.430355 seconds (11.77 M allocations)` | `2.514874 seconds (11.64 M allocations)` |

[^1]: I got these numbers by disabling all the `@precompile_all_calls` statements in Plots.jl.

These numbers made me question if we are getting any actual benefit from
the `throw` block deoptimization anymore. Since it is sometimes harmful
for the effects analysis, we probably want to either merge this commit
or remove the `throw` block deoptimization completely.
aviatesk added a commit that referenced this pull request Sep 26, 2023
…49235)

The deoptimization can sometimes destroy the effects analysis and
disable [semi-]concrete evaluation that is otherwise possible. This is
because the deoptimization was designed with type-domain
profitability in mind (#35982) and has not adequately considered
the effects domain.

This commit makes the deoptimization more aware of the effects domain
and enables the `throw` block deoptimization only when the effects
are already known to be ineligible for concrete evaluation.

In our current effect system, `ALWAYS_FALSE`/`false` means that the
effect cannot be refined to `ALWAYS_TRUE`/`true` anymore (unless a
user annotation is given later). Therefore we can enable the `throw` block
deoptimization without hindering the chance of concrete evaluation when
any of the following conditions are met:
- `effects.consistent === ALWAYS_FALSE`
- `effects.effect_free === ALWAYS_FALSE`
- `effects.terminates === false`
- `effects.nonoverlayed === false`

Here are some numbers:

| Metric | master | this commit | #35982 reverted (set `unoptimize_throw_blocks=false`) |
|-------------------------|-----------|-------------|--------------------------------------------|
| Base (seconds) | 15.579300 | 15.206645 | 15.296319 |
| Stdlibs (seconds) | 17.919013 | 17.667094 | 17.738128 |
| Total (seconds) | 33.499279 | 32.874737 | 33.035448 |
| Precompilation (seconds) | 49.967516 | 49.421121 | 49.999998 |
| First time `plot(rand(10,3))` [^1] | `2.476678 seconds (11.74 M allocations)` | `2.430355 seconds (11.77 M allocations)` | `2.514874 seconds (11.64 M allocations)` |
| First time `solve(prob, QNDF())(5.0)` [^2] | `4.469492 seconds (15.32 M allocations)` | `4.499217 seconds (15.41 M allocations)` | `4.470772 seconds (15.38 M allocations)` |

[^1]: With precompilation of Plots.jl disabled.
[^2]: With precompilation of OrdinaryDiffEq disabled.

These numbers made me question if we are getting any actual benefit from
the `throw` block deoptimization anymore. Since it is sometimes harmful
for the effects analysis, we probably want to either merge this commit
or remove the `throw` block deoptimization completely.
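The eligibility condition listed in the commit message above reduces to a simple disjunction. A minimal sketch, with field names following the commit message but a simplified struct (plain booleans standing in for the real tri-state lattice values):

```julia
# Simplified model of the check: the throw-block deoptimization is enabled
# only when the effects are already known to be ineligible for concrete
# evaluation. `true` stands in for ALWAYS_TRUE, `false` for ALWAYS_FALSE.
struct Effects
    consistent::Bool
    effect_free::Bool
    terminates::Bool
    nonoverlayed::Bool
end

concrete_eval_ineligible(e::Effects) =
    !e.consistent || !e.effect_free || !e.terminates || !e.nonoverlayed

# A fully clean method must keep its throw blocks inferred:
concrete_eval_ineligible(Effects(true, true, true, true))   # false
# Any already-tainted effect makes the deoptimization safe to apply:
concrete_eval_ineligible(Effects(true, false, true, true))  # true
```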