-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always inline query_get_at
.
#137695
Always inline query_get_at
.
#137695
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
…at, r=<try> Always inline `query_get_at`. r? `@ghost`
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (6c9c69b): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary -2.2%, secondary -0.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary -2.1%, secondary -0.6%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 770.979s -> 780.944s (1.29%) |
Pretty good perf results, though the 0.862% artifact size increase isn't so nice. Let's ask someone with opinions about r? @saethlin |
This comment was marked as duplicate.
This comment was marked as duplicate.
I'm not completely opposed to this, but also I'm a bit confused by the fact that we need this attribute and all the other So I wonder if the type erasure scheme that's used in the query system is combining code paths so that PGO can't figure things out. I think this inlining is primarily profitable for the few queries that make up the majority of our execution time, but I suspect those are not separable. Thoughts? |
The ones that are called most often, yes, almost certainly. No idea about the type erasure stuff, I don't really understand what that is doing or what it's for. |
Hm. The fact that neither of use know about that is rather concerning, because I think based on this it is rather perf-relevant. |
I'm okay with this, but mostly because the usual downsides of the attribute don't really apply to the compiler; unoptimized builds don't really matter and our code size is already quite spectacular. I'd really like us to have a better way to understand what inline attributes matter with our LTO+PGO setup, because for sure people (including me) often think that a @bors r+ rollup=never |
☀️ Test successful - checks-actions |
A job failed! Check out the build log: (web) (plain) Click to see the possible cause of the failure (guessed by this bot)
|
Finished benchmarking commit (2b4694a): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary 2.4%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary -1.7%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 770.102s -> 779.777s (1.26%) |
Many more wins than regressions. @rustbot label: +perf-regression-triaged |
If I understand correctly, you talk about code that was introduced in #108638, to reduce Speaking of which, this PR adds 9s to bootstrap. Recently another PR (#136731) regressed that time by 7s and it was rejected and reverted in #138092. The trade-off here seems clearer, so I'm just noting this to make sure it's not just an oversight. |
Good find! That PR only regressed bootstrap times and did nothing else, while this also has clear compile-time wins, so I don't think that it's in the same category. The binary size increases are a bit more concerning to me, tbh. But I haven't found it so terrible. |
r? @saethlin