Always inline query_get_at. #137695

Merged
merged 1 commit into from
Mar 10, 2025

Conversation

nnethercote
Contributor

@nnethercote nnethercote commented Feb 27, 2025

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 27, 2025
@nnethercote
Contributor Author

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 27, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 27, 2025
…at, r=<try>

Always inline `query_get_at`.

r? `@ghost`
@bors
Contributor

bors commented Feb 27, 2025

⌛ Trying commit cc78386 with merge 6c9c69b...

@bors
Contributor

bors commented Feb 27, 2025

☀️ Try build successful - checks-actions
Build commit: 6c9c69b (6c9c69b41e19a0ee2f9d5b6dd8f67b7b55839ab0)


@rust-timer
Collaborator

Finished benchmarking commit (6c9c69b): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.2%, 0.3%] | 3 |
| Regressions ❌ (secondary) | 1.1% | [0.3%, 1.8%] | 12 |
| Improvements ✅ (primary) | -0.5% | [-1.5%, -0.2%] | 79 |
| Improvements ✅ (secondary) | -0.7% | [-1.2%, -0.1%] | 56 |
| All ❌✅ (primary) | -0.5% | [-1.5%, 0.3%] | 82 |

Max RSS (memory usage)

Results (primary -2.2%, secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.8% | [2.1%, 5.5%] | 2 |
| Improvements ✅ (primary) | -2.2% | [-2.2%, -2.2%] | 1 |
| Improvements ✅ (secondary) | -2.6% | [-2.9%, -2.1%] | 3 |
| All ❌✅ (primary) | -2.2% | [-2.2%, -2.2%] | 1 |

Cycles

Results (primary -2.1%, secondary -0.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.1% | [3.7%, 4.5%] | 3 |
| Improvements ✅ (primary) | -2.1% | [-2.2%, -2.0%] | 3 |
| Improvements ✅ (secondary) | -2.3% | [-2.9%, -2.0%] | 8 |
| All ❌✅ (primary) | -2.1% | [-2.2%, -2.0%] | 3 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 770.979s -> 780.944s (1.29%)
Artifact size: 361.96 MiB -> 365.08 MiB (0.86%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 27, 2025
@nnethercote
Contributor Author

Pretty good perf results, though the 0.862% artifact size increase isn't so nice. Let's ask someone with opinions about inline(always) for review :)
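
For reference, the percentages in the report can be reproduced from the raw figures with a relative-change calculation. The sketch below is only an illustration; `percent_change` is a hypothetical helper, not part of rustc-perf.

```rust
/// Relative change from `before` to `after`, expressed in percent.
/// Hypothetical helper for illustration only.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Figures copied from the try-run report: artifact size in MiB,
    // bootstrap time in seconds.
    let artifact = percent_change(361.96, 365.08);
    let bootstrap = percent_change(770.979, 780.944);
    println!("artifact size: {artifact:.3}%");
    println!("bootstrap: {bootstrap:.2}%");
}
```

With these inputs the artifact size change comes out at roughly 0.862% and the bootstrap change at roughly 1.29%, matching the report.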

r? @saethlin

@nnethercote nnethercote marked this pull request as ready for review February 27, 2025 07:13

@saethlin
Member

saethlin commented Mar 4, 2025

I'm not completely opposed to this, but also I'm a bit confused by the fact that we need this attribute and all the other inline(always) in this code path. Yeah the functions in question are a bit big, but we have PGO and LTO so surely those should be able to determine that a function like this is a good inlining candidate.

So I wonder if the type erasure scheme that's used in the query system is combining code paths so that PGO can't figure things out. I think this inlining is primarily profitable for the few queries that make up the majority of our execution time, but I suspect those are not separable. Thoughts?
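
To make the pattern under discussion concrete, here is a minimal sketch of a cache-lookup wrapper marked `#[inline(always)]`, with the miss path kept out of line. It is not rustc's actual `query_get_at`: the names `Cache`, `get_at`, `compute_and_store`, and `square` are hypothetical, and the `#[cold]`/`#[inline(never)]` split is an assumption made for illustration.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical stand-in for a per-query cache; not rustc's real type.
struct Cache {
    map: RefCell<HashMap<u32, u64>>,
}

/// Fast path: return the cached value, or fall back to `execute` on a miss.
/// `#[inline(always)]` is a strong request to copy this body into each caller,
/// so a cache hit does not pay for a function call.
#[inline(always)]
fn get_at(cache: &Cache, execute: fn(u32) -> u64, key: u32) -> u64 {
    // The shared borrow ends at the end of this statement.
    let cached = cache.map.borrow().get(&key).copied();
    match cached {
        Some(value) => value,
        None => compute_and_store(cache, execute, key),
    }
}

/// Slow path: kept out of line so the inlined fast path stays small.
#[cold]
#[inline(never)]
fn compute_and_store(cache: &Cache, execute: fn(u32) -> u64, key: u32) -> u64 {
    let value = execute(key);
    cache.map.borrow_mut().insert(key, value);
    value
}

/// Hypothetical "query provider" used only for the demo.
fn square(key: u32) -> u64 {
    u64::from(key) * u64::from(key)
}

fn main() {
    let cache = Cache { map: RefCell::new(HashMap::new()) };
    let first = get_at(&cache, square, 12); // miss: computes and stores
    let second = get_at(&cache, square, 12); // hit: served from the cache
    println!("{first} {second}");
}
```

Whether an attribute like this beats what LTO and PGO would choose on their own is the open question raised here; the sketch only shows the shape of the code, not a measured result.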

@nnethercote
Contributor Author

> I think this inlining is primarily profitable for the few queries that make up the majority of our execution time

The ones that are called most often, yes, almost certainly.

No idea about the type erasure stuff, I don't really understand what that is doing or what it's for.

@saethlin
Member

saethlin commented Mar 8, 2025

> No idea about the type erasure stuff, I don't really understand what that is doing or what it's for.

Hm. The fact that neither of us knows about that is rather concerning, because based on this I think it is rather perf-relevant.

@saethlin
Member

saethlin commented Mar 8, 2025

I'm okay with this, but mostly because the usual downsides of the attribute don't really apply to the compiler; unoptimized builds don't really matter and our code size is already quite spectacular. I'd really like us to have a better way to understand which inline attributes matter with our LTO+PGO setup, because for sure people (including me) often conclude from local builds that an #[inline] is merited when, in dist builds, it isn't.

@bors r+ rollup=never

@bors
Contributor

bors commented Mar 8, 2025

📌 Commit cc78386 has been approved by saethlin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 8, 2025
@bors
Contributor

bors commented Mar 9, 2025

⌛ Testing commit cc78386 with merge 2b4694a...

@bors
Contributor

bors commented Mar 10, 2025

☀️ Test successful - checks-actions
Approved by: saethlin
Pushing 2b4694a to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 10, 2025
@bors bors merged commit 2b4694a into rust-lang:master Mar 10, 2025
7 checks passed
@rustbot rustbot added this to the 1.87.0 milestone Mar 10, 2025
@rust-log-analyzer
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

gh pr comment ${HEAD_PR} -F output.log
shell: /usr/bin/bash -e {0}
##[endgroup]
fatal: ambiguous argument 'HEAD^1': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
##[error]Process completed with exit code 128.
Post job cleanup.

@nnethercote nnethercote deleted the always-inline-query_get_at branch March 10, 2025 02:29
@rust-timer
Collaborator

Finished benchmarking commit (2b4694a): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.2%, 0.3%] | 2 |
| Regressions ❌ (secondary) | 0.9% | [0.2%, 1.5%] | 9 |
| Improvements ✅ (primary) | -0.5% | [-1.4%, -0.2%] | 61 |
| Improvements ✅ (secondary) | -0.7% | [-1.5%, -0.2%] | 56 |
| All ❌✅ (primary) | -0.5% | [-1.4%, 0.3%] | 63 |

Max RSS (memory usage)

Results (primary 2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.4% | [2.4%, 2.4%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.4% | [2.4%, 2.4%] | 1 |

Cycles

Results (primary -1.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.7% | [-2.1%, -1.0%] | 4 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -1.7% | [-2.1%, -1.0%] | 4 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 770.102s -> 779.777s (1.26%)
Artifact size: 361.98 MiB -> 365.16 MiB (0.88%)

@Kobzol
Contributor

Kobzol commented Mar 11, 2025

Many more wins than regressions.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Mar 11, 2025
@panstromek
Contributor

> No idea about the type erasure stuff, I don't really understand what that is doing or what it's for.

> Hm. The fact that neither of us knows about that is rather concerning, because based on this I think it is rather perf-relevant.

If I understand correctly, you're talking about the code that was introduced in #108638 to reduce rustc_query_impl compile time by 27% (another "polymorphization at home"). It was one of the bigger bootstrap compile-time wins.
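
As a rough illustration of why type erasure can cut compile time, the sketch below stores every value as a fixed-size byte buffer so that the cache itself is compiled once rather than once per value type. This is not the code from #108638: the `Erase` trait, `ErasedCache`, and the 8-byte buffer size are all hypothetical and chosen only to keep the example small.

```rust
use std::collections::HashMap;

/// Hypothetical trait: convert a value to and from a fixed-size byte buffer.
trait Erase: Copy {
    fn erase(self) -> [u8; 8];
    fn restore(bytes: [u8; 8]) -> Self;
}

impl Erase for u64 {
    fn erase(self) -> [u8; 8] {
        self.to_ne_bytes()
    }
    fn restore(bytes: [u8; 8]) -> Self {
        u64::from_ne_bytes(bytes)
    }
}

impl Erase for (u32, u32) {
    fn erase(self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[..4].copy_from_slice(&self.0.to_ne_bytes());
        out[4..].copy_from_slice(&self.1.to_ne_bytes());
        out
    }
    fn restore(bytes: [u8; 8]) -> Self {
        let a = u32::from_ne_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
        let b = u32::from_ne_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
        (a, b)
    }
}

/// Non-generic cache: its methods are compiled once and shared by every
/// value type that erases to 8 bytes.
struct ErasedCache {
    map: HashMap<u32, [u8; 8]>,
}

impl ErasedCache {
    fn insert_erased(&mut self, key: u32, bytes: [u8; 8]) {
        self.map.insert(key, bytes);
    }
    fn get_erased(&self, key: u32) -> Option<[u8; 8]> {
        self.map.get(&key).copied()
    }
}

/// Thin generic wrappers: only these small shims are duplicated per type.
fn insert<T: Erase>(cache: &mut ErasedCache, key: u32, value: T) {
    cache.insert_erased(key, value.erase());
}

fn get<T: Erase>(cache: &ErasedCache, key: u32) -> Option<T> {
    cache.get_erased(key).map(T::restore)
}

fn main() {
    let mut cache = ErasedCache { map: HashMap::new() };
    insert(&mut cache, 1, 7u64);
    insert(&mut cache, 2, (3u32, 4u32));
    println!("{:?} {:?}", get::<u64>(&cache, 1), get::<(u32, u32)>(&cache, 2));
}
```

One plausible reading of the concern raised earlier in the thread is that, in a design like this, the shared non-generic body sees calls from every value type, so per-type profile information is merged.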

Speaking of which, this PR adds 9s to bootstrap. Recently another PR (#136731) regressed that time by 7s and it was rejected and reverted in #138092. The trade-off here seems clearer, so I'm just noting this to make sure it's not just an oversight.

@Kobzol
Contributor

Kobzol commented Mar 11, 2025

Good find! That PR only regressed bootstrap times and did nothing else, while this also has clear compile-time wins, so I don't think that it's in the same category. The binary size increases are a bit more concerning to me, tbh.

But I haven't found it so terrible.

Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
8 participants