Always inline `query_get_at`. #137695

nnethercote · 2025-02-27T01:04:15Z

r? @saethlin

nnethercote · 2025-02-27T01:04:45Z

@bors try @rust-timer queue

bors · 2025-02-27T01:05:55Z

⌛ Trying commit cc78386 with merge 6c9c69b...

bors · 2025-02-27T03:09:34Z

☀️ Try build successful - checks-actions
Build commit: 6c9c69b (6c9c69b41e19a0ee2f9d5b6dd8f67b7b55839ab0)

rust-timer · 2025-02-27T05:35:32Z

Finished benchmarking commit (6c9c69b): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.3%]	3
Regressions ❌ (secondary)	1.1%	[0.3%, 1.8%]	12
Improvements ✅ (primary)	-0.5%	[-1.5%, -0.2%]	79
Improvements ✅ (secondary)	-0.7%	[-1.2%, -0.1%]	56
All ❌✅ (primary)	-0.5%	[-1.5%, 0.3%]	82

Max RSS (memory usage)

Results (primary -2.2%, secondary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.8%	[2.1%, 5.5%]	2
Improvements ✅ (primary)	-2.2%	[-2.2%, -2.2%]	1
Improvements ✅ (secondary)	-2.6%	[-2.9%, -2.1%]	3
All ❌✅ (primary)	-2.2%	[-2.2%, -2.2%]	1

Cycles

Results (primary -2.1%, secondary -0.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	4.1%	[3.7%, 4.5%]	3
Improvements ✅ (primary)	-2.1%	[-2.2%, -2.0%]	3
Improvements ✅ (secondary)	-2.3%	[-2.9%, -2.0%]	8
All ❌✅ (primary)	-2.1%	[-2.2%, -2.0%]	3

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 770.979s -> 780.944s (1.29%)
Artifact size: 361.96 MiB -> 365.08 MiB (0.86%)
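As a quick sanity check on the two summary lines above, the percentages are plain relative changes. A minimal sketch in Rust; the helper name `percent_change` is made up for illustration and is not part of any perf tooling:

```rust
/// Relative change from `before` to `after`, expressed in percent.
/// Hypothetical helper for illustration only.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

With the figures above, `percent_change(770.979, 780.944)` comes to roughly 1.29 and `percent_change(361.96, 365.08)` to roughly 0.86, matching the reported values.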

nnethercote · 2025-02-27T05:48:21Z

Pretty good perf results, though the 0.862% artifact size increase isn't so nice. Let's ask someone with opinions about inline(always) for review :)

r? @saethlin

saethlin · 2025-03-04T00:16:23Z

I'm not completely opposed to this, but I'm also a bit confused by the fact that we need this attribute and all the other `inline(always)` attributes in this code path. Yeah, the functions in question are a bit big, but we have PGO and LTO, so surely those should be able to determine that a function like this is a good inlining candidate.

So I wonder if the type erasure scheme that's used in the query system is combining code paths so that PGO can't figure things out. I think this inlining is primarily profitable for the few queries that make up the majority of our execution time, but I suspect those are not separable. Thoughts?
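For context, the pattern under discussion is roughly a small cache-lookup fast path that is forced inline, with the expensive miss path kept out of line. The sketch below is a simplified illustration only: `Cache`, `compute_and_insert`, and `get_at` are hypothetical names, and this is not the actual `query_get_at` code.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a query cache; not rustc's actual type.
struct Cache {
    map: HashMap<u32, u64>,
    misses: u32,
}

impl Cache {
    /// Slow path: compute the value and store it. Kept out of line so the
    /// inlined fast path stays small at every call site.
    #[inline(never)]
    fn compute_and_insert(&mut self, key: u32) -> u64 {
        self.misses += 1;
        let value = u64::from(key) * 2; // placeholder for the real work
        self.map.insert(key, value);
        value
    }
}

/// Fast path: return immediately on a cache hit. `#[inline(always)]`
/// requests inlining at every call site instead of leaving the decision
/// to the optimizer's heuristics.
#[inline(always)]
fn get_at(cache: &mut Cache, key: u32) -> u64 {
    if let Some(&value) = cache.map.get(&key) {
        return value;
    }
    cache.compute_and_insert(key)
}
```

The trade-off is the one visible in the perf results: every call site gets its own copy of the hit check, which can save a call on hot paths but grows the binary.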

nnethercote · 2025-03-04T00:23:11Z

> I think this inlining is primarily profitable for the few queries that make up the majority of our execution time

The ones that are called most often, yes, almost certainly.

No idea about the type erasure stuff, I don't really understand what that is doing or what it's for.

saethlin · 2025-03-08T00:15:52Z

> No idea about the type erasure stuff, I don't really understand what that is doing or what it's for.

Hm. The fact that neither of us knows about that is rather concerning, because I think based on this it is rather perf-relevant.

saethlin · 2025-03-08T00:21:30Z

I'm okay with this, but mostly because the usual downsides of the attribute don't really apply to the compiler; unoptimized builds don't really matter and our code size is already quite spectacular. I'd really like us to have a better way to understand which inline attributes matter with our LTO+PGO setup, because people (including me) often conclude from local builds that an `#[inline]` is merited when, in dist builds, it isn't.

@bors r+ rollup=never

bors · 2025-03-08T00:21:33Z

📌 Commit cc78386 has been approved by saethlin

It is now in the queue for this repository.

bors · 2025-03-09T21:37:01Z

⌛ Testing commit cc78386 with merge 2b4694a...

bors · 2025-03-10T00:42:24Z

☀️ Test successful - checks-actions
Approved by: saethlin
Pushing 2b4694a to master...

rust-log-analyzer · 2025-03-10T00:42:51Z

A job failed! Check out the build log: (web) (plain)

Possible cause of the failure (guessed by this bot):

gh pr comment ${HEAD_PR} -F output.log
shell: /usr/bin/bash -e {0}
##[endgroup]
fatal: ambiguous argument 'HEAD^1': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
##[error]Process completed with exit code 128.
Post job cleanup.

rust-timer · 2025-03-10T06:27:44Z

Finished benchmarking commit (2b4694a): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

If the regression was expected or you think it can be justified,
please write a comment with sufficient written justification, and add
@rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
If you think that you know of a way to resolve the regression, try to create
a new PR with a fix for the regression.
If you do not understand the regression or you think that it is just noise,
you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.3%]	2
Regressions ❌ (secondary)	0.9%	[0.2%, 1.5%]	9
Improvements ✅ (primary)	-0.5%	[-1.4%, -0.2%]	61
Improvements ✅ (secondary)	-0.7%	[-1.5%, -0.2%]	56
All ❌✅ (primary)	-0.5%	[-1.4%, 0.3%]	63

Max RSS (memory usage)

Results (primary 2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.4%	[2.4%, 2.4%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.4%	[2.4%, 2.4%]	1

Cycles

Results (primary -1.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.7%	[-2.1%, -1.0%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.7%	[-2.1%, -1.0%]	4

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 770.102s -> 779.777s (1.26%)
Artifact size: 361.98 MiB -> 365.16 MiB (0.88%)

Kobzol · 2025-03-11T08:24:37Z

Many more wins than regressions.

@rustbot label: +perf-regression-triaged

panstromek · 2025-03-11T10:46:38Z

> > No idea about the type erasure stuff, I don't really understand what that is doing or what it's for.
>
> Hm. The fact that neither of us knows about that is rather concerning, because I think based on this it is rather perf-relevant.

If I understand correctly, you're talking about the code that was introduced in #108638 to reduce rustc_query_impl compile time by 27% (another "polymorphization at home"). It was one of the bigger bootstrap compile time wins.
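For readers unfamiliar with the idea, here is a rough sketch of what type erasure of this kind buys and costs: values of different types but equal size are stored behind one byte-array type, so a function that works on the stored form is compiled once rather than once per value type. All names below (`Erased8`, `lookup`, and the erase/restore helpers) are invented for this illustration; this is not the code from #108638.

```rust
/// One storage type shared by every 8-byte value; the original type is
/// forgotten until the caller restores it.
#[derive(Clone, Copy)]
struct Erased8([u8; 8]);

fn erase_u64(v: u64) -> Erased8 {
    Erased8(v.to_ne_bytes())
}

fn restore_u64(e: Erased8) -> u64 {
    u64::from_ne_bytes(e.0)
}

fn erase_f64(v: f64) -> Erased8 {
    Erased8(v.to_ne_bytes())
}

fn restore_f64(e: Erased8) -> f64 {
    f64::from_ne_bytes(e.0)
}

/// A single non-generic lookup shared by all 8-byte value types.
/// Compiled once, so it is cheaper to build, but callers storing
/// different types all funnel through the same machine code.
fn lookup(table: &[Erased8], index: usize) -> Option<Erased8> {
    table.get(index).copied()
}
```

Because `lookup` serves both the `u64` and the `f64` callers, a profile only ever sees one function, which may be the kind of code-path merging that makes it hard for PGO to single out the hot queries.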

Speaking of which, this PR adds 9s to bootstrap. Recently another PR (#136731) regressed that time by 7s and it was rejected and reverted in #138092. The trade-off here seems clearer, so I'm just noting this to make sure it's not just an oversight.

Kobzol · 2025-03-11T10:49:23Z

Good find! That PR only regressed bootstrap times and did nothing else, while this also has clear compile-time wins, so I don't think that it's in the same category. The binary size increases are a bit more concerning to me, tbh.

But I haven't found it so terrible.

Always inline query_get_at.

cc78386

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 27, 2025

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 27, 2025

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 27, 2025

Auto merge of rust-lang#137695 - nnethercote:always-inline-query_get_…

6c9c69b

…at, r=<try> Always inline `query_get_at`. r? `@ghost`

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 27, 2025

rustbot assigned saethlin Feb 27, 2025

nnethercote marked this pull request as ready for review February 27, 2025 07:13

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 8, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 10, 2025

bors merged commit 2b4694a into rust-lang:master Mar 10, 2025
7 checks passed

rustbot added this to the 1.87.0 milestone Mar 10, 2025

nnethercote deleted the always-inline-query_get_at branch March 10, 2025 02:29

rustbot added the perf-regression-triaged The performance regression has been triaged. label Mar 11, 2025
