[experimental] Use #[thread_local] instead of thread_local! in rustc_middle
#106270
@bors try @rust-timer queue
⌛ Trying commit afd4524ac844d1f2dc3ed87a61d25021897f2993 with merge 389dafa688a3ad30c1a062340c53cdf031a3f1e1...
    static TLV: Cell<usize> = const { Cell::new(0) };
}
#[thread_local]
static TLV: Cell<usize> = const { Cell::new(0) };
Why do we have an inline_const here? Couldn't that just call Cell::new directly?
This breaks on systems without Destructors not running doesn't matter for |
How can I tell which systems aren't supported by |
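For context on the inline_const question in this thread: the initializer of a plain `static` is already evaluated at compile time, whereas the stable `thread_local!` macro uses the `const { ... }` form to opt into an implementation that can skip lazy initialization. The sketch below uses only the stable macro, not the unstable `#[thread_local]` attribute from this PR, and the names `COUNTER_CONST`, `COUNTER_LAZY`, and `bump_both` are hypothetical.

```rust
use std::cell::Cell;

thread_local! {
    // Const-initialized: the macro can avoid lazy-initialization bookkeeping.
    static COUNTER_CONST: Cell<usize> = const { Cell::new(0) };
    // Lazily initialized on first access from each thread.
    static COUNTER_LAZY: Cell<usize> = Cell::new(0);
}

/// Increments both counters on the current thread and returns their new values.
fn bump_both() -> (usize, usize) {
    let a = COUNTER_CONST.with(|c| {
        c.set(c.get() + 1);
        c.get()
    });
    let b = COUNTER_LAZY.with(|c| {
        c.set(c.get() + 1);
        c.get()
    });
    (a, b)
}
```

Both forms behave the same from the caller's point of view; the difference is only in how much per-access state the macro has to track.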
☀️ Try build successful - checks-actions
Finished benchmarking commit (389dafa688a3ad30c1a062340c53cdf031a3f1e1): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count
This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
This benchmark run did not return any relevant results for this metric.
Comparison of
In particular, notice that the |
Do you have a sense of what percentage that is of the total sum? It looks like wall time is estimated to have gone down by about 3% -- I wonder how good our estimate is. (Note that the wall time is a little noisy, so it might actually be less or more than 3%. It's also true that we have a -j 2 build for this so there's some parallelism involved).
ah no, apparently I was undercounting.
So according to dump-mono-stats this is a 5.3% reduction in code that needs to be sent to LLVM :) it says "estimate" so that might not be 100% accurate but it seems roughly comparable to what perf.rlo says.
@bjorn3 hmm, |
The size estimate is a rough approximation here, it's still basically the number of MIR statements (here It's still useful of course, but improving the estimates to better reflect the backends' actual compilation-to-native-code work (rather than, say, the lowering to their own IR) is tracked in issues like #69382 and others. |
Yes, but not deterministically.
You need to build the compiler that will use TLS so at the very least use |
force-pushed from afd4524 to 2a845e8
This avoids monomorphizing `LocalKey::try_with` 5 times for each query. It has the downside that destructors for the thread-local are never run, but we don't depend on the destructors for ImplicitContext itself anywhere.
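To illustrate the cost the commit message refers to: `LocalKey::with` and `LocalKey::try_with` are generic over the closure type and its return type, so each distinct closure passed to them gets its own instantiation. The following is a rough sketch on stable Rust of one way to keep that generic surface small by funnelling access through non-generic helpers. It is not the approach taken in this PR (which switches to `#[thread_local]`), and `get_tlv`, `set_tlv`, and `with_tlv` are names assumed here for illustration.

```rust
use std::cell::Cell;

thread_local! {
    static TLV: Cell<usize> = const { Cell::new(0) };
}

// Non-generic accessors: each contains exactly one closure, so
// `LocalKey::with` is instantiated once per helper, however many callers exist.
fn get_tlv() -> usize {
    TLV.with(|tlv| tlv.get())
}

fn set_tlv(value: usize) {
    TLV.with(|tlv| tlv.set(value))
}

/// Runs `f` with the slot set to `value`, restoring the previous value
/// afterwards (also on unwinding, via the drop guard).
fn with_tlv<R>(value: usize, f: impl FnOnce() -> R) -> R {
    struct Restore(usize);
    impl Drop for Restore {
        fn drop(&mut self) {
            set_tlv(self.0);
        }
    }

    let _restore = Restore(get_tlv());
    set_tlv(value);
    f()
}
```

Here only `with_tlv` is generic, and its body does not pass the caller's closure into `LocalKey::with`, so the thread-local access code itself is not duplicated per call site.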
force-pushed from 2a845e8 to 0aa4956
#106311 has a much better perf improvement, and also avoids needing multiple versions of the code for different platforms.
This avoids monomorphizing LocalKey::try_with 5 times for each query. It has the downside that destructors for the thread-local are never run, but we don't depend on the destructors for ImplicitContext itself anywhere.
Helps with #65031 (comment)
r? @jyn514