Commit 0beb2ba

Auto merge of #61268 - michaelwoerister:stabilize-pgo, r=alexcrichton
Stabilize support for Profile-guided Optimization

This PR makes profile-guided optimization available via the `-C profile-generate` / `-C profile-use` pair of commandline flags and adds end-user documentation for the feature to the [rustc book](https://doc.rust-lang.org/rustc/). The PR thus ticks the last two remaining checkboxes of the [stabilization tracking issue](#59913).

From the tracking issue:

> Profile-guided optimization (PGO) is a common optimization technique for ahead-of-time compilers. It works by collecting data about a program's typical execution (e.g. probability of branches taken, typical runtime values of variables, etc) and then uses this information during program optimization for things like inlining decisions, machine code layout, or indirect call promotion.

If you are curious about how this can be used, there is a rendered version of the documentation this PR adds available [here](https://github.com/michaelwoerister/rust/blob/stabilize-pgo/src/doc/rustc/src/profile-guided-optimization.md).

r? @alexcrichton
cc @rust-lang/compiler
2 parents 848e0a2 + b7fe2ca commit 0beb2ba

17 files changed

+189
-36
lines changed

src/doc/rustc/src/SUMMARY.md

+1
Original file line number | Diff line number | Diff line change
@@ -13,5 +13,6 @@
1313
- [Targets](targets/index.md)
1414
- [Built-in Targets](targets/built-in.md)
1515
- [Custom Targets](targets/custom.md)
16+
- [Profile-guided Optimization](profile-guided-optimization.md)
1617
- [Linker-plugin based LTO](linker-plugin-lto.md)
1718
- [Contributing to `rustc`](contributing.md)

src/doc/rustc/src/codegen-options/index.md

+17
@@ -214,3 +214,20 @@ This option lets you control what happens when the code panics.
214214
## incremental
215215

216216
This flag allows you to enable incremental compilation.
217+
218+
## profile-generate
219+
220+
This flag allows for creating instrumented binaries that will collect
221+
profiling data for use with profile-guided optimization (PGO). The flag takes
222+
an optional argument which is the path to a directory into which the
223+
instrumented binary will emit the collected data. See the chapter on
224+
[profile-guided optimization](profile-guided-optimization.html) for more
225+
information.
226+
227+
## profile-use
228+
229+
This flag specifies the profiling data file to be used for profile-guided
230+
optimization (PGO). The flag takes a mandatory argument which is the path
231+
to a valid `.profdata` file. See the chapter on
232+
[profile-guided optimization](profile-guided-optimization.html) for more
233+
information.
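
The difference between the two flags (an optional directory for `profile-generate`, a mandatory file for `profile-use`) can be illustrated with a small parser. This is only a sketch: `PgoFlag` and `parse_pgo_flag` are hypothetical names invented for this example and are not part of `rustc`, whose real option parsing goes through the `options!` macro shown further down in this commit.

```rust
use std::path::PathBuf;

/// Hypothetical, simplified view of the two stabilized PGO flags.
#[derive(Debug, PartialEq)]
enum PgoFlag {
    /// `-C profile-generate[=<dir>]`: the output directory is optional.
    Generate(Option<PathBuf>),
    /// `-C profile-use=<file>`: the `.profdata` path is mandatory.
    Use(PathBuf),
}

/// Parses the text that follows `-C`, e.g. `profile-generate=/tmp/pgo-data`.
/// Returns `None` for unrelated options and for `profile-use` without a path.
fn parse_pgo_flag(arg: &str) -> Option<PgoFlag> {
    // Split `name=value` into its two halves; a bare `name` has no value.
    let (name, value) = match arg.split_once('=') {
        Some((name, value)) => (name, Some(value)),
        None => (arg, None),
    };
    match (name, value) {
        ("profile-generate", None) => Some(PgoFlag::Generate(None)),
        ("profile-generate", Some(dir)) => Some(PgoFlag::Generate(Some(PathBuf::from(dir)))),
        ("profile-use", Some(file)) if !file.is_empty() => Some(PgoFlag::Use(PathBuf::from(file))),
        _ => None,
    }
}
```

In this sketch, `parse_pgo_flag("profile-generate")` is accepted without a directory, while `parse_pgo_flag("profile-use")` is rejected because the `.profdata` path is missing.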

src/doc/rustc/src/profile-guided-optimization.md

+136
@@ -0,0 +1,136 @@
1+
# Profile Guided Optimization
2+
3+
`rustc` supports doing profile-guided optimization (PGO).
4+
This chapter describes what PGO is, what it is good for, and how it can be used.
5+
6+
## What Is Profile-Guided Optimization?
7+
8+
The basic concept of PGO is to collect data about the typical execution of
9+
a program (e.g. which branches it is likely to take) and then use this data
10+
to inform optimizations such as inlining, machine-code layout,
11+
register allocation, etc.
12+
13+
There are different ways of collecting data about a program's execution.
14+
One is to run the program inside a profiler (such as `perf`) and another
15+
is to create an instrumented binary, that is, a binary that has data
16+
collection built into it, and run that.
17+
The latter usually provides more accurate data and it is also what is
18+
supported by `rustc`.
19+
20+
## Usage
21+
22+
Generating a PGO-optimized program involves following a workflow with four steps:
23+
24+
1. Compile the program with instrumentation enabled
25+
(e.g. `rustc -Cprofile-generate=/tmp/pgo-data main.rs`)
26+
2. Run the instrumented program (e.g. `./main`) which generates a
27+
`default_<id>.profraw` file
28+
3. Convert the `.profraw` file into a `.profdata` file using
29+
LLVM's `llvm-profdata` tool
30+
4. Compile the program again, this time making use of the profiling data
31+
(for example `rustc -Cprofile-use=merged.profdata main.rs`)
32+
33+
An instrumented program will create one or more `.profraw` files, one for each
34+
instrumented binary. E.g. an instrumented executable that loads two instrumented
35+
dynamic libraries at runtime will generate three `.profraw` files. Running an
36+
instrumented binary multiple times, on the other hand, will re-use the
37+
respective `.profraw` files, updating them in place.
38+
39+
These `.profraw` files have to be post-processed before they can be fed back
40+
into the compiler. This is done by the `llvm-profdata` tool. This tool
41+
is most easily installed via
42+
43+
```bash
44+
rustup component add llvm-tools-preview
45+
```
46+
47+
Note that installing the `llvm-tools-preview` component won't add
48+
`llvm-profdata` to the `PATH`. Rather, the tool can be found in:
49+
50+
```bash
51+
~/.rustup/toolchains/<toolchain>/lib/rustlib/<target-triple>/bin/
52+
```
53+
54+
Alternatively, an `llvm-profdata` coming with a recent LLVM or Clang
55+
version usually works too.
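
The directory layout described above can be turned into a small helper that computes where `llvm-profdata` should live. This is a sketch, not an official API: `llvm_profdata_path` is a hypothetical name, and it assumes the caller already knows the toolchain root (for example from `rustc --print sysroot`) and the target triple (the `host:` line of `rustc -vV`).

```rust
use std::env::consts::EXE_SUFFIX;
use std::path::{Path, PathBuf};

/// Returns the expected location of `llvm-profdata` inside a toolchain,
/// following the `<toolchain>/lib/rustlib/<target-triple>/bin/` layout.
/// The function only builds the path; it does not check that the file exists.
fn llvm_profdata_path(sysroot: &Path, target_triple: &str) -> PathBuf {
    sysroot
        .join("lib")
        .join("rustlib")
        .join(target_triple)
        .join("bin")
        // `EXE_SUFFIX` is "" on Unix-like systems and ".exe" on Windows.
        .join(format!("llvm-profdata{}", EXE_SUFFIX))
}
```

A caller would typically check `path.exists()` on the result and fall back to an `llvm-profdata` found on the `PATH` if the component is not installed.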
56+
57+
The `llvm-profdata` tool merges multiple `.profraw` files into a single
58+
`.profdata` file that can then be fed back into the compiler via
59+
`-Cprofile-use`:
60+
61+
```bash
62+
# STEP 1: Compile the binary with instrumentation
63+
rustc -Cprofile-generate=/tmp/pgo-data -O ./main.rs
64+
65+
# STEP 2: Run the binary a few times, maybe with common sets of args.
66+
# Each run will create or update `.profraw` files in /tmp/pgo-data
67+
./main mydata1.csv
68+
./main mydata2.csv
69+
./main mydata3.csv
70+
71+
# STEP 3: Merge and post-process all the `.profraw` files in /tmp/pgo-data
72+
llvm-profdata merge -o ./merged.profdata /tmp/pgo-data
73+
74+
# STEP 4: Use the merged `.profdata` file during optimization. All `rustc`
75+
# flags have to be the same.
76+
rustc -Cprofile-use=./merged.profdata -O ./main.rs
77+
```
78+
79+
### A Complete Cargo Workflow
80+
81+
Using this feature with Cargo works very similarly to using it with `rustc`
82+
directly. Again, we generate an instrumented binary, run it to produce data,
83+
merge the data, and feed it back into the compiler. Some things of note:
84+
85+
- We use the `RUSTFLAGS` environment variable in order to pass the PGO compiler
86+
flags to the compilation of all crates in the program.
87+
88+
- We pass the `--target` flag to Cargo, which prevents the `RUSTFLAGS`
89+
arguments from being passed to Cargo build scripts. We don't want the build
90+
scripts to generate a bunch of `.profraw` files.
91+
92+
- We pass `--release` to Cargo because that's where PGO makes the most sense.
93+
In theory, PGO can also be done on debug builds but there is little reason
94+
to do so.
95+
96+
- It is recommended to use *absolute paths* for the argument of
97+
`-Cprofile-generate` and `-Cprofile-use`. Cargo can invoke `rustc` with
98+
varying working directories, so a relative path may not resolve to
99+
the supplied `.profdata` file. With absolute paths this is not an issue.
100+
101+
- It is good practice to make sure that there is no left-over profiling data
102+
from previous compilation sessions. Just deleting the directory is a simple
103+
way of doing so (see `STEP 0` below).
104+
105+
This is what the entire workflow looks like:
106+
107+
```bash
108+
# STEP 0: Make sure there is no left-over profiling data from previous runs
109+
rm -rf /tmp/pgo-data
110+
111+
# STEP 1: Build the instrumented binaries
112+
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" \
113+
cargo build --release --target=x86_64-unknown-linux-gnu
114+
115+
# STEP 2: Run the instrumented binaries with some typical data
116+
./target/x86_64-unknown-linux-gnu/release/myprogram mydata1.csv
117+
./target/x86_64-unknown-linux-gnu/release/myprogram mydata2.csv
118+
./target/x86_64-unknown-linux-gnu/release/myprogram mydata3.csv
119+
120+
# STEP 3: Merge the `.profraw` files into a `.profdata` file
121+
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data
122+
123+
# STEP 4: Use the `.profdata` file for guiding optimizations
124+
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" \
125+
cargo build --release --target=x86_64-unknown-linux-gnu
126+
```
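
The absolute-path recommendation above can be sketched as a helper that a build driver might use to assemble the `RUSTFLAGS` value for STEP 4. The function name `rustflags_profile_use` is hypothetical and introduced only for illustration; the sketch resolves relative paths against the current working directory.

```rust
use std::env;
use std::io;
use std::path::Path;

/// Builds a `RUSTFLAGS` value for the final (STEP 4) build.
/// A relative `profdata` path is resolved against the current directory.
fn rustflags_profile_use(profdata: &Path) -> io::Result<String> {
    // `join` returns `profdata` unchanged when it is already absolute.
    let absolute = env::current_dir()?.join(profdata);
    Ok(format!("-Cprofile-use={}", absolute.display()))
}
```

Because `RUSTFLAGS` is split on spaces, this sketch is only suitable for paths that contain no spaces.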
127+
128+
## Further Reading
129+
130+
`rustc`'s PGO support relies entirely on LLVM's implementation of the feature
131+
and is equivalent to what Clang offers via the `-fprofile-generate` /
132+
`-fprofile-use` flags. The [Profile Guided Optimization][clang-pgo] section
133+
in Clang's documentation is therefore an interesting read for anyone who wants
134+
to use PGO with Rust.
135+
136+
[clang-pgo]: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

src/librustc/session/config.rs

+12-13
@@ -1207,7 +1207,11 @@ options! {CodegenOptions, CodegenSetter, basic_codegen_options,
12071207
linker_plugin_lto: LinkerPluginLto = (LinkerPluginLto::Disabled,
12081208
parse_linker_plugin_lto, [TRACKED],
12091209
"generate build artifacts that are compatible with linker-based LTO."),
1210-
1210+
profile_generate: SwitchWithOptPath = (SwitchWithOptPath::Disabled,
1211+
parse_switch_with_opt_path, [TRACKED],
1212+
"compile the program with profiling instrumentation"),
1213+
profile_use: Option<PathBuf> = (None, parse_opt_pathbuf, [TRACKED],
1214+
"use the given `.profdata` file for profile-guided optimization"),
12111215
}
12121216

12131217
options! {DebuggingOptions, DebuggingSetter, basic_debugging_options,
@@ -1379,11 +1383,6 @@ options! {DebuggingOptions, DebuggingSetter, basic_debugging_options,
13791383
"extra arguments to prepend to the linker invocation (space separated)"),
13801384
profile: bool = (false, parse_bool, [TRACKED],
13811385
"insert profiling code"),
1382-
pgo_gen: SwitchWithOptPath = (SwitchWithOptPath::Disabled,
1383-
parse_switch_with_opt_path, [TRACKED],
1384-
"Generate PGO profile data, to a given file, or to the default location if it's empty."),
1385-
pgo_use: Option<PathBuf> = (None, parse_opt_pathbuf, [TRACKED],
1386-
"Use PGO profile data from the given profile file."),
13871386
disable_instrumentation_preinliner: bool = (false, parse_bool, [TRACKED],
13881387
"Disable the instrumentation pre-inliner, useful for profiling / PGO."),
13891388
relro_level: Option<RelroLevel> = (None, parse_relro_level, [TRACKED],
@@ -2036,13 +2035,6 @@ pub fn build_session_options_and_crate_config(
20362035
}
20372036
}
20382037

2039-
if debugging_opts.pgo_gen.enabled() && debugging_opts.pgo_use.is_some() {
2040-
early_error(
2041-
error_format,
2042-
"options `-Z pgo-gen` and `-Z pgo-use` are exclusive",
2043-
);
2044-
}
2045-
20462038
let mut output_types = BTreeMap::new();
20472039
if !debugging_opts.parse_only {
20482040
for list in matches.opt_strs("emit") {
@@ -2154,6 +2146,13 @@ pub fn build_session_options_and_crate_config(
21542146
);
21552147
}
21562148

2149+
if cg.profile_generate.enabled() && cg.profile_use.is_some() {
2150+
early_error(
2151+
error_format,
2152+
"options `-C profile-generate` and `-C profile-use` are exclusive",
2153+
);
2154+
}
2155+
21572156
let mut prints = Vec::<PrintRequest>::new();
21582157
if cg.target_cpu.as_ref().map_or(false, |s| s == "help") {
21592158
prints.push(PrintRequest::TargetCPUs);
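
The mutual-exclusion check that this file now performs on the codegen options can be sketched in isolation. The types below are simplified stand-ins: the shape of `SwitchWithOptPath` is assumed from how it is used in this diff (`Enabled(None)`, `Disabled`, `.enabled()`), while `PgoOptions` and `check_pgo_options` are hypothetical names that do not exist in `rustc`.

```rust
use std::path::PathBuf;

/// Simplified stand-in for rustc's `SwitchWithOptPath` (assumed shape).
#[derive(Clone, Debug, PartialEq)]
enum SwitchWithOptPath {
    Enabled(Option<PathBuf>),
    Disabled,
}

impl SwitchWithOptPath {
    fn enabled(&self) -> bool {
        matches!(self, SwitchWithOptPath::Enabled(_))
    }
}

/// Hypothetical container for the two PGO-related codegen options.
struct PgoOptions {
    profile_generate: SwitchWithOptPath,
    profile_use: Option<PathBuf>,
}

/// Rejects a configuration that asks for instrumentation and for
/// profile consumption in the same compilation session.
fn check_pgo_options(opts: &PgoOptions) -> Result<(), String> {
    if opts.profile_generate.enabled() && opts.profile_use.is_some() {
        return Err(
            "options `-C profile-generate` and `-C profile-use` are exclusive".to_owned(),
        );
    }
    Ok(())
}
```

The real compiler reports this through `early_error` rather than returning a `Result`; the return value here just makes the rule easy to exercise.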

src/librustc/session/config/tests.rs

+2-2
@@ -519,11 +519,11 @@ fn test_codegen_options_tracking_hash() {
519519
assert!(reference.dep_tracking_hash() != opts.dep_tracking_hash());
520520

521521
opts = reference.clone();
522-
opts.debugging_opts.pgo_gen = SwitchWithOptPath::Enabled(None);
522+
opts.cg.profile_generate = SwitchWithOptPath::Enabled(None);
523523
assert_ne!(reference.dep_tracking_hash(), opts.dep_tracking_hash());
524524

525525
opts = reference.clone();
526-
opts.debugging_opts.pgo_use = Some(PathBuf::from("abc"));
526+
opts.cg.profile_use = Some(PathBuf::from("abc"));
527527
assert_ne!(reference.dep_tracking_hash(), opts.dep_tracking_hash());
528528

529529
opts = reference.clone();

src/librustc/session/mod.rs

+3-3
@@ -1295,9 +1295,9 @@ fn validate_commandline_args_with_session_available(sess: &Session) {
12951295

12961296
// Make sure that any given profiling data actually exists so LLVM can't
12971297
// decide to silently skip PGO.
1298-
if let Some(ref path) = sess.opts.debugging_opts.pgo_use {
1298+
if let Some(ref path) = sess.opts.cg.profile_use {
12991299
if !path.exists() {
1300-
sess.err(&format!("File `{}` passed to `-Zpgo-use` does not exist.",
1300+
sess.err(&format!("File `{}` passed to `-C profile-use` does not exist.",
13011301
path.display()));
13021302
}
13031303
}
@@ -1306,7 +1306,7 @@ fn validate_commandline_args_with_session_available(sess: &Session) {
13061306
// an error to combine the two for now. It always runs into an assertion
13071307
// if LLVM is built with assertions, but without assertions it sometimes
13081308
// does not crash and will probably generate a corrupted binary.
1309-
if sess.opts.debugging_opts.pgo_gen.enabled() &&
1309+
if sess.opts.cg.profile_generate.enabled() &&
13101310
sess.target.target.options.is_like_msvc &&
13111311
sess.panic_strategy() == PanicStrategy::Unwind {
13121312
sess.err("Profile-guided optimization does not yet work in conjunction \
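
The first check in this hunk, which makes sure the profile passed to `-C profile-use` exists so that PGO is not silently skipped, can be sketched as a standalone function. `validate_profile_use` is a hypothetical name; the real code reports the error through the session (`sess.err`) instead of returning it.

```rust
use std::path::Path;

/// Returns an error message if a profile path was given but does not exist.
/// `None` means `-C profile-use` was not passed, which is always fine.
fn validate_profile_use(profile_use: Option<&Path>) -> Result<(), String> {
    match profile_use {
        Some(path) if !path.exists() => Err(format!(
            "File `{}` passed to `-C profile-use` does not exist.",
            path.display()
        )),
        _ => Ok(()),
    }
}
```

Failing early like this turns a mistyped path into a compile error instead of an unoptimized binary.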

src/librustc_codegen_llvm/attributes.rs

+2-2
@@ -102,8 +102,8 @@ pub fn set_probestack(cx: &CodegenCx<'ll, '_>, llfn: &'ll Value) {
102102
return
103103
}
104104

105-
// probestack doesn't play nice either with pgo-gen.
106-
if cx.sess().opts.debugging_opts.pgo_gen.enabled() {
105+
// probestack doesn't play nice either with `-C profile-generate`.
106+
if cx.sess().opts.cg.profile_generate.enabled() {
107107
return;
108108
}
109109

src/librustc_codegen_ssa/back/link.rs

+1-1
@@ -1179,7 +1179,7 @@ fn link_args<'a, B: ArchiveBuilder<'a>>(cmd: &mut dyn Linker,
11791179
cmd.build_static_executable();
11801180
}
11811181

1182-
if sess.opts.debugging_opts.pgo_gen.enabled() {
1182+
if sess.opts.cg.profile_generate.enabled() {
11831183
cmd.pgo_gen();
11841184
}
11851185

src/librustc_codegen_ssa/back/symbol_export.rs

+1-1
@@ -203,7 +203,7 @@ fn exported_symbols_provider_local<'tcx>(
203203
}
204204
}
205205

206-
if tcx.sess.opts.debugging_opts.pgo_gen.enabled() {
206+
if tcx.sess.opts.cg.profile_generate.enabled() {
207207
// These are weak symbols that point to the profile version and the
208208
// profile name, which need to be treated as exported so LTO doesn't nix
209209
// them.

src/librustc_codegen_ssa/back/write.rs

+2-2
@@ -423,8 +423,8 @@ pub fn start_async_codegen<B: ExtraBackendMethods>(
423423
modules_config.passes.push("insert-gcov-profiling".to_owned())
424424
}
425425

426-
modules_config.pgo_gen = sess.opts.debugging_opts.pgo_gen.clone();
427-
modules_config.pgo_use = sess.opts.debugging_opts.pgo_use.clone();
426+
modules_config.pgo_gen = sess.opts.cg.profile_generate.clone();
427+
modules_config.pgo_use = sess.opts.cg.profile_use.clone();
428428

429429
modules_config.opt_level = Some(sess.opts.optimize);
430430
modules_config.opt_size = Some(sess.opts.optimize);

src/librustc_metadata/creader.rs

+1-1
@@ -868,7 +868,7 @@ impl<'a> CrateLoader<'a> {
868868

869869
fn inject_profiler_runtime(&mut self) {
870870
if self.sess.opts.debugging_opts.profile ||
871-
self.sess.opts.debugging_opts.pgo_gen.enabled()
871+
self.sess.opts.cg.profile_generate.enabled()
872872
{
873873
info!("loading profiler");
874874

src/test/codegen/pgo-instrumentation.rs

+2-2
@@ -1,8 +1,8 @@
1-
// Test that `-Zpgo-gen` creates expected instrumentation artifacts in LLVM IR.
1+
// Test that `-Cprofile-generate` creates expected instrumentation artifacts in LLVM IR.
22
// Compiling with `-Cpanic=abort` because PGO+unwinding isn't supported on all platforms.
33

44
// needs-profiler-support
5-
// compile-flags: -Z pgo-gen -Ccodegen-units=1 -Cpanic=abort
5+
// compile-flags: -Cprofile-generate -Ccodegen-units=1 -Cpanic=abort
66

77
// CHECK: @__llvm_profile_raw_version =
88
// CHECK: @__profc_{{.*}}pgo_instrumentation{{.*}}some_function{{.*}} = private global

src/test/run-make-fulldeps/cross-lang-lto-pgo-smoketest/Makefile

+4-4
@@ -21,7 +21,7 @@ all: cpp-executable rust-executable
2121

2222
cpp-executable:
2323
$(RUSTC) -Clinker-plugin-lto=on \
24-
-Zpgo-gen="$(TMPDIR)"/cpp-profdata \
24+
-Cprofile-generate="$(TMPDIR)"/cpp-profdata \
2525
-o "$(TMPDIR)"/librustlib-xlto.a \
2626
$(COMMON_FLAGS) \
2727
./rustlib.rs
@@ -39,7 +39,7 @@ cpp-executable:
3939
-o "$(TMPDIR)"/cpp-profdata/merged.profdata \
4040
"$(TMPDIR)"/cpp-profdata/default_*.profraw
4141
$(RUSTC) -Clinker-plugin-lto=on \
42-
-Zpgo-use="$(TMPDIR)"/cpp-profdata/merged.profdata \
42+
-Cprofile-use="$(TMPDIR)"/cpp-profdata/merged.profdata \
4343
-o "$(TMPDIR)"/librustlib-xlto.a \
4444
$(COMMON_FLAGS) \
4545
./rustlib.rs
@@ -57,7 +57,7 @@ rust-executable:
5757
$(CLANG) ./clib.c -fprofile-generate="$(TMPDIR)"/rs-profdata -flto=thin -c -o $(TMPDIR)/clib.o -O3
5858
(cd $(TMPDIR); $(AR) crus ./libxyz.a ./clib.o)
5959
$(RUSTC) -Clinker-plugin-lto=on \
60-
-Zpgo-gen="$(TMPDIR)"/rs-profdata \
60+
-Cprofile-generate="$(TMPDIR)"/rs-profdata \
6161
-L$(TMPDIR) \
6262
$(COMMON_FLAGS) \
6363
-Clinker=$(CLANG) \
@@ -78,7 +78,7 @@ rust-executable:
7878
rm "$(TMPDIR)"/libxyz.a
7979
(cd $(TMPDIR); $(AR) crus ./libxyz.a ./clib.o)
8080
$(RUSTC) -Clinker-plugin-lto=on \
81-
-Zpgo-use="$(TMPDIR)"/rs-profdata/merged.profdata \
81+
-Cprofile-use="$(TMPDIR)"/rs-profdata/merged.profdata \
8282
-L$(TMPDIR) \
8383
$(COMMON_FLAGS) \
8484
-Clinker=$(CLANG) \

src/test/run-make-fulldeps/pgo-gen-lto/Makefile

+1-1
@@ -2,7 +2,7 @@
22

33
-include ../tools.mk
44

5-
COMPILE_FLAGS=-Copt-level=3 -Clto=fat -Z pgo-gen="$(TMPDIR)"
5+
COMPILE_FLAGS=-Copt-level=3 -Clto=fat -Cprofile-generate="$(TMPDIR)"
66

77
# LLVM doesn't yet support instrumenting binaries that use unwinding on MSVC:
88
# https://github.com/rust-lang/rust/issues/61002

src/test/run-make-fulldeps/pgo-gen-no-imp-symbols/Makefile

+1-1
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
-include ../tools.mk
44

5-
COMPILE_FLAGS=-O -Ccodegen-units=1 -Z pgo-gen="$(TMPDIR)"
5+
COMPILE_FLAGS=-O -Ccodegen-units=1 -Cprofile-generate="$(TMPDIR)"
66

77
# LLVM doesn't yet support instrumenting binaries that use unwinding on MSVC:
88
# https://github.com/rust-lang/rust/issues/61002

src/test/run-make-fulldeps/pgo-gen/Makefile

+1-1
@@ -2,7 +2,7 @@
22

33
-include ../tools.mk
44

5-
COMPILE_FLAGS=-g -Z pgo-gen="$(TMPDIR)"
5+
COMPILE_FLAGS=-g -Cprofile-generate="$(TMPDIR)"
66

77
# LLVM doesn't yet support instrumenting binaries that use unwinding on MSVC:
88
# https://github.com/rust-lang/rust/issues/61002

src/test/run-make-fulldeps/pgo-use/Makefile

+2-2
@@ -33,15 +33,15 @@ endif
3333

3434
all:
3535
# Compile the test program with instrumentation
36-
$(RUSTC) $(COMMON_FLAGS) -Z pgo-gen="$(TMPDIR)" main.rs
36+
$(RUSTC) $(COMMON_FLAGS) -Cprofile-generate="$(TMPDIR)" main.rs
3737
# Run it in order to generate some profiling data
3838
$(call RUN,main some-argument) || exit 1
3939
# Postprocess the profiling data so it can be used by the compiler
4040
"$(LLVM_BIN_DIR)"/llvm-profdata merge \
4141
-o "$(TMPDIR)"/merged.profdata \
4242
"$(TMPDIR)"/default_*.profraw
4343
# Compile the test program again, making use of the profiling data
44-
$(RUSTC) $(COMMON_FLAGS) -Z pgo-use="$(TMPDIR)"/merged.profdata --emit=llvm-ir main.rs
44+
$(RUSTC) $(COMMON_FLAGS) -Cprofile-use="$(TMPDIR)"/merged.profdata --emit=llvm-ir main.rs
4545
# Check that the generated IR contains some things that we expect
4646
#
4747
# We feed the file into LLVM FileCheck tool *in reverse* so that we see the
