You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Auto merge of #54592 - GabrielMajeri:no-plt, r=nagisa
Support for disabling PLT for better function call performance
This PR gives `rustc` the ability to skip the PLT when generating function calls into shared libraries. This can improve performance by reducing branch indirection.
AFAIK, the only advantage of using the PLT is to allow for ELF lazy binding. However, since Rust already [enables full relro for security](#43170), lazy binding was disabled anyway.
This is a little known feature which is supported by [GCC](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html) and [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fplt) as `-fno-plt` (some Linux distros [enable it by default](https://git.archlinux.org/svntogit/packages.git/tree/trunk/makepkg.conf?h=packages/pacman#n40) for all builds).
Implementation inspired by [this patch](https://reviews.llvm.org/D39079#change-YvkpNDlMs_LT) which adds `-fno-plt` support to Clang.
## Performance
I didn't run a lot of benchmarks, but these are the results on my machine for a `clap` [benchmark](https://github.com/clap-rs/clap/blob/master/benches/05_ripgrep.rs):
```
name control ns/iter no-plt ns/iter diff ns/iter diff % speedup
build_app_long 11,097 10,733 -364 -3.28% x 1.03
build_app_short 11,089 10,742 -347 -3.13% x 1.03
build_help_long 186,835 182,713 -4,122 -2.21% x 1.02
build_help_short 80,949 78,455 -2,494 -3.08% x 1.03
parse_clean 12,385 12,044 -341 -2.75% x 1.03
parse_complex 19,438 19,017 -421 -2.17% x 1.02
parse_lots 431,493 421,421 -10,072 -2.33% x 1.02
```
A small performance improvement across the board, with no downsides. It's likely binaries which make a lot of function calls into dynamic libraries could see even more improvements. [This comment](https://patchwork.ozlabs.org/patch/468993/#1028255) suggests that, in some cases, `-fno-plt` could improve PIC/PIE code performance by 10%.
## Security benefits
**Bonus**: some of the speculative execution attacks rely on the PLT, by disabling it we reduce a big attack surface and reduce the need for [`retpoline`](https://reviews.llvm.org/D41723).
## Remaining PLT calls
The compiled binaries still have plenty of PLT calls, coming from C/C++ libraries. Building dependencies with `CFLAGS=-fno-plt CXXFLAGS=-fno-plt` removes them.
0 commit comments