r[attributes.codegen]
The following attributes are used for controlling code generation.
r[attributes.codegen.hint]
r[attributes.codegen.hint.cold-inline]
The `cold` and `inline` attributes give suggestions to generate code in a way that may be faster than what the compiler would generate without the hint. The attributes are only hints, and may be ignored.
r[attributes.codegen.hint.usage] Both attributes can be used on functions. When applied to a function in a trait, they apply only to that function when used as a default function for a trait implementation and not to all trait implementations. The attributes have no effect on a trait function without a body.
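As a rough illustration of the trait rule above, the following sketch places `#[inline]` on a default function. The `Shape`, `Square`, and `Circle` names are hypothetical and exist only for this example.

```rust
trait Shape {
    fn area(&self) -> f64;

    // The hint applies to this default body wherever an implementation
    // relies on it; it does not extend to overriding implementations.
    #[inline]
    fn describe(&self) -> String {
        format!("area = {:.1}", self.area())
    }
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
    // `describe` is not overridden, so the hinted default body is used.
}

struct Circle(f64);

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }

    // This override carries no hint of its own.
    fn describe(&self) -> String {
        format!("circle with area {:.1}", self.area())
    }
}

fn main() {
    println!("{}", Square(2.0).describe());
    println!("{}", Circle(1.0).describe());
}
```

If the override in `Circle` should also be hinted, the attribute would have to be repeated on that function.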
r[attributes.codegen.inline]
r[attributes.codegen.inline.intro]
The `inline` attribute suggests that a copy of the attributed function should be placed in the caller, rather than generating code to call the function where it is defined.
Note
The `rustc` compiler automatically inlines functions based on internal heuristics. Incorrectly inlining functions can make the program slower, so this attribute should be used with care.
r[attributes.codegen.inline.modes] There are three ways to use the `inline` attribute:
- `#[inline]` suggests performing an inline expansion.
- `#[inline(always)]` suggests that an inline expansion should always be performed.
- `#[inline(never)]` suggests that an inline expansion should never be performed.
Note
`#[inline]` in every form is a hint, with no requirements on the language to place a copy of the attributed function in the caller.
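The three forms can be sketched as follows. The function names are hypothetical, and because every form is only a hint, the example demonstrates syntax rather than any guaranteed code generation outcome.

```rust
// Hypothetical helper functions; the names are illustrative only.

/// Plain hint: suggests performing an inline expansion.
#[inline]
fn add_one(x: u32) -> u32 {
    x + 1
}

/// Suggests that an inline expansion should always be performed.
#[inline(always)]
fn double(x: u32) -> u32 {
    x * 2
}

/// Suggests that an inline expansion should never be performed.
#[inline(never)]
fn square(x: u32) -> u32 {
    x * x
}

fn main() {
    // The attributes do not change results; they only influence codegen.
    println!("{}", square(double(add_one(2))));
}
```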
r[attributes.codegen.cold]
The `cold` attribute suggests that the attributed function is unlikely to be called.
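One place the attribute might be used is a rarely taken error path. The sketch below assumes hypothetical `parse_number` and `report_invalid` functions; whether the hint changes the generated code is up to the compiler.

```rust
/// Hypothetical error path, marked as unlikely to be called.
#[cold]
fn report_invalid(input: &str) -> Result<u32, String> {
    Err(format!("not a number: {input}"))
}

/// Parses a decimal number, delegating failures to the cold path.
fn parse_number(input: &str) -> Result<u32, String> {
    match input.trim().parse::<u32>() {
        Ok(n) => Ok(n),
        Err(_) => report_invalid(input),
    }
}

fn main() {
    println!("{:?}", parse_number("42"));
    println!("{:?}", parse_number("forty-two"));
}
```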
r[attributes.codegen.no_builtins]
The `no_builtins` attribute may be applied at the crate level to disable optimizing certain code patterns to invocations of library functions that are assumed to exist.
r[attributes.codegen.target_feature]
r[attributes.codegen.target_feature.intro]
The `target_feature` attribute may be applied to a function to enable code generation of that function for specific platform architecture features. It uses the MetaListNameValueStr syntax with a single key of `enable` whose value is a string of comma-separated feature names to enable.
# #[cfg(target_feature = "avx2")]
#[target_feature(enable = "avx2")]
fn foo_avx2() {}
r[attributes.codegen.target_feature.arch] Each target architecture has a set of features that may be enabled. It is an error to specify a feature for a target architecture that the crate is not being compiled for.
r[attributes.codegen.target_feature.closures]
Closures defined within a `target_feature`-annotated function inherit the attribute from the enclosing function.
r[attributes.codegen.target_feature.target-ub] It is undefined behavior to call a function that is compiled with a feature that is not supported on the platform the code is running on, except if the platform explicitly documents this to be safe.
r[attributes.codegen.target_feature.safety-restrictions] The following restrictions apply unless otherwise specified by the platform rules below:
- Safe `#[target_feature]` functions (and closures that inherit the attribute) can only be safely called within a caller that enables all the `target_feature`s that the callee enables. This restriction does not apply in an `unsafe` context.
- Safe `#[target_feature]` functions (and closures that inherit the attribute) can only be coerced to safe function pointers in contexts that enable all the `target_feature`s that the coercee enables. This restriction does not apply to `unsafe` function pointers.
Implicitly enabled features are included in this rule. For example, an `sse2` function can call ones marked with `sse`.
# #[cfg(target_feature = "sse2")] {
#[target_feature(enable = "sse")]
fn foo_sse() {}
fn bar() {
    // Calling `foo_sse` here is unsafe, as we must ensure that SSE is
    // available first, even if `sse` is enabled by default on the target
    // platform or manually enabled as compiler flags.
    unsafe {
        foo_sse();
    }
}
#[target_feature(enable = "sse")]
fn bar_sse() {
    // Calling `foo_sse` here is safe.
    foo_sse();
    || foo_sse();
}
#[target_feature(enable = "sse2")]
fn bar_sse2() {
    // Calling `foo_sse` here is safe because `sse2` implies `sse`.
    foo_sse();
}
# }
r[attributes.codegen.target_feature.fn-traits]
A function with a `#[target_feature]` attribute never implements the `Fn` family of traits, although closures inheriting features from the enclosing function do.
r[attributes.codegen.target_feature.allowed-positions]
The `#[target_feature]` attribute is not allowed in the following places:
- [the `main` function][crate.main]
- a [`panic_handler` function][panic.panic_handler]
- safe trait methods
- safe default functions in traits
r[attributes.codegen.target_feature.inline]
Functions marked with `target_feature` are not inlined into a context that does not support the given features. The `#[inline(always)]` attribute may not be used with a `target_feature` attribute.
r[attributes.codegen.target_feature.availability]
The following is a list of the available feature names.
r[attributes.codegen.target_feature.x86]
Executing code with unsupported features is undefined behavior on this platform. Hence, on this platform, usage of `#[target_feature]` functions follows the [above restrictions][attributes.codegen.target_feature.safety-restrictions].
Feature | Implicitly Enables | Description |
---|---|---|
`adx` | | ADX --- Multi-Precision Add-Carry Instruction Extensions |
`aes` | `sse2` | AES --- Advanced Encryption Standard |
`avx` | `sse4.2` | AVX --- Advanced Vector Extensions |
`avx2` | `avx` | AVX2 --- Advanced Vector Extensions 2 |
`bmi1` | | BMI1 --- Bit Manipulation Instruction Sets |
`bmi2` | | BMI2 --- Bit Manipulation Instruction Sets 2 |
`cmpxchg16b` | | `cmpxchg16b` --- Compare and exchange 16 bytes (128 bits) of data atomically |
`f16c` | `avx` | F16C --- 16-bit floating point conversion instructions |
`fma` | `avx` | FMA3 --- Three-operand fused multiply-add |
`fxsr` | | `fxsave` and `fxrstor` --- Save and restore x87 FPU, MMX Technology, and SSE State |
`lzcnt` | | `lzcnt` --- Leading zeros count |
`movbe` | | `movbe` --- Move data after swapping bytes |
`pclmulqdq` | `sse2` | `pclmulqdq` --- Packed carry-less multiplication quadword |
`popcnt` | | `popcnt` --- Count of bits set to 1 |
`rdrand` | | `rdrand` --- Read random number |
`rdseed` | | `rdseed` --- Read random seed |
`sha` | `sse2` | SHA --- Secure Hash Algorithm |
`sse` | | SSE --- Streaming SIMD Extensions |
`sse2` | `sse` | SSE2 --- Streaming SIMD Extensions 2 |
`sse3` | `sse2` | SSE3 --- Streaming SIMD Extensions 3 |
`sse4.1` | `ssse3` | SSE4.1 --- Streaming SIMD Extensions 4.1 |
`sse4.2` | `sse4.1` | SSE4.2 --- Streaming SIMD Extensions 4.2 |
`ssse3` | `sse3` | SSSE3 --- Supplemental Streaming SIMD Extensions 3 |
`xsave` | | `xsave` --- Save processor extended states |
`xsavec` | | `xsavec` --- Save processor extended states with compaction |
`xsaveopt` | | `xsaveopt` --- Save processor extended states optimized |
`xsaves` | | `xsaves` --- Save processor extended states supervisor |
r[attributes.codegen.target_feature.aarch64]
On this platform the usage of `#[target_feature]` functions follows the [above restrictions][attributes.codegen.target_feature.safety-restrictions].
Further documentation on these features can be found in the ARM Architecture Reference Manual, or elsewhere on developer.arm.com.
Note
The following pairs of features should both be marked as enabled or disabled together if used:
- `paca` and `pacg`, which LLVM currently implements as one feature.
Feature | Implicitly Enables | Feature Name |
---|---|---|
`aes` | `neon` | FEAT_AES & FEAT_PMULL --- Advanced SIMD AES & PMULL instructions |
`bf16` | | FEAT_BF16 --- BFloat16 instructions |
`bti` | | FEAT_BTI --- Branch Target Identification |
`crc` | | FEAT_CRC --- CRC32 checksum instructions |
`dit` | | FEAT_DIT --- Data Independent Timing instructions |
`dotprod` | | FEAT_DotProd --- Advanced SIMD Int8 dot product instructions |
`dpb` | | FEAT_DPB --- Data cache clean to point of persistence |
`dpb2` | | FEAT_DPB2 --- Data cache clean to point of deep persistence |
`f32mm` | `sve` | FEAT_F32MM --- SVE single-precision FP matrix multiply instruction |
`f64mm` | `sve` | FEAT_F64MM --- SVE double-precision FP matrix multiply instruction |
`fcma` | `neon` | FEAT_FCMA --- Floating point complex number support |
`fhm` | `fp16` | FEAT_FHM --- Half-precision FP FMLAL instructions |
`flagm` | | FEAT_FlagM --- Conditional flag manipulation |
`fp16` | `neon` | FEAT_FP16 --- Half-precision FP data processing |
`frintts` | | FEAT_FRINTTS --- Floating-point to int helper instructions |
`i8mm` | | FEAT_I8MM --- Int8 Matrix Multiplication |
`jsconv` | `neon` | FEAT_JSCVT --- JavaScript conversion instruction |
`lse` | | FEAT_LSE --- Large System Extension |
`lor` | | FEAT_LOR --- Limited Ordering Regions extension |
`mte` | | FEAT_MTE & FEAT_MTE2 --- Memory Tagging Extension |
`neon` | | FEAT_FP & FEAT_AdvSIMD --- Floating Point and Advanced SIMD extension |
`pan` | | FEAT_PAN --- Privileged Access-Never extension |
`paca` | | FEAT_PAuth --- Pointer Authentication (address authentication) |
`pacg` | | FEAT_PAuth --- Pointer Authentication (generic authentication) |
`pmuv3` | | FEAT_PMUv3 --- Performance Monitors extension (v3) |
`rand` | | FEAT_RNG --- Random Number Generator |
`ras` | | FEAT_RAS & FEAT_RASv1p1 --- Reliability, Availability and Serviceability extension |
`rcpc` | | FEAT_LRCPC --- Release consistent Processor Consistent |
`rcpc2` | `rcpc` | FEAT_LRCPC2 --- RcPc with immediate offsets |
`rdm` | | FEAT_RDM --- Rounding Double Multiply accumulate |
`sb` | | FEAT_SB --- Speculation Barrier |
`sha2` | `neon` | FEAT_SHA1 & FEAT_SHA256 --- Advanced SIMD SHA instructions |
`sha3` | `sha2` | FEAT_SHA512 & FEAT_SHA3 --- Advanced SIMD SHA instructions |
`sm4` | `neon` | FEAT_SM3 & FEAT_SM4 --- Advanced SIMD SM3/4 instructions |
`spe` | | FEAT_SPE --- Statistical Profiling Extension |
`ssbs` | | FEAT_SSBS & FEAT_SSBS2 --- Speculative Store Bypass Safe |
`sve` | `fp16` | FEAT_SVE --- Scalable Vector Extension |
`sve2` | `sve` | FEAT_SVE2 --- Scalable Vector Extension 2 |
`sve2-aes` | `sve2`, `aes` | FEAT_SVE_AES --- SVE AES instructions |
`sve2-sm4` | `sve2`, `sm4` | FEAT_SVE_SM4 --- SVE SM4 instructions |
`sve2-sha3` | `sve2`, `sha3` | FEAT_SVE_SHA3 --- SVE SHA3 instructions |
`sve2-bitperm` | `sve2` | FEAT_SVE_BitPerm --- SVE Bit Permute |
`tme` | | FEAT_TME --- Transactional Memory Extension |
`vh` | | FEAT_VHE --- Virtualization Host Extensions |
r[attributes.codegen.target_feature.riscv]
On this platform the usage of `#[target_feature]` functions follows the [above restrictions][attributes.codegen.target_feature.safety-restrictions].
Further documentation on these features can be found in their respective specification. Many specifications are described in the RISC-V ISA Manual or in another manual hosted on the RISC-V GitHub Account.
Feature | Implicitly Enables | Description |
---|---|---|
`a` | | A --- Atomic instructions |
`c` | | C --- Compressed instructions |
`m` | | M --- Integer Multiplication and Division instructions |
`zb` | `zba`, `zbc`, `zbs` | Zb --- Bit Manipulation instructions |
`zba` | | Zba --- Address Generation instructions |
`zbb` | | Zbb --- Basic bit-manipulation |
`zbc` | | Zbc --- Carry-less multiplication |
`zbkb` | | Zbkb --- Bit Manipulation Instructions for Cryptography |
`zbkc` | | Zbkc --- Carry-less multiplication for Cryptography |
`zbkx` | | Zbkx --- Crossbar permutations |
`zbs` | | Zbs --- Single-bit instructions |
`zk` | `zkn`, `zkr`, `zks`, `zkt`, `zbkb`, `zbkc`, `zbkx` | Zk --- Scalar Cryptography |
`zkn` | `zknd`, `zkne`, `zknh`, `zbkb`, `zbkc`, `zbkx` | Zkn --- NIST Algorithm suite extension |
`zknd` | | Zknd --- NIST Suite: AES Decryption |
`zkne` | | Zkne --- NIST Suite: AES Encryption |
`zknh` | | Zknh --- NIST Suite: Hash Function Instructions |
`zkr` | | Zkr --- Entropy Source Extension |
`zks` | `zksed`, `zksh`, `zbkb`, `zbkc`, `zbkx` | Zks --- ShangMi Algorithm Suite |
`zksed` | | Zksed --- ShangMi Suite: SM4 Block Cipher Instructions |
`zksh` | | Zksh --- ShangMi Suite: SM3 Hash Function Instructions |
`zkt` | | Zkt --- Data Independent Execution Latency Subset |
r[attributes.codegen.target_feature.wasm]
Safe `#[target_feature]` functions may always be used in safe contexts on Wasm platforms. It is impossible to cause undefined behavior via the `#[target_feature]` attribute because attempting to use instructions unsupported by the Wasm engine will fail at load time without the risk of being interpreted in a way different from what the compiler expected.
Feature | Implicitly Enables | Description |
---|---|---|
`bulk-memory` | | WebAssembly bulk memory operations proposal |
`extended-const` | | WebAssembly extended const expressions proposal |
`mutable-globals` | | WebAssembly mutable global proposal |
`nontrapping-fptoint` | | WebAssembly non-trapping float-to-int conversion proposal |
`relaxed-simd` | `simd128` | WebAssembly relaxed simd proposal |
`sign-ext` | | WebAssembly sign extension operators proposal |
`simd128` | | WebAssembly simd proposal |
`multivalue` | | WebAssembly multivalue proposal |
`reference-types` | | WebAssembly reference-types proposal |
`tail-call` | | WebAssembly tail-call proposal |
r[attributes.codegen.target_feature.info]
r[attributes.codegen.target_feature.remark-cfg]
See the `target_feature` conditional compilation option for selectively enabling or disabling compilation of code based on compile-time settings. Note that this option is not affected by the `target_feature` attribute, and is only driven by the features enabled for the entire crate.
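A minimal sketch of the compile-time check follows. The `describe_build` function is a hypothetical name; `cfg!(target_feature = "sse2")` reflects the features enabled for the whole crate, independent of any `#[target_feature]` attribute on individual functions.

```rust
/// Hypothetical helper reporting a crate-wide, compile-time setting.
fn describe_build() -> &'static str {
    // Evaluated at compile time from the features enabled for the crate.
    if cfg!(target_feature = "sse2") {
        "sse2 enabled for the whole crate"
    } else {
        "sse2 not enabled for the whole crate"
    }
}

fn main() {
    println!("{}", describe_build());
}
```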
r[attributes.codegen.target_feature.remark-rt]
See the `is_x86_feature_detected` or `is_aarch64_feature_detected` macros in the standard library for runtime feature detection on these platforms.
Note
`rustc` has a default set of features enabled for each target and CPU. The CPU may be chosen with the `-C target-cpu` flag. Individual features may be enabled or disabled for an entire crate with the `-C target-feature` flag.
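Runtime detection is commonly combined with a `#[target_feature]` function and a portable fallback. The sketch below is one possible arrangement, not a prescribed pattern; `sum`, `sum_avx2`, and `sum_fallback` are hypothetical names, and the specialised function is declared `unsafe` here so that the caller is responsible for checking availability first.

```rust
// Hypothetical AVX2-specialised routine, only compiled on x86 targets.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(values: &[u32]) -> u32 {
    // Plain Rust; the compiler is permitted to use AVX2 instructions here.
    values.iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

// Portable fallback compiled with the crate's default features.
fn sum_fallback(values: &[u32]) -> u32 {
    values.iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

// Dispatches to the specialised routine when the CPU supports it.
fn sum(values: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was confirmed at run time just above.
            return unsafe { sum_avx2(values) };
        }
    }
    sum_fallback(values)
}

fn main() {
    println!("{}", sum(&[1, 2, 3, 4]));
}
```

On targets other than x86 the specialised function is not compiled at all, so `sum` always takes the fallback path there.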
r[attributes.codegen.track_caller]
r[attributes.codegen.track_caller.allowed-positions]
The `track_caller` attribute may be applied to any function with `"Rust"` ABI with the exception of the entry point `fn main`.
r[attributes.codegen.track_caller.traits] When applied to functions and methods in trait declarations, the attribute applies to all implementations. If the trait provides a default implementation with the attribute, then the attribute also applies to override implementations.
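The trait rule can be sketched as below, assuming a hypothetical `Checked` trait and `Sensor` type. The attribute appears only on the trait declaration, and the implementation is called through static dispatch.

```rust
use std::panic::Location;

trait Checked {
    // Declared with the attribute, so it applies to every implementation.
    #[track_caller]
    fn origin(&self) -> &'static Location<'static>;
}

struct Sensor;

impl Checked for Sensor {
    // No attribute here; it is taken from the trait declaration.
    fn origin(&self) -> &'static Location<'static> {
        Location::caller()
    }
}

fn main() {
    // Both values are produced on the same source line for comparison.
    let (loc, line) = (Sensor.origin(), line!());
    assert_eq!(loc.line(), line);
    println!("called from {}:{}", loc.file(), loc.line());
}
```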
r[attributes.codegen.track_caller.extern]
When applied to a function in an `extern` block the attribute must also be applied to any linked implementations, otherwise undefined behavior results. When applied to a function which is made available to an `extern` block, the declaration in the `extern` block must also have the attribute, otherwise undefined behavior results.
r[attributes.codegen.track_caller.behavior]
Applying the attribute to a function `f` allows code within `f` to get a hint of the `Location` of the "topmost" tracked call that led to `f`'s invocation. At the point of observation, an implementation behaves as if it walks up the stack from `f`'s frame to find the nearest frame of an unattributed function `outer`, and it returns the `Location` of the tracked call in `outer`.
#[track_caller]
fn f() {
    println!("{}", std::panic::Location::caller());
}
Note
`core` provides [`core::panic::Location::caller`] for observing caller locations. It wraps the [`core::intrinsics::caller_location`] intrinsic implemented by `rustc`.
Note
Because the resulting `Location` is a hint, an implementation may halt its walk up the stack early. See Limitations for important caveats.
When `f` is called directly by `calls_f`, code in `f` observes its callsite within `calls_f`:
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
fn calls_f() {
    f(); // <-- f() prints this location
}
When `f` is called by another attributed function `g` which is in turn called by `calls_g`, code in both `f` and `g` observes `g`'s callsite within `calls_g`:
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
#[track_caller]
fn g() {
    println!("{}", std::panic::Location::caller());
    f();
}
fn calls_g() {
    g(); // <-- g() prints this location twice, once itself and once from f()
}
When `g` is called by another attributed function `h` which is in turn called by `calls_h`, all code in `f`, `g`, and `h` observes `h`'s callsite within `calls_h`:
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
# #[track_caller]
# fn g() {
#     println!("{}", std::panic::Location::caller());
#     f();
# }
#[track_caller]
fn h() {
    println!("{}", std::panic::Location::caller());
    g();
}
fn calls_h() {
    h(); // <-- prints this location three times, once itself, once from g(), once from f()
}
And so on.
r[attributes.codegen.track_caller.limits]
r[attributes.codegen.track_caller.hint] This information is a hint and implementations are not required to preserve it.
r[attributes.codegen.track_caller.decay]
In particular, coercing a function with `#[track_caller]` to a function pointer creates a shim which appears to observers to have been called at the attributed function's definition site, losing actual caller information across virtual calls. A common example of this coercion is the creation of a trait object whose methods are attributed.
Note
The aforementioned shim for function pointers is necessary because `rustc` implements `track_caller` in a codegen context by appending an implicit parameter to the function ABI, but this would be unsound for an indirect call because the parameter is not a part of the function's type and a given function pointer type may or may not refer to a function with the attribute. The creation of a shim hides the implicit parameter from callers of the function pointer, preserving soundness.
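The loss of caller information can be observed with a sketch like the following. The `where_called` function is a hypothetical name, and since the location is only a hint, the value seen through the function pointer is printed rather than asserted.

```rust
use std::panic::Location;

#[track_caller]
fn where_called() -> &'static Location<'static> {
    Location::caller()
}

fn main() {
    // Direct call: the location of this line is observed.
    let (direct, line) = (where_called(), line!());
    assert_eq!(direct.line(), line);

    // Coercing to a function pointer goes through a shim, so the call
    // below is not expected to report its own line.
    let ptr: fn() -> &'static Location<'static> = where_called;
    let indirect = ptr();

    println!("direct: {direct}, via pointer: {indirect}");
}
```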
r[attributes.codegen.instruction_set]
r[attributes.codegen.instruction_set.allowed-positions]
The `instruction_set` attribute may be applied to a function to control which instruction set the function will be generated for.
r[attributes.codegen.instruction_set.behavior] This allows mixing more than one instruction set in a single program on CPU architectures that support it.
r[attributes.codegen.instruction_set.syntax] It uses the MetaListPath syntax, with a path consisting of the architecture family name and instruction set name.
r[attributes.codegen.instruction_set.target-limits]
It is a compilation error to use the `instruction_set` attribute on a target that does not support it.
r[attributes.codegen.instruction_set.arm]
For the `ARMv4T` and `ARMv5te` architectures, the following are supported:
- `arm::a32` --- Generate the function as A32 "ARM" code.
- `arm::t32` --- Generate the function as T32 "Thumb" code.
#[instruction_set(arm::a32)]
fn foo_arm_code() {}
#[instruction_set(arm::t32)]
fn bar_thumb_code() {}
Using the `instruction_set` attribute has the following effects:
- If the address of the function is taken as a function pointer, the low bit of the address will be set to 0 (arm) or 1 (thumb) depending on the instruction set.
- Any inline assembly in the function must use the specified instruction set instead of the target default.
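Because the attribute is an error on targets that do not support it, code shared across targets may need to gate it. The sketch below uses `cfg_attr` with `target_arch = "arm"` as an assumed, simplified condition; a real project may need a narrower check, since not every Arm target necessarily supports the attribute. The function names are hypothetical.

```rust
// The attribute is applied only when compiling for an Arm target, so this
// sketch still builds on other architectures.
#[cfg_attr(target_arch = "arm", instruction_set(arm::a32))]
fn checksum_a32(data: &[u8]) -> u8 {
    data.iter().fold(0u8, |acc, &b| acc.wrapping_add(b))
}

#[cfg_attr(target_arch = "arm", instruction_set(arm::t32))]
fn checksum_t32(data: &[u8]) -> u8 {
    data.iter().fold(0u8, |acc, &b| acc.wrapping_add(b))
}

fn main() {
    // The chosen instruction set does not affect the computed result.
    let data = [1u8, 2, 3];
    println!("{} {}", checksum_a32(&data), checksum_t32(&data));
}
```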