|
| 1 | +- Feature Name: unsafe_block_in_unsafe_fn |
| 2 | +- Start Date: 2018-11-04 |
| 3 | +- RFC PR: (leave this empty) |
| 4 | +- Rust Issue: (leave this empty) |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +No longer treat the body of an `unsafe fn` as being an `unsafe` block. To avoid |
| 10 | +a breaking change, this is a warning now and may become an error in a future |
| 11 | +edition. |
| 12 | + |
| 13 | +# Motivation |
| 14 | +[motivation]: #motivation |
| 15 | + |
| 16 | +Marking a function as `unsafe` is one of Rust's key protections against |
| 17 | +undefined behavior: Even if the programmer does not read the documentation, |
| 18 | +calling an `unsafe` function (or performing another unsafe operation) outside an |
| 19 | +`unsafe` block will lead to a compile error, hopefully followed by reading the |
| 20 | +documentation. |
| 21 | + |
| 22 | +However, we currently entirely lose this protection when writing an `unsafe fn`: |
| 23 | +If I, say, accidentally call `offset` instead of `wrapping_offset`, or if I |
| 24 | +dereference a raw pointer thinking it is a reference, this happens without any |
| 25 | +further notice when I am writing an `unsafe fn` because the body of an `unsafe |
| 26 | +fn` is treated as an `unsafe` block. |
| 27 | + |
| 28 | +For example, notice how |
| 29 | +[this PR](https://github.com/rust-lang/rust/pull/55043/files) significantly |
| 30 | +increased the amount of code in the thread spawning function that is considered |
| 31 | +to be inside an `unsafe` block. |
| 32 | + |
| 33 | +The original justification for this behavior (according to my understanding) was |
| 34 | +that calling this function is anyway unsafe, so there is no harm done in |
| 35 | +allowing *it* to perform unsafe operations. And indeed the current situation |
| 36 | +*does* provide the guarantee that a program without `unsafe` cannot be UB. |
| 37 | +However, this neglects the other aspect of `unsafe` that I described above: To |
| 38 | +make the programmer aware that they are treading dangerous ground even when they |
| 39 | +may not realize they are doing so. |
| 40 | + |
| 41 | +In fact, this double role of `unsafe` in `unsafe fn` (making it both unsafe to |
| 42 | +call and enabling it to call other unsafe operations) conflates the two *dual* |
| 43 | +roles that `unsafe` plays in Rust. On the one hand, there are places that |
| 44 | +*define* a proof obligation, these make things "unsafe to call/do" (e.g., the |
| 45 | +language definition says that dereferencing a raw pointer requires it not to be |
| 46 | +dangling). On the other hand, there are places that *discharge* the proof |
| 47 | +obligation, these are "unsafe blocks of code" (e.g., unsafe code that |
| 48 | +dereferences a raw pointer has to locally argue why it cannot be dangling). |
| 49 | + |
| 50 | +`unsafe {}` blocks are about *discharging* obligations, but `unsafe fn` are |
| 51 | +about *defining* obligations. The fact that the body of an `unsafe fn` is also |
| 52 | +implicitly treated like a block has made it hard to realize this duality |
| 53 | +[even for experienced Rust developers][unsafe-dual]. (Completing the picture, |
| 54 | +`unsafe Trait` also defines an obligation, that is discharged by `unsafe impl`. |
| 55 | +Curiously, `unsafe trait` does *not* implicitly make all bodies of default |
| 56 | +functions defined inside this trait unsafe blocks, which is somewhat |
| 57 | +inconsistent with `unsafe fn` when viewed through this lens.) |
| 58 | + |
| 59 | +[unsafe-dual]: https://github.com/rust-lang/rfcs/pull/2585#issuecomment-577852430 |
| 60 | + |
| 61 | +# Guide-level explanation |
| 62 | +[guide-level-explanation]: #guide-level-explanation |
| 63 | + |
| 64 | +The `unsafe` keyword in Rust serves two related purposes. |
| 65 | + |
| 66 | +When you perform an "unsafe to call" operation, like dereferencing a raw pointer |
| 67 | +or calling an `unsafe fn`, you must enclose that code in an `unsafe {}` block. |
| 68 | +The purpose of this is to acknowledge that the operation you are performing here |
| 69 | +has *not* been checked by the compiler, you are responsible yourself for |
| 70 | +upholding Rust's safety guarantees. Generally, unsafe operations come with |
| 71 | +detailed documentation for the conditions that must be met when this operation |
| 72 | +is executed; it is up to you to check that all these conditions are indeed met. |
| 73 | + |
| 74 | +When you are writing a function that itself has additional conditions to ensure |
| 75 | +safety (say, it accesses some data without making some necessary bounds checks, |
| 76 | +or it takes some raw pointers as arguments and performs memory operations based |
| 77 | +on them), then you should mark this as an `unsafe fn` and it is up to you to |
| 78 | +document the conditions that must be met for the arguments. This use of the |
| 79 | +`unsafe` keyword makes your function itself "unsafe to call". |
| 80 | + |
| 81 | +The same duality can be observed in traits: `unsafe trait` is like `unsafe fn`; |
| 82 | +it makes implementing this trait an "unsafe to call" operation and it is up to |
| 83 | +whoever defines the trait to precisely document what is unsafe about it. |
| 84 | +`unsafe impl` is like `unsafe {}`, it acknowledges that there are extra |
| 85 | +requirements here that are not checked by the compiler and that the programmer |
| 86 | +is responsible to uphold. |
| 87 | + |
| 88 | +For this reason, "unsafe to call" operations inside an `unsafe fn` must be |
| 89 | +contained inside an `unsafe {}` block like everywhere else. The author of these |
| 90 | +functions has to ensure that the requirements of the operation are upheld. To |
| 91 | +this end, the author may of course assume that the caller of the `unsafe fn` in |
| 92 | +turn uphold their own requirements. |
| 93 | + |
| 94 | +For backwards compatibility reasons, this unsafety check inside `unsafe fn` is |
| 95 | +controlled by a lint, `unsafe_op_in_unsafe_fn`. By setting |
| 96 | +`#[deny(unsafe_op_in_unsafe_fn)]`, the compiler is as strict about unsafe |
| 97 | +operations inside `unsafe fn` as it is everywhere else. |
| 98 | + |
| 99 | +This lint is allow-by-default initially, and will be warn-by-default across all |
| 100 | +editions eventually. In future editions, it may become deny-by-default, or even |
| 101 | +a hard error. |
| 102 | + |
| 103 | +# Reference-level explanation |
| 104 | +[reference-level-explanation]: #reference-level-explanation |
| 105 | + |
| 106 | +The new `unsafe_op_in_unsafe_fn` lint triggers when an unsafe operation is used |
| 107 | +inside an `unsafe fn` but outside `unsafe {}` blocks. So, the following will |
| 108 | +emit a warning: |
| 109 | + |
| 110 | +```rust |
| 111 | +#[warn(unsafe_op_in_unsafe_fn)] |
| 112 | +unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T { |
| 113 | + x.get_unchecked(i) |
| 114 | +} |
| 115 | +``` |
| 116 | + |
| 117 | +Moreover, if and only if the `unsafe_op_in_unsafe_fn` lint is not `allow`ed, we |
| 118 | +no longer warn that an `unsafe` block is unnecessary when it is nested |
| 119 | +immediately inside an `unsafe fn`. So, the following compiles without any |
| 120 | +warning: |
| 121 | + |
| 122 | +```rust |
| 123 | +#[warn(unsafe_op_in_unsafe_fn)] |
| 124 | +unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T { |
| 125 | + unsafe { x.get_unchecked(i) } |
| 126 | +} |
| 127 | +``` |
| 128 | + |
| 129 | +However, nested `unsafe` blocks are still redundant, so this warns: |
| 130 | + |
| 131 | +```rust |
| 132 | +#[warn(unsafe_op_in_unsafe_fn)] |
| 133 | +unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T { |
| 134 | + unsafe { unsafe { x.get_unchecked(i) } } |
| 135 | +} |
| 136 | +``` |
| 137 | + |
| 138 | +# Drawbacks |
| 139 | +[drawbacks]: #drawbacks |
| 140 | + |
| 141 | +Many `unsafe fn` are actually rather short (no more than 3 lines) and will end |
| 142 | +up just being one large `unsafe` block. This change would make such functions |
| 143 | +less ergonomic to write, they would likely become |
| 144 | + |
| 145 | +```rust |
| 146 | +unsafe fn foo(...) -> ... { unsafe { |
| 147 | + // Code goes here |
| 148 | +} } |
| 149 | +``` |
| 150 | + |
| 151 | +# Rationale and alternatives |
| 152 | +[rationale-and-alternatives]: #rationale-and-alternatives |
| 153 | + |
| 154 | +To achieve the goals laid out in the motivation section, the proposed approach |
| 155 | +is least invasive in the sense that it avoids introducing new keywords, and |
| 156 | +instead relies on the existing lint mechanism to perform the transition. |
| 157 | + |
| 158 | +One alternative always is to not do anything, and live with the current |
| 159 | +situation. |
| 160 | + |
| 161 | +We could avoid using `unsafe` for dual purpose, and instead have `unsafe_to_call |
| 162 | +fn` for functions that are "unsafe to call" but do not implicitly have an |
| 163 | +`unsafe {}` block in their body. For consistency, we might want `unsafe_to_impl |
| 164 | +trait` for traits, though the behavior would be the same as `unsafe trait`. |
| 165 | + |
| 166 | +We could introduce named proof obligations (proposed by @Centril) such that the |
| 167 | +compiler can be be told (to some extend) if the assumptions made by the `unsafe |
| 168 | +fn` are sufficient to discharge the requirements of the unsafe operations. |
| 169 | + |
| 170 | +We could restrict this requirement to use `unsafe` blocks in `unsafe fn` to |
| 171 | +those `unsafe fn` that contain at least one `unsafe` block, meaning short |
| 172 | +`unsafe fn` would keep compiling like they do now. |
| 173 | + |
| 174 | +And of course, the lint name is subject to bikeshedding. |
| 175 | + |
| 176 | +# Prior art |
| 177 | +[prior-art]: #prior-art |
| 178 | + |
| 179 | +The only other language that I am aware of that has a notion of `unsafe` blocks |
| 180 | +and `unsafe` functions is C#. It |
| 181 | +[looks like](https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/unsafe) |
| 182 | +there, unsafe operations can be freely used inside an `unsafe` function even |
| 183 | +without a further `unsafe` block. However, based on @Ixrec's experience, |
| 184 | +`unsafe` plays hardly any role in the C# ecosystem and they do not have a |
| 185 | +culture of thinking about this in terms of proof obligations. |
| 186 | + |
| 187 | +# Unresolved questions |
| 188 | +[unresolved-questions]: #unresolved-questions |
| 189 | + |
| 190 | +What is the timeline for adding the lint, and cranking up its default level? |
| 191 | +Should the default level depend on the edition? |
| 192 | + |
| 193 | +Should we ever make this deny-by-default or even a hard error, in a future |
| 194 | +edition? |
| 195 | + |
| 196 | +Should we require `cargo fix` to be able to do *something* about this warning |
| 197 | +before making it even warn-by-default? (We certainly need to do something |
| 198 | +before making it deny-by-default or a hard error in a future edition.) `cargo |
| 199 | +fix` could add big `unsafe {}` blocks around the entire body of every `unsafe |
| 200 | +fn`. That would not improve the amount of care that is taken for unsafety in |
| 201 | +the fixed code, but it would provide a way to the incrementally improve the big |
| 202 | +functions, and new functions written later would have the appropriate amount of |
| 203 | +care applied to them from the start. Potentially, `rustfmt` could be taught to |
| 204 | +format `unsafe` blocks that wrap the entire function body in a way that avoids |
| 205 | +double-indent. "function bodies as expressions" would enable a format like |
| 206 | +`unsafe fn foo() = unsafe { body }`. |
| 207 | + |
| 208 | +It is not entirely clear if having the behavior of one lint depend on another |
| 209 | +will work (though most likely, it will). If it does not, we should try to find |
| 210 | +some other mechanism to opt-in to the new treatment of `unsafe fn` bodies. |
0 commit comments