Skip to content

Commit 501ab2a

Browse files
committed
Merge remote-tracking branch 'RalfJung/unsafe-block-in-unsafe-fn'
2 parents 60e0b2a + 9838e5d commit 501ab2a

File tree

1 file changed

+210
-0
lines changed

1 file changed

+210
-0
lines changed
+210
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,210 @@
1+
- Feature Name: unsafe_block_in_unsafe_fn
2+
- Start Date: 2018-11-04
3+
- RFC PR: (leave this empty)
4+
- Rust Issue: (leave this empty)
5+
6+
# Summary
7+
[summary]: #summary
8+
9+
No longer treat the body of an `unsafe fn` as being an `unsafe` block. To avoid
10+
a breaking change, this is a warning now and may become an error in a future
11+
edition.
12+
13+
# Motivation
14+
[motivation]: #motivation
15+
16+
Marking a function as `unsafe` is one of Rust's key protections against
17+
undefined behavior: Even if the programmer does not read the documentation,
18+
calling an `unsafe` function (or performing another unsafe operation) outside an
19+
`unsafe` block will lead to a compile error, hopefully followed by reading the
20+
documentation.
21+
22+
However, we currently entirely lose this protection when writing an `unsafe fn`:
23+
If I, say, accidentally call `offset` instead of `wrapping_offset`, or if I
24+
dereference a raw pointer thinking it is a reference, this happens without any
25+
further notice when I am writing an `unsafe fn` because the body of an `unsafe
26+
fn` is treated as an `unsafe` block.
27+
28+
For example, notice how
29+
[this PR](https://github.com/rust-lang/rust/pull/55043/files) significantly
30+
increased the amount of code in the thread spawning function that is considered
31+
to be inside an `unsafe` block.
32+
33+
The original justification for this behavior (according to my understanding) was
34+
that calling this function is anyway unsafe, so there is no harm done in
35+
allowing *it* to perform unsafe operations. And indeed the current situation
36+
*does* provide the guarantee that a program without `unsafe` cannot be UB.
37+
However, this neglects the other aspect of `unsafe` that I described above: To
38+
make the programmer aware that they are treading dangerous ground even when they
39+
may not realize they are doing so.
40+
41+
In fact, this double role of `unsafe` in `unsafe fn` (making it both unsafe to
42+
call and enabling it to call other unsafe operations) conflates the two *dual*
43+
roles that `unsafe` plays in Rust. On the one hand, there are places that
44+
*define* a proof obligation, these make things "unsafe to call/do" (e.g., the
45+
language definition says that dereferencing a raw pointer requires it not to be
46+
dangling). On the other hand, there are places that *discharge* the proof
47+
obligation, these are "unsafe blocks of code" (e.g., unsafe code that
48+
dereferences a raw pointer has to locally argue why it cannot be dangling).
49+
50+
`unsafe {}` blocks are about *discharging* obligations, but `unsafe fn` are
51+
about *defining* obligations. The fact that the body of an `unsafe fn` is also
52+
implicitly treated like a block has made it hard to realize this duality
53+
[even for experienced Rust developers][unsafe-dual]. (Completing the picture,
54+
`unsafe Trait` also defines an obligation, that is discharged by `unsafe impl`.
55+
Curiously, `unsafe trait` does *not* implicitly make all bodies of default
56+
functions defined inside this trait unsafe blocks, which is somewhat
57+
inconsistent with `unsafe fn` when viewed through this lens.)
58+
59+
[unsafe-dual]: https://github.com/rust-lang/rfcs/pull/2585#issuecomment-577852430
60+
61+
# Guide-level explanation
62+
[guide-level-explanation]: #guide-level-explanation
63+
64+
The `unsafe` keyword in Rust serves two related purposes.
65+
66+
When you perform an "unsafe to call" operation, like dereferencing a raw pointer
67+
or calling an `unsafe fn`, you must enclose that code in an `unsafe {}` block.
68+
The purpose of this is to acknowledge that the operation you are performing here
69+
has *not* been checked by the compiler, you are responsible yourself for
70+
upholding Rust's safety guarantees. Generally, unsafe operations come with
71+
detailed documentation for the conditions that must be met when this operation
72+
is executed; it is up to you to check that all these conditions are indeed met.
73+
74+
When you are writing a function that itself has additional conditions to ensure
75+
safety (say, it accesses some data without making some necessary bounds checks,
76+
or it takes some raw pointers as arguments and performs memory operations based
77+
on them), then you should mark this as an `unsafe fn` and it is up to you to
78+
document the conditions that must be met for the arguments. This use of the
79+
`unsafe` keyword makes your function itself "unsafe to call".
80+
81+
The same duality can be observed in traits: `unsafe trait` is like `unsafe fn`;
82+
it makes implementing this trait an "unsafe to call" operation and it is up to
83+
whoever defines the trait to precisely document what is unsafe about it.
84+
`unsafe impl` is like `unsafe {}`, it acknowledges that there are extra
85+
requirements here that are not checked by the compiler and that the programmer
86+
is responsible to uphold.
87+
88+
For this reason, "unsafe to call" operations inside an `unsafe fn` must be
89+
contained inside an `unsafe {}` block like everywhere else. The author of these
90+
functions has to ensure that the requirements of the operation are upheld. To
91+
this end, the author may of course assume that the caller of the `unsafe fn` in
92+
turn uphold their own requirements.
93+
94+
For backwards compatibility reasons, this unsafety check inside `unsafe fn` is
95+
controlled by a lint, `unsafe_op_in_unsafe_fn`. By setting
96+
`#[deny(unsafe_op_in_unsafe_fn)]`, the compiler is as strict about unsafe
97+
operations inside `unsafe fn` as it is everywhere else.
98+
99+
This lint is allow-by-default initially, and will be warn-by-default across all
100+
editions eventually. In future editions, it may become deny-by-default, or even
101+
a hard error.
102+
103+
# Reference-level explanation
104+
[reference-level-explanation]: #reference-level-explanation
105+
106+
The new `unsafe_op_in_unsafe_fn` lint triggers when an unsafe operation is used
107+
inside an `unsafe fn` but outside `unsafe {}` blocks. So, the following will
108+
emit a warning:
109+
110+
```rust
111+
#[warn(unsafe_op_in_unsafe_fn)]
112+
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T {
113+
x.get_unchecked(i)
114+
}
115+
```
116+
117+
Moreover, if and only if the `unsafe_op_in_unsafe_fn` lint is not `allow`ed, we
118+
no longer warn that an `unsafe` block is unnecessary when it is nested
119+
immediately inside an `unsafe fn`. So, the following compiles without any
120+
warning:
121+
122+
```rust
123+
#[warn(unsafe_op_in_unsafe_fn)]
124+
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T {
125+
unsafe { x.get_unchecked(i) }
126+
}
127+
```
128+
129+
However, nested `unsafe` blocks are still redundant, so this warns:
130+
131+
```rust
132+
#[warn(unsafe_op_in_unsafe_fn)]
133+
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T {
134+
unsafe { unsafe { x.get_unchecked(i) } }
135+
}
136+
```
137+
138+
# Drawbacks
139+
[drawbacks]: #drawbacks
140+
141+
Many `unsafe fn` are actually rather short (no more than 3 lines) and will end
142+
up just being one large `unsafe` block. This change would make such functions
143+
less ergonomic to write, they would likely become
144+
145+
```rust
146+
unsafe fn foo(...) -> ... { unsafe {
147+
// Code goes here
148+
} }
149+
```
150+
151+
# Rationale and alternatives
152+
[rationale-and-alternatives]: #rationale-and-alternatives
153+
154+
To achieve the goals laid out in the motivation section, the proposed approach
155+
is least invasive in the sense that it avoids introducing new keywords, and
156+
instead relies on the existing lint mechanism to perform the transition.
157+
158+
One alternative always is to not do anything, and live with the current
159+
situation.
160+
161+
We could avoid using `unsafe` for dual purpose, and instead have `unsafe_to_call
162+
fn` for functions that are "unsafe to call" but do not implicitly have an
163+
`unsafe {}` block in their body. For consistency, we might want `unsafe_to_impl
164+
trait` for traits, though the behavior would be the same as `unsafe trait`.
165+
166+
We could introduce named proof obligations (proposed by @Centril) such that the
167+
compiler can be be told (to some extend) if the assumptions made by the `unsafe
168+
fn` are sufficient to discharge the requirements of the unsafe operations.
169+
170+
We could restrict this requirement to use `unsafe` blocks in `unsafe fn` to
171+
those `unsafe fn` that contain at least one `unsafe` block, meaning short
172+
`unsafe fn` would keep compiling like they do now.
173+
174+
And of course, the lint name is subject to bikeshedding.
175+
176+
# Prior art
177+
[prior-art]: #prior-art
178+
179+
The only other language that I am aware of that has a notion of `unsafe` blocks
180+
and `unsafe` functions is C#. It
181+
[looks like](https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/unsafe)
182+
there, unsafe operations can be freely used inside an `unsafe` function even
183+
without a further `unsafe` block. However, based on @Ixrec's experience,
184+
`unsafe` plays hardly any role in the C# ecosystem and they do not have a
185+
culture of thinking about this in terms of proof obligations.
186+
187+
# Unresolved questions
188+
[unresolved-questions]: #unresolved-questions
189+
190+
What is the timeline for adding the lint, and cranking up its default level?
191+
Should the default level depend on the edition?
192+
193+
Should we ever make this deny-by-default or even a hard error, in a future
194+
edition?
195+
196+
Should we require `cargo fix` to be able to do *something* about this warning
197+
before making it even warn-by-default? (We certainly need to do something
198+
before making it deny-by-default or a hard error in a future edition.) `cargo
199+
fix` could add big `unsafe {}` blocks around the entire body of every `unsafe
200+
fn`. That would not improve the amount of care that is taken for unsafety in
201+
the fixed code, but it would provide a way to the incrementally improve the big
202+
functions, and new functions written later would have the appropriate amount of
203+
care applied to them from the start. Potentially, `rustfmt` could be taught to
204+
format `unsafe` blocks that wrap the entire function body in a way that avoids
205+
double-indent. "function bodies as expressions" would enable a format like
206+
`unsafe fn foo() = unsafe { body }`.
207+
208+
It is not entirely clear if having the behavior of one lint depend on another
209+
will work (though most likely, it will). If it does not, we should try to find
210+
some other mechanism to opt-in to the new treatment of `unsafe fn` bodies.

0 commit comments

Comments
 (0)