Skip to content

Commit 78a4f8c

Browse files
authored
Rollup merge of #65772 - mark-i-m:final-readmes, r=nikomatsakis
Remove the last remaining READMEs cc rust-lang/rustc-dev-guide#481 closes #48478 r? @nikomatsakis
2 parents d51a842 + 1a93bc5 commit 78a4f8c

File tree

4 files changed

+8
-400
lines changed

4 files changed

+8
-400
lines changed

src/README.md

-3
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,4 @@ This directory contains the source code of the rust project, including:
55

66
For more information on how various parts of the compiler work, see the [rustc guide].
77

8-
There is also useful content in this README:
9-
https://github.com/rust-lang/rust/tree/master/src/librustc/infer/lexical_region_resolve.
10-
118
[rustc guide]: https://rust-lang.github.io/rustc-guide/about-this-guide.html
Original file line numberDiff line numberDiff line change
@@ -1,268 +1,7 @@
1-
# Region inference
21

3-
> WARNING: This README is obsolete and will be removed soon! For
4-
> more info on how the current borrowck works, see the [rustc guide].
5-
>
6-
> As of edition 2018, region inference is done using Non-lexical lifetimes,
7-
> which is described in the guide and [this RFC].
2+
Lexical Region Resolution was removed in https://github.com/rust-lang/rust/pull/64790.
83

9-
[rustc guide]: https://rust-lang.github.io/rustc-guide/borrow_check/region_inference.html
10-
[this RFC]: https://github.com/rust-lang/rfcs/blob/master/text/2094-nll.md
4+
Rust now uses Non-lexical lifetimes. For more info, please see the [borrowck
5+
chapter][bc] in the rustc-guide.
116

12-
## Terminology
13-
14-
Note that we use the terms region and lifetime interchangeably.
15-
16-
## Introduction
17-
18-
Region inference uses a somewhat more involved algorithm than type
19-
inference. It is not the most efficient thing ever written though it
20-
seems to work well enough in practice (famous last words). The reason
21-
that we use a different algorithm is because, unlike with types, it is
22-
impractical to hand-annotate with regions (in some cases, there aren't
23-
even the requisite syntactic forms). So we have to get it right, and
24-
it's worth spending more time on a more involved analysis. Moreover,
25-
regions are a simpler case than types: they don't have aggregate
26-
structure, for example.
27-
28-
## The problem
29-
30-
Basically our input is a directed graph where nodes can be divided
31-
into two categories: region variables and concrete regions. Each edge
32-
`R -> S` in the graph represents a constraint that the region `R` is a
33-
subregion of the region `S`.
34-
35-
Region variable nodes can have arbitrary degree. There is one region
36-
variable node per region variable.
37-
38-
Each concrete region node is associated with some, well, concrete
39-
region: e.g., a free lifetime, or the region for a particular scope.
40-
Note that there may be more than one concrete region node for a
41-
particular region value. Moreover, because of how the graph is built,
42-
we know that all concrete region nodes have either in-degree 1 or
43-
out-degree 1.
44-
45-
Before resolution begins, we build up the constraints in a hashmap
46-
that maps `Constraint` keys to spans. During resolution, we construct
47-
the actual `Graph` structure that we describe here.
48-
49-
## Computing the values for region variables
50-
51-
The algorithm is a simple dataflow algorithm. Each region variable
52-
begins as empty. We iterate over the constraints, and for each constraint
53-
we grow the relevant region variable to be as big as it must be to meet all the
54-
constraints. This means the region variables can grow to be `'static` if
55-
necessary.
56-
57-
## Verification
58-
59-
After all constraints are fully propoagated, we do a "verification"
60-
step where we walk over the verify bounds and check that they are
61-
satisfied. These bounds represent the "maximal" values that a region
62-
variable can take on, basically.
63-
64-
## The Region Hierarchy
65-
66-
### Without closures
67-
68-
Let's first consider the region hierarchy without thinking about
69-
closures, because they add a lot of complications. The region
70-
hierarchy *basically* mirrors the lexical structure of the code.
71-
There is a region for every piece of 'evaluation' that occurs, meaning
72-
every expression, block, and pattern (patterns are considered to
73-
"execute" by testing the value they are applied to and creating any
74-
relevant bindings). So, for example:
75-
76-
```rust
77-
fn foo(x: isize, y: isize) { // -+
78-
// +------------+ // |
79-
// | +-----+ // |
80-
// | +-+ +-+ +-+ // |
81-
// | | | | | | | // |
82-
// v v v v v v v // |
83-
let z = x + y; // |
84-
... // |
85-
} // -+
86-
87-
fn bar() { ... }
88-
```
89-
90-
In this example, there is a region for the fn body block as a whole,
91-
and then a subregion for the declaration of the local variable.
92-
Within that, there are sublifetimes for the assignment pattern and
93-
also the expression `x + y`. The expression itself has sublifetimes
94-
for evaluating `x` and `y`.
95-
96-
#s## Function calls
97-
98-
Function calls are a bit tricky. I will describe how we handle them
99-
*now* and then a bit about how we can improve them (Issue #6268).
100-
101-
Consider a function call like `func(expr1, expr2)`, where `func`,
102-
`arg1`, and `arg2` are all arbitrary expressions. Currently,
103-
we construct a region hierarchy like:
104-
105-
+----------------+
106-
| |
107-
+--+ +---+ +---+|
108-
v v v v v vv
109-
func(expr1, expr2)
110-
111-
Here you can see that the call as a whole has a region and the
112-
function plus arguments are subregions of that. As a side-effect of
113-
this, we get a lot of spurious errors around nested calls, in
114-
particular when combined with `&mut` functions. For example, a call
115-
like this one
116-
117-
```rust
118-
self.foo(self.bar())
119-
```
120-
121-
where both `foo` and `bar` are `&mut self` functions will always yield
122-
an error.
123-
124-
Here is a more involved example (which is safe) so we can see what's
125-
going on:
126-
127-
```rust
128-
struct Foo { f: usize, g: usize }
129-
// ...
130-
fn add(p: &mut usize, v: usize) {
131-
*p += v;
132-
}
133-
// ...
134-
fn inc(p: &mut usize) -> usize {
135-
*p += 1; *p
136-
}
137-
fn weird() {
138-
let mut x: Box<Foo> = box Foo { /* ... */ };
139-
'a: add(&mut (*x).f,
140-
'b: inc(&mut (*x).f)) // (..)
141-
}
142-
```
143-
144-
The important part is the line marked `(..)` which contains a call to
145-
`add()`. The first argument is a mutable borrow of the field `f`. The
146-
second argument also borrows the field `f`. Now, in the current borrow
147-
checker, the first borrow is given the lifetime of the call to
148-
`add()`, `'a`. The second borrow is given the lifetime of `'b` of the
149-
call to `inc()`. Because `'b` is considered to be a sublifetime of
150-
`'a`, an error is reported since there are two co-existing mutable
151-
borrows of the same data.
152-
153-
However, if we were to examine the lifetimes a bit more carefully, we
154-
can see that this error is unnecessary. Let's examine the lifetimes
155-
involved with `'a` in detail. We'll break apart all the steps involved
156-
in a call expression:
157-
158-
```rust
159-
'a: {
160-
'a_arg1: let a_temp1: ... = add;
161-
'a_arg2: let a_temp2: &'a mut usize = &'a mut (*x).f;
162-
'a_arg3: let a_temp3: usize = {
163-
let b_temp1: ... = inc;
164-
let b_temp2: &'b = &'b mut (*x).f;
165-
'b_call: b_temp1(b_temp2)
166-
};
167-
'a_call: a_temp1(a_temp2, a_temp3) // (**)
168-
}
169-
```
170-
171-
Here we see that the lifetime `'a` includes a number of substatements.
172-
In particular, there is this lifetime I've called `'a_call` that
173-
corresponds to the *actual execution of the function `add()`*, after
174-
all arguments have been evaluated. There is a corresponding lifetime
175-
`'b_call` for the execution of `inc()`. If we wanted to be precise
176-
about it, the lifetime of the two borrows should be `'a_call` and
177-
`'b_call` respectively, since the references that were created
178-
will not be dereferenced except during the execution itself.
179-
180-
However, this model by itself is not sound. The reason is that
181-
while the two references that are created will never be used
182-
simultaneously, it is still true that the first reference is
183-
*created* before the second argument is evaluated, and so even though
184-
it will not be *dereferenced* during the evaluation of the second
185-
argument, it can still be *invalidated* by that evaluation. Consider
186-
this similar but unsound example:
187-
188-
```rust
189-
struct Foo { f: usize, g: usize }
190-
// ...
191-
fn add(p: &mut usize, v: usize) {
192-
*p += v;
193-
}
194-
// ...
195-
fn consume(x: Box<Foo>) -> usize {
196-
x.f + x.g
197-
}
198-
fn weird() {
199-
let mut x: Box<Foo> = box Foo { ... };
200-
'a: add(&mut (*x).f, consume(x)) // (..)
201-
}
202-
```
203-
204-
In this case, the second argument to `add` actually consumes `x`, thus
205-
invalidating the first argument.
206-
207-
So, for now, we exclude the `call` lifetimes from our model.
208-
Eventually I would like to include them, but we will have to make the
209-
borrow checker handle this situation correctly. In particular, if
210-
there is a reference created whose lifetime does not enclose
211-
the borrow expression, we must issue sufficient restrictions to ensure
212-
that the pointee remains valid.
213-
214-
### Modeling closures
215-
216-
Integrating closures properly into the model is a bit of
217-
work-in-progress. In an ideal world, we would model closures as
218-
closely as possible after their desugared equivalents. That is, a
219-
closure type would be modeled as a struct, and the region hierarchy of
220-
different closure bodies would be completely distinct from all other
221-
fns. We are generally moving in that direction but there are
222-
complications in terms of the implementation.
223-
224-
In practice what we currently do is somewhat different. The basis for
225-
the current approach is the observation that the only time that
226-
regions from distinct fn bodies interact with one another is through
227-
an upvar or the type of a fn parameter (since closures live in the fn
228-
body namespace, they can in fact have fn parameters whose types
229-
include regions from the surrounding fn body). For these cases, there
230-
are separate mechanisms which ensure that the regions that appear in
231-
upvars/parameters outlive the dynamic extent of each call to the
232-
closure:
233-
234-
1. Types must outlive the region of any expression where they are used.
235-
For a closure type `C` to outlive a region `'r`, that implies that the
236-
types of all its upvars must outlive `'r`.
237-
2. Parameters must outlive the region of any fn that they are passed to.
238-
239-
Therefore, we can -- sort of -- assume that any region from an
240-
enclosing fns is larger than any region from one of its enclosed
241-
fn. And that is precisely what we do: when building the region
242-
hierarchy, each region lives in its own distinct subtree, but if we
243-
are asked to compute the `LUB(r1, r2)` of two regions, and those
244-
regions are in disjoint subtrees, we compare the lexical nesting of
245-
the two regions.
246-
247-
*Ideas for improving the situation:* (FIXME #3696) The correctness
248-
argument here is subtle and a bit hand-wavy. The ideal, as stated
249-
earlier, would be to model things in such a way that it corresponds
250-
more closely to the desugared code. The best approach for doing this
251-
is a bit unclear: it may in fact be possible to *actually* desugar
252-
before we start, but I don't think so. The main option that I've been
253-
thinking through is imposing a "view shift" as we enter the fn body,
254-
so that regions appearing in the types of fn parameters and upvars are
255-
translated from being regions in the outer fn into free region
256-
parameters, just as they would be if we applied the desugaring. The
257-
challenge here is that type inference may not have fully run, so the
258-
types may not be fully known: we could probably do this translation
259-
lazilly, as type variables are instantiated. We would also have to
260-
apply a kind of inverse translation to the return value. This would be
261-
a good idea anyway, as right now it is possible for free regions
262-
instantiated within the closure to leak into the parent: this
263-
currently leads to type errors, since those regions cannot outlive any
264-
expressions within the parent hierarchy. Much like the current
265-
handling of closures, there are no known cases where this leads to a
266-
type-checking accepting incorrect code (though it sometimes rejects
267-
what might be considered correct code; see rust-lang/rust#22557), but
268-
it still doesn't feel like the right approach.
7+
[bc]: https://rust-lang.github.io/rustc-guide/borrow_check/region_inference.html

0 commit comments

Comments
 (0)