|
1 |
| -# Dependency graph for incremental compilation |
| 1 | +To learn more about how dependency tracking works in rustc, see the [rustc |
| 2 | +guide]. |
2 | 3 |
|
3 |
| -This module contains the infrastructure for managing the incremental |
4 |
| -compilation dependency graph. This README aims to explain how it ought |
5 |
| -to be used. In this document, we'll first explain the overall |
6 |
| -strategy, and then share some tips for handling specific scenarios. |
7 |
| - |
8 |
| -The high-level idea is that we want to instrument the compiler to |
9 |
| -track which parts of the AST and other IR are read/written by what. |
10 |
| -This way, when we come back later, we can look at this graph and |
11 |
| -determine what work needs to be redone. |
12 |
| - |
13 |
| -### The dependency graph |
14 |
| - |
15 |
| -The nodes of the graph are defined by the enum `DepNode`. They represent |
16 |
| -one of three things: |
17 |
| - |
18 |
| -1. HIR nodes (like `Hir(DefId)`) represent the HIR input itself. |
19 |
| -2. Data nodes (like `TypeOfItem(DefId)`) represent some computed |
20 |
| - information about a particular item. |
21 |
| -3. Procedure nodes (like `CoherenceCheckTrait(DefId)`) represent some |
22 |
| - procedure that is executing. Usually this procedure is |
23 |
| - performing some kind of check for errors. You can think of them as |
24 |
| - computed values where the value being computed is `()` (and the |
25 |
| - value may fail to be computed, if an error results). |
26 |
| - |
27 |
| -An edge `N1 -> N2` is added between two nodes if either: |
28 |
| - |
29 |
| -- the value of `N1` is used to compute `N2`; |
30 |
| -- `N1` is read by the procedure `N2`; |
31 |
| -- the procedure `N1` writes the value `N2`. |
32 |
| - |
33 |
| -The latter two conditions are equivalent to the first one if you think |
34 |
| -of procedures as values. |
35 |
| - |
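The edge rule above can be sketched as a toy graph. This is only an illustration under stated assumptions: `ToyDepGraph`, `add_edge`, `dirty_set`, and `example_graph` are hypothetical names, node labels are plain strings, and none of this is the compiler's real `DepGraph` API. It shows how an edge `N1 -> N2` lets us find the work that may need redoing when `N1` changes.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Toy stand-in for `DepNode`: labels are plain strings (an assumption).
type Node = &'static str;

/// Hypothetical miniature dependency graph, not rustc's real type.
#[derive(Default)]
struct ToyDepGraph {
    /// `edges[n1]` lists every `n2` for which an edge `n1 -> n2` exists.
    edges: HashMap<Node, Vec<Node>>,
}

impl ToyDepGraph {
    /// Records that `from` was used to compute (or was read by) `to`.
    fn add_edge(&mut self, from: Node, to: Node) {
        self.edges.entry(from).or_default().push(to);
    }

    /// Returns every node reachable from `changed`, i.e. the work that
    /// may have to be redone once `changed` is modified.
    fn dirty_set(&self, changed: Node) -> HashSet<Node> {
        let mut dirty = HashSet::new();
        let mut queue = VecDeque::from([changed]);
        while let Some(node) = queue.pop_front() {
            for &next in self.edges.get(node).into_iter().flatten() {
                // Only enqueue nodes we have not visited yet.
                if dirty.insert(next) {
                    queue.push_back(next);
                }
            }
        }
        dirty
    }
}

/// Builds a small graph: `foo`'s type feeds the type check of `bar`,
/// while `baz` is unrelated to `foo`.
fn example_graph() -> ToyDepGraph {
    let mut g = ToyDepGraph::default();
    g.add_edge("Hir(foo)", "TypeOfItem(foo)");
    g.add_edge("TypeOfItem(foo)", "TypeckTables(bar)");
    g.add_edge("Hir(baz)", "TypeckTables(baz)");
    g
}
```

Calling `example_graph().dirty_set("Hir(foo)")` yields `TypeOfItem(foo)` and `TypeckTables(bar)` but not `TypeckTables(baz)`, which is the "what else must be redone" question the graph exists to answer.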
36 |
| -### Basic tracking |
37 |
| - |
38 |
| -There is a very general strategy to ensure that you have a correct, if |
39 |
| -sometimes overconservative, dependency graph. The two main things you have |
40 |
| -to do are (a) identify shared state and (b) identify the current tasks. |
41 |
| - |
42 |
| -### Identifying shared state |
43 |
| - |
44 |
| -Identify "shared state" that will be written by one pass and read by |
45 |
| -another. In particular, we need to identify shared state that will be |
46 |
| -read "across items" -- that is, anything where changes in one item |
47 |
| -could invalidate work done for other items. So, for example: |
48 |
| - |
49 |
| -1. The signature for a function is "shared state". |
50 |
| -2. The computed type of some expression in the body of a function is |
51 |
| - not shared state, because if it changes it does not itself |
52 |
| - invalidate other functions (though it may be that it causes new |
53 |
| - monomorphizations to occur, but that's handled independently). |
54 |
| - |
55 |
| -Put another way: if the HIR for an item changes, we are going to |
56 |
| -recompile that item for sure. But we need the dep tracking map to tell |
57 |
| -us what *else* we have to recompile. Shared state is anything that is |
58 |
| -used to communicate results from one item to another. |
59 |
| - |
60 |
| -### Identifying the current task, tracking reads/writes, etc |
61 |
| - |
62 |
| -FIXME(#42293). This text needs to be rewritten for the new red-green |
63 |
| -system, which doesn't fully exist yet. |
64 |
| - |
65 |
| -#### Dependency tracking map |
66 |
| - |
67 |
| -`DepTrackingMap` is a particularly convenient way to correctly store |
68 |
| -shared state. A `DepTrackingMap` is a special hashmap that will add |
69 |
| -edges automatically when `get` and `insert` are called. The idea is |
70 |
| -that, when you get/insert a value for the key `K`, we will add an edge |
71 |
| -from/to the node `DepNode::Variant(K)` (for some variant specific to |
72 |
| -the map). |
73 |
| - |
74 |
| -Each `DepTrackingMap` is parameterized by a special type `M` that |
75 |
| -implements `DepTrackingMapConfig`; this trait defines the key and value |
76 |
| -types of the map, and also defines a fn for converting from the key to |
77 |
| -a `DepNode` label. You don't usually have to muck about with this by |
78 |
| -hand; there is a macro for creating it. You can see the complete set |
79 |
| -of `DepTrackingMap` definitions in `librustc/middle/ty/maps.rs`. |
80 |
| - |
81 |
| -As an example, let's look at the `adt_defs` map. The `adt_defs` map |
82 |
| -maps from the def-id of a struct/enum to its `AdtDef`. It is defined |
83 |
| -using this macro: |
84 |
| - |
85 |
| -```rust |
86 |
| -dep_map_ty! { AdtDefs: ItemSignature(DefId) -> ty::AdtDefMaster<'tcx> } |
87 |
| -// ~~~~~~~ ~~~~~~~~~~~~~ ~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ |
88 |
| -// | | Key type Value type |
89 |
| -// | DepNode variant |
90 |
| -// Name of map id type |
91 |
| -``` |
92 |
| - |
93 |
| -This indicates that a map id type `AdtDefs` will be created. The key |
94 |
| -of the map will be a `DefId` and the value will be |
95 |
| -`ty::AdtDefMaster<'tcx>`. The `DepNode` will be created by |
96 |
| -`DepNode::ItemSignature(K)` for a given key. |
97 |
| - |
98 |
| -Once that is done, you can just use the `DepTrackingMap` like any |
99 |
| -other map: |
100 |
| - |
101 |
| -```rust |
102 |
| -let mut map: DepTrackingMap<M> = DepTrackingMap::new(dep_graph); |
103 |
| -map.insert(key, value); // registers dep_graph.write |
104 |
| -map.get(key); // registers dep_graph.read |
105 |
| -``` |
106 |
| - |
107 |
| -#### Memoization |
108 |
| - |
109 |
| -One particularly interesting case is memoization. If you have some |
110 |
| -shared state that you compute in a memoized fashion, the correct thing |
111 |
| -to do is to define a `RefCell<DepTrackingMap>` for it and use the |
112 |
| -`memoize` helper: |
113 |
| - |
114 |
| -```rust |
115 |
| -map.memoize(key, || /* compute value */) |
116 |
| -``` |
117 |
| - |
118 |
| -This will create a graph that looks like |
119 |
| - |
120 |
| - ... -> MapVariant(key) -> CurrentTask |
121 |
| - |
122 |
| -where `MapVariant` is the `DepNode` variant that the map is associated with, |
123 |
| -and `...` are whatever edges the `/* compute value */` closure creates. |
124 |
| - |
125 |
| -In particular, using the memoize helper is much better than writing |
126 |
| -the obvious code yourself: |
127 |
| - |
128 |
| -```rust |
129 |
| -if let Some(result) = map.get(key) { |
130 |
| - return result; |
131 |
| -} |
132 |
| -let value = /* compute value */; |
133 |
| -map.insert(key, value); |
134 |
| -``` |
135 |
| - |
136 |
| -If you write that code manually, the dependency graph you get will |
137 |
| -include artificial edges that are not necessary. For example, imagine that |
138 |
| -two tasks, A and B, both invoke the manual memoization code, but A happens |
139 |
| -to go first. The resulting graph will be: |
140 |
| - |
141 |
| - ... -> A -> MapVariant(key) -> B |
142 |
| - ~~~~~~~~~~~~~~~~~~~~~~~~~~~ // caused by A writing to MapVariant(key) |
143 |
| - ~~~~~~~~~~~~~~~~~~~~ // caused by B reading from MapVariant(key) |
144 |
| - |
145 |
| -This graph is not *wrong*, but it encodes a path from A to B that |
146 |
| -should not exist. In contrast, using the `memoize` helper, you get: |
147 |
| - |
148 |
| - ... -> MapVariant(key) -> A |
149 |
| - | |
150 |
| - +----------> B |
151 |
| - |
152 |
| -which is much cleaner. |
153 |
| - |
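The difference between the two graphs can be reproduced with a toy edge recorder. This is a sketch under loud assumptions: `ToyGraph`, `ToyMap`, `with_task`, and `edges_for` are hypothetical names, the "computation" is just `key * 2`, and the hand-written path records a read only on a cache hit so that the result matches the picture above. It is not the real `DepTrackingMap` code.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical recorder: a stack of current tasks plus the edges seen so far.
#[derive(Default)]
struct ToyGraph {
    task_stack: Vec<String>,
    edges: BTreeSet<(String, String)>,
}

impl ToyGraph {
    fn current(&self) -> String {
        self.task_stack.last().expect("no current task").clone()
    }
    /// The current task reads `node`: records the edge `node -> current`.
    fn read(&mut self, node: &str) {
        let cur = self.current();
        self.edges.insert((node.to_string(), cur));
    }
    /// The current task writes `node`: records the edge `current -> node`.
    fn write(&mut self, node: &str) {
        let cur = self.current();
        self.edges.insert((cur, node.to_string()));
    }
    /// Runs `f` with `task` pushed as the current task.
    fn with_task<R>(&mut self, task: &str, f: impl FnOnce(&mut Self) -> R) -> R {
        self.task_stack.push(task.to_string());
        let result = f(self);
        self.task_stack.pop();
        result
    }
}

/// Hypothetical cache keyed by `u32`.
struct ToyMap {
    values: HashMap<u32, u32>,
}

impl ToyMap {
    fn node(key: u32) -> String {
        format!("MapVariant({key})")
    }

    /// Hand-written caching: the caller's task writes the map node on a miss.
    fn manual(&mut self, g: &mut ToyGraph, key: u32) -> u32 {
        if let Some(&v) = self.values.get(&key) {
            g.read(&Self::node(key));
            return v;
        }
        let v = key * 2;
        g.write(&Self::node(key));
        self.values.insert(key, v);
        v
    }

    /// Memoize-style caching: the caller only reads the map node, and the
    /// computation runs with the map node itself as the current task.
    fn memoize(&mut self, g: &mut ToyGraph, key: u32) -> u32 {
        let node = Self::node(key);
        g.read(&node);
        if let Some(&v) = self.values.get(&key) {
            return v;
        }
        let v = g.with_task(&node, |_g| key * 2);
        self.values.insert(key, v);
        v
    }
}

/// Runs tasks A then B against a fresh map and returns the recorded edges.
fn edges_for(use_memoize: bool) -> Vec<(String, String)> {
    let mut g = ToyGraph::default();
    let mut map = ToyMap { values: HashMap::new() };
    for task in ["A", "B"] {
        g.with_task(task, |g| {
            if use_memoize { map.memoize(g, 1) } else { map.manual(g, 1) }
        });
    }
    g.edges.into_iter().collect()
}
```

With `edges_for(false)` the recorder holds `A -> MapVariant(1)` and `MapVariant(1) -> B`, so B appears to depend on A. With `edges_for(true)` both edges start at `MapVariant(1)` and no edge leaves A.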
154 |
| -**Be aware though that the closure is executed with `MapVariant(key)` |
155 |
| -pushed onto the stack as the current task!** That means that you must |
156 |
| -add explicit `read` calls for any shared state that it accesses |
157 |
| -implicitly from its environment. See the section on "explicit calls to |
158 |
| -read and write when starting a new subtask" above for more details. |
159 |
| - |
160 |
| -### How to decide where to introduce a new task |
161 |
| - |
162 |
| -Certainly, you need at least one task on the stack: any attempt to |
163 |
| -`read` or `write` shared state will panic if there is no current |
164 |
| -task. But where does it make sense to introduce subtasks? The basic |
165 |
| -rule is that a subtask makes sense for any discrete unit of work you |
166 |
| -may want to skip in the future. Adding a subtask separates out the |
167 |
| -reads/writes from *that particular subtask* versus the larger |
168 |
| -context. An example: you might have a 'meta' task for all of borrow |
169 |
| -checking, and then subtasks for borrow checking individual fns. (Seen |
170 |
| -in this light, memoized computations are just a special case where we |
171 |
| -may want to avoid redoing the work even within the context of one |
172 |
| -compilation.) |
173 |
| - |
174 |
| -The other case where you might want a subtask is to help with refining |
175 |
| -the reads/writes for some later bit of work that needs to be memoized. |
176 |
| -For example, we create a subtask for type-checking the body of each |
177 |
| -fn. However, in the initial version of incr. comp. at least, we do |
178 |
| -not expect to actually *SKIP* type-checking -- we only expect to skip |
179 |
| -trans. However, it's still useful to create subtasks for type-checking |
180 |
| -individual items, because, otherwise, if a fn sig changes, we won't |
181 |
| -know which callers are affected -- in fact, because the graph would be |
182 |
| -so coarse, we'd just have to retrans everything, since we can't |
183 |
| -distinguish which fns used which fn sigs. |
184 |
| - |
185 |
| -### Testing the dependency graph |
186 |
| - |
187 |
| -There are various ways to write tests against the dependency graph. |
188 |
| -The simplest mechanism is the pair of |
189 |
| -`#[rustc_if_this_changed]` and `#[rustc_then_this_would_need]` |
190 |
| -annotations. These are used in compile-fail tests to test whether the |
191 |
| -expected set of paths exist in the dependency graph. As an example, |
192 |
| -see `src/test/compile-fail/dep-graph-caller-callee.rs`. |
193 |
| - |
194 |
| -The idea is that you can annotate a test like: |
195 |
| - |
196 |
| -```rust |
197 |
| -#[rustc_if_this_changed] |
198 |
| -fn foo() { } |
199 |
| - |
200 |
| -#[rustc_then_this_would_need(TypeckTables)] //~ ERROR OK |
201 |
| -fn bar() { foo(); } |
202 |
| - |
203 |
| -#[rustc_then_this_would_need(TypeckTables)] //~ ERROR no path |
204 |
| -fn baz() { } |
205 |
| -``` |
206 |
| - |
207 |
| -This will check whether there is a path in the dependency graph from |
208 |
| -`Hir(foo)` to `TypeckTables(bar)`. An error is reported for each |
209 |
| -`#[rustc_then_this_would_need]` annotation that indicates whether a |
210 |
| -path exists. `//~ ERROR` annotations can then be used to test if a |
211 |
| -path is found (as demonstrated above). |
212 |
| - |
213 |
| -### Debugging the dependency graph |
214 |
| - |
215 |
| -#### Dumping the graph |
216 |
| - |
217 |
| -The compiler is also capable of dumping the dependency graph for your |
218 |
| -debugging pleasure. To do so, pass the `-Z dump-dep-graph` flag. The |
219 |
| -graph will be dumped to `dep_graph.{txt,dot}` in the current |
220 |
| -directory. You can override the filename with the `RUST_DEP_GRAPH` |
221 |
| -environment variable. |
222 |
| - |
223 |
| -Frequently, though, the full dep graph is quite overwhelming and not |
224 |
| -particularly helpful. Therefore, the compiler also allows you to filter |
225 |
| -the graph. You can filter in three ways: |
226 |
| - |
227 |
| -1. All edges originating in a particular set of nodes (usually a single node). |
228 |
| -2. All edges reaching a particular set of nodes. |
229 |
| -3. All edges that lie between given start and end nodes. |
230 |
| - |
231 |
| -To filter, use the `RUST_DEP_GRAPH_FILTER` environment variable, which should |
232 |
| -look like one of the following: |
233 |
| - |
234 |
| -``` |
235 |
| -source_filter // nodes originating from source_filter |
236 |
| --> target_filter // nodes that can reach target_filter |
237 |
| -source_filter -> target_filter // nodes in between source_filter and target_filter |
238 |
| -``` |
239 |
| - |
240 |
| -`source_filter` and `target_filter` are each a `&`-separated list of strings. |
241 |
| -A node is considered to match a filter if all of those strings appear in its |
242 |
| -label. So, for example: |
243 |
| - |
244 |
| -``` |
245 |
| -RUST_DEP_GRAPH_FILTER='-> TypeckTables' |
246 |
| -``` |
247 |
| - |
248 |
| -would select the predecessors of all `TypeckTables` nodes. Usually though you |
249 |
| -want the `TypeckTables` node for some particular fn, so you might write: |
250 |
| - |
251 |
| -``` |
252 |
| -RUST_DEP_GRAPH_FILTER='-> TypeckTables & bar' |
253 |
| -``` |
254 |
| - |
255 |
| -This will select only the `TypeckTables` nodes for fns with `bar` in their name. |
256 |
| - |
257 |
| -Perhaps you are finding that when you change `foo` you need to re-type-check `bar`, |
258 |
| -but you don't think you should have to. In that case, you might do: |
259 |
| - |
260 |
| -``` |
261 |
| -RUST_DEP_GRAPH_FILTER='Hir&foo -> TypeckTables & bar' |
262 |
| -``` |
263 |
| - |
264 |
| -This will dump out all the nodes that lead from `Hir(foo)` to |
265 |
| -`TypeckTables(bar)`, from which you can (hopefully) see the source |
266 |
| -of the erroneous edge. |
267 |
| - |
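The matching rule described above (a node matches when every `&`-separated piece appears in its label) can be sketched as follows. The helper names `node_matches` and `split_filter` are hypothetical and the code only illustrates the filter syntax as documented here; it is not presented as the compiler's own implementation.

```rust
/// Returns true if every `&`-separated piece of `filter` occurs in `label`.
fn node_matches(filter: &str, label: &str) -> bool {
    filter
        .split('&')
        .map(str::trim)
        .all(|piece| label.contains(piece))
}

/// Trims `s` and maps an empty string to `None`.
fn non_empty(s: &str) -> Option<&str> {
    let s = s.trim();
    if s.is_empty() { None } else { Some(s) }
}

/// Splits a filter string into its optional source and target parts,
/// covering the three documented forms:
/// `source`, `-> target`, and `source -> target`.
fn split_filter(text: &str) -> (Option<&str>, Option<&str>) {
    match text.split_once("->") {
        Some((source, target)) => (non_empty(source), non_empty(target)),
        None => (non_empty(text), None),
    }
}
```

For example, `split_filter("Hir&foo -> TypeckTables & bar")` gives the source `Hir&foo` and the target `TypeckTables & bar`, and `node_matches("TypeckTables & bar", "TypeckTables(bar)")` is true because both pieces occur in the label.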
268 |
| -#### Tracking down incorrect edges |
269 |
| - |
270 |
| -Sometimes, after you dump the dependency graph, you will find some |
271 |
| -path that should not exist, but you will not be quite sure how it came |
272 |
| -to be. **When the compiler is built with debug assertions,** it can |
273 |
| -help you track that down. Simply set the `RUST_FORBID_DEP_GRAPH_EDGE` |
274 |
| -environment variable to a filter. Every edge created in the dep-graph |
275 |
| -will be tested against that filter -- if it matches, a `bug!` is |
276 |
| -reported, so you can easily see the backtrace (`RUST_BACKTRACE=1`). |
277 |
| - |
278 |
| -The syntax for these filters is the same as described in the previous |
279 |
| -section. However, note that this filter is applied to every **edge** |
280 |
| -and doesn't handle longer paths in the graph, unlike the previous |
281 |
| -section. |
282 |
| - |
283 |
| -Example: |
284 |
| - |
285 |
| -You find that there is a path from the `Hir` of `foo` to the type |
286 |
| -check of `bar` and you don't think there should be. You dump the |
287 |
| -dep-graph as described in the previous section and open `dep_graph.txt` |
288 |
| -to see something like: |
289 |
| - |
290 |
| - Hir(foo) -> Collect(bar) |
291 |
| - Collect(bar) -> TypeckTables(bar) |
292 |
| - |
293 |
| -That first edge looks suspicious to you. So you set |
294 |
| -`RUST_FORBID_DEP_GRAPH_EDGE` to `Hir&foo -> Collect&bar`, re-run, and |
295 |
| -then observe the backtrace. Voila, bug fixed! |
| 4 | +[rustc guide]: https://rust-lang-nursery.github.io/rustc-guide/query.html |