Commit 0407030

Auto merge of rust-lang#94261 - michaelwoerister:debuginfo-types-refactor, r=wesleywiser
debuginfo: Refactor debuginfo generation for types

This PR implements the refactoring of the `rustc_codegen_llvm::debuginfo::metadata` module as described in MCP rust-lang/compiler-team#482.

In particular it
- changes names to use `di_node` instead of `metadata`
- uniformly names all functions that build new debuginfo nodes `build_xyz_di_node`
- renames `CrateDebugContext` to `CodegenUnitDebugContext` (which is more accurate)
- removes outdated parts from `compiler/rustc_codegen_llvm/src/debuginfo/doc.md`
- moves `TypeMap` and functions that work directly with it to a new `type_map` module
- moves enum related builder functions to a new `enums` module
- splits enum debuginfo building for the native and cpp-like cases, since they are mostly separate
- uses `SmallVec` instead of `Vec` in many places
- removes the old infrastructure for dealing with recursion cycles (`create_and_register_recursive_type_forward_declaration()`, `RecursiveTypeDescription`, `set_members_of_composite_type()`, `MemberDescription`, `MemberDescriptionFactory`, `prepare_xyz_metadata()`, etc)
- adds `type_map::build_type_with_children()` as a replacement for dealing with recursion cycles
- adds many (doc-)comments explaining what's going on
- changes cpp-like naming for C-style enums so they don't get an `enum$<...>` name (because the NatVis visualizer does not apply to them)
- fixes detection of what is a C-style enum because some enums were classified as C-style even though they have fields
- changes cpp-like naming for generator enums so that NatVis works for them
- changes the position of the discriminant debuginfo node so it is consistently nested inside the top-level union instead of, sometimes, next to it

The following could be done in subsequent PRs:
- add caching for `closure_saved_names_of_captured_variables`
- add caching for `generator_layout_and_saved_local_names`
- fix inconsistent handling of what is considered a C-style enum with respect to debuginfo
- rename `metadata` module to `types`
- move common generator fields to front instead of appending them

This PR is based on rust-lang#93644 which is not merged yet.

Right now, the changes are all done in one big commit. They could be split into smaller commits but hopefully the list of changes above makes it tractable to review them as a single commit too.

For now: r? `@ghost` (let's see if this affects compile times)
2 parents 3ba1ebe + aa2408a commit 0407030
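
The commit message names `type_map::build_type_with_children()` as the replacement for the old recursion-cycle machinery, and the updated `doc.md` in this commit describes the idea as inserting a stub into the cache before describing a type's members. The following is a minimal sketch of that stub-first idea only. It is not the rustc implementation: `DiNode`, `TypeMap`, `describe`, `describe_list`, and the name-keyed `defs` table are all hypothetical names made up for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a debuginfo node; not a rustc or LLVM type.
#[derive(Debug)]
struct DiNode {
    name: String,
    /// Indices of member nodes in `TypeMap::nodes`.
    members: Vec<usize>,
}

/// Hypothetical cache mapping a type name to its node index.
struct TypeMap {
    nodes: Vec<DiNode>,
    cache: HashMap<&'static str, usize>,
}

impl TypeMap {
    fn new() -> Self {
        TypeMap { nodes: Vec::new(), cache: HashMap::new() }
    }

    /// Describe `name`, whose member type names are looked up in `defs`.
    /// Types missing from `defs` are treated as having no members.
    fn describe(
        &mut self,
        name: &'static str,
        defs: &HashMap<&'static str, Vec<&'static str>>,
    ) -> usize {
        // A recursive reference hits the cache and is not described anew.
        if let Some(&id) = self.cache.get(name) {
            return id;
        }

        // Register a member-less stub *before* visiting the members.
        let id = self.nodes.len();
        self.nodes.push(DiNode { name: name.to_string(), members: Vec::new() });
        self.cache.insert(name, id);

        // Describe the members; a cycle back to `name` finds the stub.
        let member_names: Vec<&'static str> =
            defs.get(name).cloned().unwrap_or_default();
        let member_ids: Vec<usize> =
            member_names.into_iter().map(|m| self.describe(m, defs)).collect();

        // Complete the stub now that all members exist.
        self.nodes[id].members = member_ids;
        id
    }
}

/// Describes a self-referential `List` type with members `i32` and `List`.
fn describe_list() -> (TypeMap, usize) {
    let mut defs = HashMap::new();
    defs.insert("List", vec!["i32", "List"]);
    let mut map = TypeMap::new();
    let id = map.describe("List", &defs);
    (map, id)
}
```

Calling `describe_list()` terminates even though `List` refers to itself, because the second visit to `List` returns the stub's index from the cache instead of recursing again.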

24 files changed: +2452 -1878 lines changed

compiler/rustc_codegen_gcc/src/debuginfo.rs

+1 -1
@@ -31,7 +31,7 @@ impl<'a, 'gcc, 'tcx> DebugInfoBuilderMethods for Builder<'a, 'gcc, 'tcx> {
 }
 
 impl<'gcc, 'tcx> DebugInfoMethods<'tcx> for CodegenCx<'gcc, 'tcx> {
-    fn create_vtable_metadata(&self, _ty: Ty<'tcx>, _trait_ref: Option<PolyExistentialTraitRef<'tcx>>, _vtable: Self::Value) {
+    fn create_vtable_debuginfo(&self, _ty: Ty<'tcx>, _trait_ref: Option<PolyExistentialTraitRef<'tcx>>, _vtable: Self::Value) {
         // TODO(antoyo)
     }
 

compiler/rustc_codegen_llvm/src/allocator.rs

+2 -2
@@ -140,8 +140,8 @@ pub(crate) unsafe fn codegen(
     llvm::LLVMDisposeBuilder(llbuilder);
 
     if tcx.sess.opts.debuginfo != DebugInfo::None {
-        let dbg_cx = debuginfo::CrateDebugContext::new(llmod);
-        debuginfo::metadata::compile_unit_metadata(tcx, module_name, &dbg_cx);
+        let dbg_cx = debuginfo::CodegenUnitDebugContext::new(llmod);
+        debuginfo::metadata::build_compile_unit_di_node(tcx, module_name, &dbg_cx);
         dbg_cx.finalize(tcx.sess);
     }
 }

compiler/rustc_codegen_llvm/src/consts.rs

+1 -1
@@ -428,7 +428,7 @@ impl<'ll> StaticMethods for CodegenCx<'ll, '_> {
                 llvm::LLVMSetGlobalConstant(g, llvm::True);
             }
 
-            debuginfo::create_global_var_metadata(self, def_id, g);
+            debuginfo::build_global_var_di_node(self, def_id, g);
 
             if attrs.flags.contains(CodegenFnAttrFlags::THREAD_LOCAL) {
                 llvm::set_thread_local_mode(g, self.tls_model);

compiler/rustc_codegen_llvm/src/context.rs

+7 -3
@@ -95,7 +95,7 @@ pub struct CodegenCx<'ll, 'tcx> {
     pub isize_ty: &'ll Type,
 
     pub coverage_cx: Option<coverageinfo::CrateCoverageContext<'ll, 'tcx>>,
-    pub dbg_cx: Option<debuginfo::CrateDebugContext<'ll, 'tcx>>,
+    pub dbg_cx: Option<debuginfo::CodegenUnitDebugContext<'ll, 'tcx>>,
 
     eh_personality: Cell<Option<&'ll Value>>,
     eh_catch_typeinfo: Cell<Option<&'ll Value>>,
@@ -396,8 +396,12 @@ impl<'ll, 'tcx> CodegenCx<'ll, 'tcx> {
         };
 
         let dbg_cx = if tcx.sess.opts.debuginfo != DebugInfo::None {
-            let dctx = debuginfo::CrateDebugContext::new(llmod);
-            debuginfo::metadata::compile_unit_metadata(tcx, codegen_unit.name().as_str(), &dctx);
+            let dctx = debuginfo::CodegenUnitDebugContext::new(llmod);
+            debuginfo::metadata::build_compile_unit_di_node(
+                tcx,
+                codegen_unit.name().as_str(),
+                &dctx,
+            );
             Some(dctx)
         } else {
             None

compiler/rustc_codegen_llvm/src/debuginfo/doc.md

+4 -53
@@ -34,7 +34,7 @@ The function will take care of probing the cache for an existing node for
 that exact file path.
 
 All private state used by the module is stored within either the
-CrateDebugContext struct (owned by the CodegenCx) or the
+CodegenUnitDebugContext struct (owned by the CodegenCx) or the
 FunctionDebugContext (owned by the FunctionCx).
 
 This file consists of three conceptual sections:
@@ -72,21 +72,16 @@ describe(t = List)
 ...
 ```
 
-To break cycles like these, we use "forward declarations". That is, when
+To break cycles like these, we use "stubs". That is, when
 the algorithm encounters a possibly recursive type (any struct or enum), it
 immediately creates a type description node and inserts it into the cache
 *before* describing the members of the type. This type description is just
 a stub (as type members are not described and added to it yet) but it
 allows the algorithm to already refer to the type. After the stub is
 inserted into the cache, the algorithm continues as before. If it now
 encounters a recursive reference, it will hit the cache and does not try to
-describe the type anew.
-
-This behavior is encapsulated in the 'RecursiveTypeDescription' enum,
-which represents a kind of continuation, storing all state needed to
-continue traversal at the type members after the type has been registered
-with the cache. (This implementation approach might be a tad over-
-engineered and may change in the future)
+describe the type anew. This behavior is encapsulated in the
+`type_map::build_type_with_children()` function.
 
 
 ## Source Locations and Line Information
## Source Locations and Line Information
@@ -134,47 +129,3 @@ detection. The `create_argument_metadata()` and related functions take care
134129
of linking the `llvm.dbg.declare` instructions to the correct source
135130
locations even while source location emission is still disabled, so there
136131
is no need to do anything special with source location handling here.
137-
138-
## Unique Type Identification
139-
140-
In order for link-time optimization to work properly, LLVM needs a unique
141-
type identifier that tells it across compilation units which types are the
142-
same as others. This type identifier is created by
143-
`TypeMap::get_unique_type_id_of_type()` using the following algorithm:
144-
145-
1. Primitive types have their name as ID
146-
147-
2. Structs, enums and traits have a multipart identifier
148-
149-
1. The first part is the SVH (strict version hash) of the crate they
150-
were originally defined in
151-
152-
2. The second part is the ast::NodeId of the definition in their
153-
original crate
154-
155-
3. The final part is a concatenation of the type IDs of their concrete
156-
type arguments if they are generic types.
157-
158-
3. Tuple-, pointer-, and function types are structurally identified, which
159-
means that they are equivalent if their component types are equivalent
160-
(i.e., `(i32, i32)` is the same regardless in which crate it is used).
161-
162-
This algorithm also provides a stable ID for types that are defined in one
163-
crate but instantiated from metadata within another crate. We just have to
164-
take care to always map crate and `NodeId`s back to the original crate
165-
context.
166-
167-
As a side-effect these unique type IDs also help to solve a problem arising
168-
from lifetime parameters. Since lifetime parameters are completely omitted
169-
in debuginfo, more than one `Ty` instance may map to the same debuginfo
170-
type metadata, that is, some struct `Struct<'a>` may have N instantiations
171-
with different concrete substitutions for `'a`, and thus there will be N
172-
`Ty` instances for the type `Struct<'a>` even though it is not generic
173-
otherwise. Unfortunately this means that we cannot use `ty::type_id()` as
174-
cheap identifier for type metadata -- we have done this in the past, but it
175-
led to unnecessary metadata duplication in the best case and LLVM
176-
assertions in the worst. However, the unique type ID as described above
177-
*can* be used as identifier. Since it is comparatively expensive to
178-
construct, though, `ty::type_id()` is still used additionally as an
179-
optimization for cases where the exact same type has been seen before
180-
(which is most of the time).
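
For readers unfamiliar with the scheme that the removed "Unique Type Identification" section described, here is a rough sketch of its three rules: primitive types use their name, nominal types use a multipart ID, and tuples and pointers are identified structurally with lifetimes omitted. It only illustrates the deleted prose and is not how rustc computes unique type IDs. The `Ty` enum, the `unique_type_id` function, the `def_index` field (standing in for the `ast::NodeId` the text mentions), and the string format are all assumptions made for this example.

```rust
/// Hypothetical, heavily simplified type model for illustration only.
#[allow(dead_code)] // `lifetime` is deliberately never read (lifetimes are omitted)
enum Ty {
    Primitive(&'static str),
    /// Struct, enum, or trait: defining-crate hash, definition index,
    /// and concrete type arguments.
    Adt { crate_hash: u64, def_index: u32, type_args: Vec<Ty> },
    Tuple(Vec<Ty>),
    Ref { lifetime: &'static str, pointee: Box<Ty> },
}

/// Builds an identifier string following the three rules from the removed text.
fn unique_type_id(ty: &Ty) -> String {
    match ty {
        // Rule 1: primitive types have their name as ID.
        Ty::Primitive(name) => name.to_string(),
        // Rule 2: multipart ID of crate hash, definition, and type arguments.
        Ty::Adt { crate_hash, def_index, type_args } => {
            let args: Vec<String> = type_args.iter().map(unique_type_id).collect();
            format!("{:016x}/{}<{}>", crate_hash, def_index, args.join(","))
        }
        // Rule 3: tuples are identified structurally by their components.
        Ty::Tuple(elems) => {
            let parts: Vec<String> = elems.iter().map(unique_type_id).collect();
            format!("({})", parts.join(","))
        }
        // Rule 3 also covers pointers; the lifetime does not enter the ID.
        Ty::Ref { pointee, .. } => format!("&{}", unique_type_id(pointee)),
    }
}
```

With this sketch, two `Ty::Ref` values that differ only in `lifetime` produce the same ID, which mirrors the point in the removed text that several `Ty` instances can map to one debuginfo type.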
