git.proxmox.com Git - rustc.git/blob - src/librustc_trans/debuginfo/doc.rs
Imported Upstream version 1.9.0+dfsg1
// Copyright 2015 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! # Debug Info Module
//!
//! This module serves the purpose of generating debug symbols. We use LLVM's
//! [source level debugging](http://llvm.org/docs/SourceLevelDebugging.html)
//! features for generating the debug information. The general principle is
//! this:
//!
//! Given the right metadata in the LLVM IR, the LLVM code generator is able to
//! create DWARF debug symbols for the given code. The
//! [metadata](http://llvm.org/docs/LangRef.html#metadata-type) is structured
//! much like DWARF *debugging information entries* (DIE), representing type
//! information such as datatype layout, function signatures, block layout,
//! variable location and scope information, etc. It is the purpose of this
//! module to generate correct metadata and insert it into the LLVM IR.
//!
//! As the exact format of metadata trees may change between different LLVM
//! versions, we now use LLVM's
//! [DIBuilder](http://llvm.org/docs/doxygen/html/classllvm_1_1DIBuilder.html)
//! to create metadata where possible. This will hopefully ease the adaptation
//! of this module to future LLVM versions.
//!
//! The public API of the module is a set of functions that will insert the
//! correct metadata into the LLVM IR when called with the right parameters.
//! The module is thus driven from an outside client with functions like
//! `debuginfo::create_local_var_metadata(bcx: block, local: &ast::local)`.
//!
//! Internally the module will try to reuse already created metadata by
//! utilizing a cache. The way to get a shared metadata node when needed is
//! thus to just call the corresponding function in this module:
//!
//!     let file_metadata = file_metadata(crate_context, path);
//!
//! The function will take care of probing the cache for an existing node for
//! that exact file path.
//!
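/*!
The cache lookup described above can be sketched roughly as follows. This is
a minimal illustration under stated assumptions, not the code of this module:
`MetadataNode`, `DebugContext` and `nodes_created` are hypothetical stand-ins,
and the real `file_metadata` takes a crate context and produces LLVM metadata.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a handle to an LLVM metadata node.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MetadataNode(pub usize);

/// Hypothetical, much-reduced stand-in for the per-crate debug context.
#[derive(Default)]
pub struct DebugContext {
    created_files: HashMap<String, MetadataNode>,
    nodes_created: usize,
}

/// Returns the node for `path`, creating it only on the first request.
pub fn file_metadata(cx: &mut DebugContext, path: &str) -> MetadataNode {
    if let Some(&node) = cx.created_files.get(path) {
        // Cache hit: hand out the shared node.
        return node;
    }
    // Cache miss: "create" a new node and remember it for later calls.
    let node = MetadataNode(cx.nodes_created);
    cx.nodes_created += 1;
    cx.created_files.insert(path.to_string(), node);
    node
}

/// Number of nodes created so far (exposed for illustration only).
pub fn nodes_created(cx: &DebugContext) -> usize {
    cx.nodes_created
}
```

Under these assumptions, asking twice for the same path yields the same node,
and a second node is only created once a different path is requested.
*/
//!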
//! All private state used by the module is stored within either the
//! `CrateDebugContext` struct (owned by the `CrateContext`) or the
//! `FunctionDebugContext` (owned by the `FunctionContext`).
//!
//! This file consists of three conceptual sections:
//! 1. The public interface of the module
//! 2. Module-internal metadata creation functions
//! 3. Minor utility functions
//!
//!
//! ## Recursive Types
//!
//! Some kinds of types, such as structs and enums, can be recursive. That
//! means that the type definition of some type X refers to some other type
//! which in turn (transitively) refers to X. This introduces cycles into the
//! type referral graph. A naive algorithm doing an on-demand, depth-first
//! traversal of this graph when describing types can get trapped in an endless
//! loop when it reaches such a cycle.
//!
//! For example, the following simple type for a singly-linked list...
//!
//! ```
//! struct List {
//!     value: i32,
//!     tail: Option<Box<List>>,
//! }
//! ```
//!
//! will generate the following callstack with a naive DFS algorithm:
//!
//! ```
//! describe(t = List)
//!   describe(t = i32)
//!   describe(t = Option<Box<List>>)
//!     describe(t = Box<List>)
//!       describe(t = List) // at the beginning again...
//!       ...
//! ```
//!
//! To break cycles like these, we use "forward declarations". That is, when
//! the algorithm encounters a possibly recursive type (any struct or enum), it
//! immediately creates a type description node and inserts it into the cache
//! *before* describing the members of the type. This type description is just
//! a stub (as type members are not described and added to it yet) but it
//! allows the algorithm to already refer to the type. After the stub is
//! inserted into the cache, the algorithm continues as before. If it now
//! encounters a recursive reference, it will hit the cache and not try to
//! describe the type anew.
//!
//! This behaviour is encapsulated in the `RecursiveTypeDescription` enum,
//! which represents a kind of continuation, storing all state needed to
//! continue traversal at the type members after the type has been registered
//! with the cache. (This implementation approach might be a tad
//! over-engineered and may change in the future.)
//!
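/*!
The stub-first idea can be sketched as follows. This is only an illustration
of the technique, not the `RecursiveTypeDescription` machinery itself: the
`TypeDef`, `TypeNode` and `TypeCache` types and this `describe` function are
hypothetical, and types are identified by plain names for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical miniature type model: a type is either primitive or a
/// struct-like type whose fields name other types.
pub enum TypeDef {
    Primitive,
    Struct(Vec<&'static str>),
}

/// Hypothetical description node. `members` stays empty while the node is
/// still a stub.
#[derive(Debug, Default)]
pub struct TypeNode {
    pub members: Vec<usize>,
}

#[derive(Default)]
pub struct TypeCache {
    ids: HashMap<&'static str, usize>, // type name -> index into `nodes`
    pub nodes: Vec<TypeNode>,
}

/// Describes `name` and returns the index of its node in the cache.
pub fn describe(
    name: &'static str,
    defs: &HashMap<&'static str, TypeDef>,
    cache: &mut TypeCache,
) -> usize {
    // A recursive reference ends here instead of descending again.
    if let Some(&id) = cache.ids.get(name) {
        return id;
    }
    // Register the stub *before* looking at any member.
    let id = cache.nodes.len();
    cache.nodes.push(TypeNode::default());
    cache.ids.insert(name, id);

    // Only now describe the members and complete the stub.
    if let Some(TypeDef::Struct(fields)) = defs.get(name) {
        let mut members = Vec::new();
        for &field in fields {
            members.push(describe(field, defs, cache));
        }
        cache.nodes[id].members = members;
    }
    id
}
```

With definitions for `List`, `Option<Box<List>>`, `Box<List>` and `i32`, the
traversal terminates: the inner reference to `List` hits the cache and gets
the index of the stub that was registered at the start.
*/
//!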
//!
//! ## Source Locations and Line Information
//!
//! In addition to data type descriptions, the debugging information must also
//! allow mapping machine code locations back to source code locations in
//! order to be useful. This functionality is also handled in this module. The
//! following functions control source mappings:
//!
//! + `set_source_location()`
//! + `clear_source_location()`
//! + `start_emitting_source_locations()`
//!
//! `set_source_location()` sets the current source location. All IR
//! instructions created after a call to this function will be linked to the
//! given source location, until another location is specified with
//! `set_source_location()` or the source location is cleared with
//! `clear_source_location()`. In the latter case, subsequent IR instructions
//! will not be linked to any source location. As you can see, this is a
//! stateful API (mimicking the one in LLVM), so be careful with source
//! locations set by previous calls. It's probably best to not rely on any
//! specific state being present at a given point in code.
//!
//! One topic that deserves some extra attention is *function prologues*. At
//! the beginning of a function's machine code there are typically a few
//! instructions for loading argument values into allocas and checking if
//! there's enough stack space for the function to execute. This *prologue* is
//! not visible in the source code and LLVM puts a special PROLOGUE END marker
//! into the line table at the first non-prologue instruction of the function.
//! In order to find out where the prologue ends, LLVM looks for the first
//! instruction in the function body that is linked to a source location. So,
//! when generating prologue instructions we have to make sure that we don't
//! emit source location information until the 'real' function body begins. For
//! this reason, source location emission is disabled by default for any new
//! function being translated and is only activated after a call to the third
//! function from the list above, `start_emitting_source_locations()`. This
//! function should be called right before translation of the function's
//! top-level block begins.
//!
//! There is one exception to the above rule: `llvm.dbg.declare` instructions
//! must be linked to the source location of the variable being declared. For
//! function parameters these `llvm.dbg.declare` instructions typically occur
//! in the middle of the prologue; however, they are ignored by LLVM's prologue
//! detection. The `create_argument_metadata()` and related functions take care
//! of linking the `llvm.dbg.declare` instructions to the correct source
//! locations even while source location emission is still disabled, so there
//! is no need to do anything special with source location handling here.
//!
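/*!
The stateful behaviour of the three functions can be modelled with a small
state machine. This is a sketch under stated assumptions rather than the real
API: `SourceLoc`, `FnDebugState` and `location_for_next_instruction` are
hypothetical, the functions are shown as methods for brevity, and treating a
location set during the prologue as "no location" is an assumption of this
sketch.

```rust
/// Hypothetical source position (not the compiler's span type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SourceLoc {
    pub line: u32,
    pub col: u32,
}

/// Hypothetical per-function state. Emission starts out disabled so that
/// prologue instructions are never linked to a source location.
#[derive(Default)]
pub struct FnDebugState {
    emitting: bool,
    current: Option<SourceLoc>,
}

impl FnDebugState {
    /// Sets the location for subsequently created instructions. While
    /// emission is disabled the request results in "no location".
    pub fn set_source_location(&mut self, loc: SourceLoc) {
        self.current = if self.emitting { Some(loc) } else { None };
    }

    /// Subsequent instructions are not linked to any location.
    pub fn clear_source_location(&mut self) {
        self.current = None;
    }

    /// Called once the prologue is done and the real body begins.
    pub fn start_emitting_source_locations(&mut self) {
        self.emitting = true;
    }

    /// Location that an instruction created right now would be linked to.
    pub fn location_for_next_instruction(&self) -> Option<SourceLoc> {
        self.current
    }
}
```

In this model a location set before `start_emitting_source_locations()` has
no effect, a location set afterwards sticks until it is replaced, and
`clear_source_location()` returns to the "no location" state.
*/
//!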
//! ## Unique Type Identification
//!
//! In order for link-time optimization to work properly, LLVM needs a unique
//! type identifier that tells it across compilation units which types are the
//! same as others. This type identifier is created by
//! `TypeMap::get_unique_type_id_of_type()` using the following algorithm:
//!
//! (1) Primitive types have their name as ID
//! (2) Structs, enums and traits have a multipart identifier
//!
//!     (1) The first part is the SVH (strict version hash) of the crate they
//!          were originally defined in
//!
//!     (2) The second part is the ast::NodeId of the definition in their
//!          original crate
//!
//!     (3) The final part is a concatenation of the type IDs of their concrete
//!          type arguments if they are generic types.
//!
//! (3) Tuple, pointer and function types are structurally identified, which
//!     means that they are equivalent if their component types are equivalent
//!     (i.e. (i32, i32) is the same regardless of which crate it is used in).
//!
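/*!
The three rules can be illustrated with a simplified model. Everything here
is hypothetical: the `Ty` enum, its fields and the textual ID format are
invented for the sketch and do not reproduce the output of
`TypeMap::get_unique_type_id_of_type()`. Function types are left out for
brevity.

```rust
/// Hypothetical, simplified type model.
pub enum Ty {
    /// Rule (1): identified by name.
    Primitive(&'static str),
    /// Rule (2): crate hash, definition ID and concrete type arguments.
    Nominal {
        crate_hash: &'static str,
        node_id: u32,
        type_args: Vec<Ty>,
    },
    /// Rule (3): identified structurally.
    Tuple(Vec<Ty>),
    /// Rule (3): identified structurally.
    Pointer(Box<Ty>),
}

/// Builds an ID string following the three rules (format is made up).
pub fn unique_type_id(ty: &Ty) -> String {
    match ty {
        Ty::Primitive(name) => name.to_string(),
        Ty::Nominal { crate_hash, node_id, type_args } => {
            let args: Vec<String> =
                type_args.iter().map(unique_type_id).collect();
            format!("{}/{}<{}>", crate_hash, node_id, args.join(","))
        }
        Ty::Tuple(elems) => {
            let parts: Vec<String> =
                elems.iter().map(unique_type_id).collect();
            format!("({})", parts.join(","))
        }
        Ty::Pointer(inner) => format!("*{}", unique_type_id(inner)),
    }
}
```

Two separately constructed `(i32, i32)` tuples get the same ID in this model,
while a nominal type is only equal to another one with the same crate hash,
definition ID and type arguments.
*/
//!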
//! This algorithm also provides a stable ID for types that are defined in one
//! crate but instantiated from metadata within another crate. We just have to
//! take care to always map crate and node IDs back to the original crate
//! context.
//!
//! As a side-effect these unique type IDs also help to solve a problem arising
//! from lifetime parameters. Since lifetime parameters are completely omitted
//! in debuginfo, more than one `Ty` instance may map to the same debuginfo
//! type metadata. That is, some struct `Struct<'a>` may have N instantiations
//! with different concrete substitutions for `'a`, and thus there will be N
//! `Ty` instances for the type `Struct<'a>` even though it is not generic
//! otherwise. Unfortunately this means that we cannot use `ty::type_id()` as
//! a cheap identifier for type metadata. We have done this in the past, but
//! it led to unnecessary metadata duplication in the best case and LLVM
//! assertions in the worst. However, the unique type ID as described above
//! *can* be used as an identifier. Since it is comparatively expensive to
//! construct, though, `ty::type_id()` is still used additionally as an
//! optimization for cases where the exact same type has been seen before
//! (which is most of the time).
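//!
/*!
The resulting two-level lookup can be sketched like this. It is a minimal
model under stated assumptions: `TyKey` stands in for the cheap per-`Ty`
identifier, and `Metadata`, `TypeMap` (as defined here) and `metadata_for`
are hypothetical names chosen for the illustration.

```rust
use std::collections::HashMap;

/// Hypothetical cheap per-`Ty` key.
pub type TyKey = u32;

/// Hypothetical stand-in for a type metadata node.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Metadata(pub usize);

#[derive(Default)]
pub struct TypeMap {
    by_ty: HashMap<TyKey, Metadata>,         // fast path
    by_unique_id: HashMap<String, Metadata>, // authoritative mapping
    pub created: usize,
}

impl TypeMap {
    /// Looks up metadata for `ty`. The expensive `compute_unique_id`
    /// closure only runs if this exact `ty` has not been seen before.
    pub fn metadata_for<F>(&mut self, ty: TyKey, compute_unique_id: F) -> Metadata
    where
        F: FnOnce() -> String,
    {
        if let Some(&md) = self.by_ty.get(&ty) {
            return md;
        }
        let unique_id = compute_unique_id();
        let existing = self.by_unique_id.get(&unique_id).copied();
        let md = match existing {
            // A different `Ty` with the same unique ID was seen before.
            Some(md) => md,
            None => {
                let md = Metadata(self.created);
                self.created += 1;
                self.by_unique_id.insert(unique_id, md);
                md
            }
        };
        self.by_ty.insert(ty, md);
        md
    }
}
```

Two keys that resolve to the same unique ID, as two instantiations of
`Struct<'a>` would, share one metadata node here, and a repeated lookup of
the same key never constructs the unique ID again.
*/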