New upstream version 1.63.0+dfsg1
# Rustdoc internals

<!-- toc -->

This page describes rustdoc's passes and modes. For an overview of rustdoc,
see the ["Rustdoc overview" chapter](./rustdoc.md).

## From crate to clean

In `core.rs` are two central items: the `DocContext` struct, and the `run_core`
function. The latter is where rustdoc calls out to rustc to compile a crate to
the point where rustdoc can take over. The former is a state container used
when crawling through a crate to gather its documentation.

The main process of crate crawling is done in `clean/mod.rs` through several
implementations of the `Clean` trait defined within. This is a conversion
trait, which defines one method:

```rust,ignore
pub trait Clean<T> {
    fn clean(&self, cx: &DocContext) -> T;
}
```

`clean/mod.rs` also defines the types for the "cleaned" AST used later on to
render documentation pages. Each usually accompanies an implementation of
`Clean` that takes some AST or HIR type from rustc and converts it into the
appropriate "cleaned" type. "Big" items like modules or associated items may
have some extra processing in their `Clean` implementations, but for the most
part these impls are straightforward conversions. The "entry point" to this
module is the `impl Clean<Crate> for visit_ast::RustdocVisitor`, which is
called by `run_core` above.
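
As a rough illustration of what such a "straightforward conversion" looks like,
here is a minimal, self-contained sketch. Everything in it is a hypothetical
stand-in: `ToyContext`, `RawFn`, and `CleanFn` are not rustdoc types, and the
real impls work on rustc's HIR and a `DocContext` rather than on plain structs.

```rust
// Hypothetical stand-ins for illustration; none of these types exist in
// rustdoc under these names.

/// Stand-in for the state container (`DocContext` in rustdoc).
pub struct ToyContext {
    pub crate_name: String,
}

/// Stand-in for a compiler-side (HIR-like) function item.
pub struct RawFn {
    pub name: String,
    pub doc_lines: Vec<String>,
    pub is_public: bool,
}

/// Stand-in for the "cleaned" item that the renderer would consume.
#[derive(Debug, PartialEq)]
pub struct CleanFn {
    pub path: String,
    pub docs: String,
    pub is_public: bool,
}

/// Same shape as the trait quoted above, with the toy context swapped in.
pub trait Clean<T> {
    fn clean(&self, cx: &ToyContext) -> T;
}

impl Clean<CleanFn> for RawFn {
    fn clean(&self, cx: &ToyContext) -> CleanFn {
        CleanFn {
            // The context supplies information the raw item does not carry.
            path: format!("{}::{}", cx.crate_name, self.name),
            docs: self.doc_lines.join("\n"),
            is_public: self.is_public,
        }
    }
}

/// Cleaning a list of items is just cleaning each element in turn.
impl<T, U: Clean<T>> Clean<Vec<T>> for Vec<U> {
    fn clean(&self, cx: &ToyContext) -> Vec<T> {
        self.iter().map(|item| item.clean(cx)).collect()
    }
}
```

The point of the sketch is the shape of the conversion: the input type, the
output type, and a context that is only borrowed while cleaning.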

You see, I actually lied a little earlier: There's another AST transformation
that happens before the events in `clean/mod.rs`. In `visit_ast.rs` is the
type `RustdocVisitor`, which *actually* crawls a `rustc_hir::Crate` to get the first
intermediate representation, defined in `doctree.rs`. This pass is mainly to
get a few intermediate wrappers around the HIR types and to process visibility
and inlining. This is where `#[doc(inline)]`, `#[doc(no_inline)]`, and
`#[doc(hidden)]` are processed, as well as the logic for whether a `pub use`
should get the full page or a "Reexport" line in the module page.
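
For reference, this is roughly what those attributes look like from the side
of the crate being documented. The module and item names are made up for the
example; only the three `#[doc(...)]` attributes come from the text above.

```rust
pub mod shapes {
    /// A circle.
    pub struct Circle {
        pub radius: f64,
    }

    /// A square.
    pub struct Square {
        pub side: f64,
    }

    /// Public so that other code can name it, but kept out of the rendered
    /// documentation.
    #[doc(hidden)]
    pub fn area_hook(c: &Circle) -> f64 {
        std::f64::consts::PI * c.radius * c.radius
    }
}

// Ask for `Circle` to be documented in full at the re-export site.
#[doc(inline)]
pub use shapes::Circle;

// Ask for `Square` to appear only as a line in the re-exports list.
#[doc(no_inline)]
pub use shapes::Square;
```

None of these attributes change how the code compiles; they only influence
which pages rustdoc produces and where each item is shown.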

The other major thing that happens in `clean/mod.rs` is the collection of doc
comments and `#[doc=""]` attributes into a separate field of the `Attributes`
struct, present on anything that gets hand-written documentation. This makes it
easier to collect this documentation later in the process.

The primary output of this process is a `clean::Crate` with a tree of `Item`s
that describe the publicly documentable items in the target crate.

### Hot potato

Before moving on to the next major step, a few important "passes" occur over
the documentation. These do things like combine the separate "attributes" into
a single string to make the document easier on the markdown parser,
or drop items that are not public or deliberately hidden with `#[doc(hidden)]`.
These are all implemented in the `passes/` directory, one file per pass.
By default, all of these passes are run on a crate, but the ones
regarding dropping private/hidden items can be bypassed by passing
`--document-private-items` to rustdoc. Note that unlike the previous set of AST
transformations, the passes are run on the _cleaned_ crate.

(Strictly speaking, you can fine-tune the passes run and even add your own, but
[we're trying to deprecate that][44136]. If you need finer-grained control over
these passes, please let us know!)

[44136]: https://github.com/rust-lang/rust/issues/44136

Here is the list of passes as of <!-- date: 2022-05 --> May 2022:

- `calculate-doc-coverage` calculates information used for the `--show-coverage`
  flag.

- `check-bare-urls` detects links that are not linkified, e.g., in Markdown such as
  `Go to https://example.com/.` It suggests wrapping the link with angle brackets:
  `Go to <https://example.com/>.` to linkify it. This is the code behind the <!--
  date: 2022-05 --> `rustdoc::bare_urls` lint.

- `check-code-block-syntax` validates syntax inside Rust code blocks
  (<code>```rust</code>).

- `check-doc-test-visibility` runs doctest visibility–related lints.

- `check-invalid-html-tags` detects invalid HTML (like an unclosed `<span>`)
  in doc comments.

- `collect-intra-doc-links` resolves [intra-doc links](https://doc.rust-lang.org/nightly/rustdoc/write-documentation/linking-to-items-by-name.html).

- `collect-trait-impls` collects trait impls for each item in the crate. For
  example, if we define a struct that implements a trait, this pass will note
  that the struct implements that trait.

- `propagate-doc-cfg` propagates `#[doc(cfg(...))]` to child items.

- `strip-priv-imports` strips all private import statements (`use`, `extern
  crate`) from a crate. This is necessary because rustdoc will handle *public*
  imports by either inlining the item's documentation to the module or creating
  a "Reexports" section with the import in it. The pass ensures that all of
  these imports are actually relevant to documentation.

- `strip-hidden` and `strip-private` strip all `doc(hidden)` and private items
  from the output. `strip-private` implies `strip-priv-imports`. Basically, the
  goal is to remove items that are not relevant for public documentation.

There is also a `stripper` module in `passes/`, but it is a collection of
utility functions for the `strip-*` passes and is not a pass itself.
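
To make the "hot potato" idea concrete, here is a toy model of passes running
over a cleaned crate: each pass takes the item tree and hands a transformed
tree to the next one. `ToyItem`, `Pass`, `run_passes`, and the three pass
functions are hypothetical names for this sketch; the real passes operate on
`clean::Crate` and are considerably more involved.

```rust
/// Toy stand-in for a cleaned item; rustdoc's real item type is far richer.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyItem {
    pub name: String,
    pub is_public: bool,
    pub is_hidden: bool,
    pub doc_fragments: Vec<String>,
    pub children: Vec<ToyItem>,
}

/// A pass consumes the whole tree and returns the transformed tree.
pub type Pass = fn(Vec<ToyItem>) -> Vec<ToyItem>;

/// Keeps the items that satisfy `keep`, then recurses into the survivors.
fn retain_recursive(items: Vec<ToyItem>, keep: fn(&ToyItem) -> bool) -> Vec<ToyItem> {
    items
        .into_iter()
        .filter(|item| keep(item))
        .map(|mut item| {
            item.children = retain_recursive(std::mem::take(&mut item.children), keep);
            item
        })
        .collect()
}

/// Toy counterpart of `strip-hidden`.
pub fn strip_hidden(items: Vec<ToyItem>) -> Vec<ToyItem> {
    retain_recursive(items, |item| !item.is_hidden)
}

/// Toy counterpart of `strip-private`.
pub fn strip_private(items: Vec<ToyItem>) -> Vec<ToyItem> {
    retain_recursive(items, |item| item.is_public)
}

/// Joins the separate doc fragments of every item into a single string.
pub fn collapse_docs(items: Vec<ToyItem>) -> Vec<ToyItem> {
    items
        .into_iter()
        .map(|mut item| {
            if !item.doc_fragments.is_empty() {
                item.doc_fragments = vec![item.doc_fragments.join("\n")];
            }
            item.children = collapse_docs(std::mem::take(&mut item.children));
            item
        })
        .collect()
}

/// Runs the passes in order, feeding each one the previous pass's output.
pub fn run_passes(items: Vec<ToyItem>, passes: &[Pass]) -> Vec<ToyItem> {
    passes.iter().fold(items, |acc, pass| pass(acc))
}
```

In this model, documenting private items simply means leaving `strip_private`
out of the list handed to `run_passes`, which mirrors the effect the text
describes for `--document-private-items`.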

## From clean to crate

This is where the "second phase" in rustdoc begins. This phase primarily lives
in the `html/` folder, and it all starts with `run()` in `html/render.rs`. This
code is responsible for setting up the `Context`, `SharedContext`, and `Cache`
which are used during rendering, copying out the static files which live in
every rendered set of documentation (things like the fonts, CSS, and JavaScript
that live in `html/static/`), creating the search index, and printing out the
source code rendering, before beginning the process of rendering all the
documentation for the crate.

Several functions implemented directly on `Context` take the `clean::Crate` and
set up some state between rendering items or recursing on a module's child
items. From here the "page rendering" begins, via an enormous `write!()` call
in `html/layout.rs`. The parts that actually generate HTML from the items and
documentation occur within a series of `std::fmt::Display` implementations and
functions that pass around a `&mut std::fmt::Formatter`. The top-level
implementation that writes out the page body is the `impl<'a> fmt::Display for
Item<'a>` in `html/render.rs`, which switches out to one of several `item_*`
functions based on the kind of `Item` being rendered.
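
The pattern described here, a `Display` impl that dispatches to per-kind
helper functions sharing one `Formatter`, can be sketched in a few lines.
`PageItem`, `PageKind`, and the two `item_*` helpers below are invented for
the example, and the HTML they emit is not rustdoc's actual markup (it also
skips escaping, which real rendering code cannot do).

```rust
use std::fmt;

/// Hypothetical item kinds, standing in for the kinds rustdoc renders.
pub enum PageKind {
    Struct { fields: Vec<String> },
    Function { signature: String },
}

/// Hypothetical wrapper around an item that is about to be rendered.
pub struct PageItem {
    pub name: String,
    pub kind: PageKind,
}

impl fmt::Display for PageItem {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Dispatch on the item kind, passing the same formatter along.
        match &self.kind {
            PageKind::Struct { fields } => item_struct(f, &self.name, fields),
            PageKind::Function { signature } => item_function(f, &self.name, signature),
        }
    }
}

fn item_struct(f: &mut fmt::Formatter<'_>, name: &str, fields: &[String]) -> fmt::Result {
    write!(f, "<h1>Struct {}</h1><ul>", name)?;
    for field in fields {
        write!(f, "<li>{}</li>", field)?;
    }
    write!(f, "</ul>")
}

fn item_function(f: &mut fmt::Formatter<'_>, name: &str, signature: &str) -> fmt::Result {
    write!(f, "<h1>Function {}</h1><pre>{}</pre>", name, signature)
}
```

Because everything goes through `Display`, a caller can embed an item in a
larger template with an ordinary `write!` or `format!` call.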

Depending on what kind of rendering code you're looking for, you'll probably
find it either in `html/render.rs` for major items like "what sections should I
print for a struct page" or `html/format.rs` for smaller component pieces like
"how should I print a where clause as part of some other item".

Whenever rustdoc comes across an item that should print hand-written
documentation alongside, it calls out to `html/markdown.rs` which interfaces
with the Markdown parser. This is exposed as a series of types that wrap a
string of Markdown, and implement `fmt::Display` to emit HTML text. It takes
special care to enable certain features like footnotes and tables and add
syntax highlighting to Rust code blocks (via `html/highlight.rs`) before
running the Markdown parser. There's also a function in here
(`find_testable_code`) that specifically scans for Rust code blocks so the
test-runner code can find all the doctests in the crate.
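
To give a feel for what "scanning for Rust code blocks" involves, here is a
deliberately simplified, line-based sketch. It is not how `find_testable_code`
works (that function goes through the Markdown parser); `find_rust_blocks` is
a hypothetical helper, and the rule that an untagged fence counts as Rust is
an assumption of this sketch.

```rust
/// Returns the contents of every fenced block that is tagged `rust`
/// (possibly with extra comma-separated tags) or that has no tag at all.
pub fn find_rust_blocks(markdown: &str) -> Vec<String> {
    let mut blocks = Vec::new();
    // `Some(lines)` while inside a Rust block, `None` otherwise.
    let mut current: Option<Vec<&str>> = None;
    // True while inside a fenced block in some other language.
    let mut in_other_block = false;

    for line in markdown.lines() {
        let trimmed = line.trim_start();
        // A fence line starts with at least three backticks.
        let ticks = trimmed.chars().take_while(|c| *c == '`').count();
        if ticks >= 3 {
            if let Some(lines) = current.take() {
                // Closing fence of a Rust block.
                blocks.push(lines.join("\n"));
            } else if in_other_block {
                // Closing fence of a non-Rust block.
                in_other_block = false;
            } else {
                // Opening fence: inspect the info string after the backticks.
                let info = trimmed[ticks..].trim();
                let is_rust =
                    info.is_empty() || info.split(',').any(|tag| tag.trim() == "rust");
                if is_rust {
                    current = Some(Vec::new());
                } else {
                    in_other_block = true;
                }
            }
        } else if let Some(lines) = current.as_mut() {
            lines.push(line);
        }
    }
    blocks
}
```

A parser-based approach is more robust than this, for example with nested or
indented fences, which is one reason to reuse the Markdown parser instead of
matching lines by hand.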

### From soup to nuts

(alternate title: ["An unbroken thread that stretches from those first `Cell`s
to us"][video])

[video]: https://www.youtube.com/watch?v=hOLAGYmUQV0

It's important to note that the AST cleaning can ask the compiler for
information (crucially, `DocContext` contains a `TyCtxt`), but page rendering
cannot. The `clean::Crate` created within `run_core` is passed outside the
compiler context before being handed to `html::render::run`. This means that a
lot of the "supplementary data" that isn't immediately available inside an
item's definition, like which trait is the `Deref` trait used by the language,
needs to be collected during cleaning, stored in the `DocContext`, and passed
along to the `SharedContext` during HTML rendering. This manifests as a bunch
of shared state, context variables, and `RefCell`s.
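
The hand-off described above can be pictured with a small model: data is
recorded through a shared reference while cleaning, then moved into a plain,
compiler-independent structure for rendering. `ToyDocContext`,
`ToySharedContext`, and their fields are hypothetical and only mimic the
general shape, not the actual contents of rustdoc's contexts.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Toy stand-in for the cleaning-time context. In rustdoc, the real
/// `DocContext` also holds a `TyCtxt`, so it cannot leave the compiler session.
pub struct ToyDocContext {
    /// Filled in while cleaning, through `&self`, hence the `RefCell`.
    pub deref_trait: RefCell<Option<String>>,
    /// Hypothetical side table: item id to fully qualified path.
    pub paths: RefCell<HashMap<u32, String>>,
}

/// Toy stand-in for the rendering-time context: owned data only.
pub struct ToySharedContext {
    pub deref_trait: Option<String>,
    pub paths: HashMap<u32, String>,
}

impl ToyDocContext {
    pub fn new() -> Self {
        ToyDocContext {
            deref_trait: RefCell::new(None),
            paths: RefCell::new(HashMap::new()),
        }
    }

    /// Called during cleaning; only needs a shared reference.
    pub fn record_path(&self, id: u32, path: &str) {
        self.paths.borrow_mut().insert(id, path.to_string());
    }

    /// Consumes the cleaning context and keeps only what rendering needs.
    pub fn into_shared(self) -> ToySharedContext {
        ToySharedContext {
            deref_trait: self.deref_trait.into_inner(),
            paths: self.paths.into_inner(),
        }
    }
}
```

Once `into_shared` has run, nothing in the result refers back to the compiler,
which is the property the rendering phase relies on.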

Also of note is that some items that come from "asking the compiler" don't go
directly into the `DocContext` - for example, when loading items from a foreign
crate, rustdoc will ask about trait implementations and generate new `Item`s
for the impls based on that information. This goes directly into the returned
`Crate` rather than roundabout through the `DocContext`. This way, these
implementations can be collected alongside the others, right before rendering
the HTML.

## Other tricks up its sleeve

All this describes the process for generating HTML documentation from a Rust
crate, but there are a couple of other major modes that rustdoc runs in. It can
also be run on a standalone Markdown file, or it can run doctests on Rust code
or standalone Markdown files. For the former, it shortcuts straight to
`html/markdown.rs`, optionally including a mode which inserts a Table of
Contents into the output HTML.

For the latter, rustdoc runs a similar partial-compilation to get relevant
documentation in `test.rs`, but instead of going through the full clean and
render process, it runs a much simpler crate walk to grab *just* the
hand-written documentation. Combined with the aforementioned
"`find_testable_code`" in `html/markdown.rs`, it builds up a collection of
tests to run before handing them off to the test runner. One notable
location in `test.rs` is the function `make_test`, which is where hand-written
doctests get transformed into something that can be executed.
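
As a very rough picture of that transformation, the sketch below wraps a
doctest body in a `fn main` unless it already defines one. `wrap_doctest` is a
hypothetical helper written for this page; the real `make_test` handles many
more cases (crate-level attributes, `extern crate` lines, and so on), and the
`#![allow(unused)]` header here is only loosely modelled on it.

```rust
/// Turns a doctest snippet into a complete program (toy version).
pub fn wrap_doctest(source: &str) -> String {
    // A snippet that already has its own `main` is left alone. This is a
    // plain substring check, so it can be fooled by comments or strings.
    if source.contains("fn main") {
        return source.to_string();
    }

    let mut out = String::from("#![allow(unused)]\nfn main() {\n");
    for line in source.lines() {
        // Indent the body so the generated program stays readable.
        out.push_str("    ");
        out.push_str(line);
        out.push('\n');
    }
    out.push_str("}\n");
    out
}
```

For example, the two-line snippet `let x = 1;` and `assert_eq!(x, 1);` comes
out as a program whose `main` contains exactly those two statements.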

Some extra reading about `make_test` can be found
[here](https://quietmisdreavus.net/code/2018/02/23/how-the-doctests-get-made/).

## Dotting i's and crossing t's

So that's rustdoc's code in a nutshell, but there are more things in the repo
that deal with it. Since we have the full `compiletest` suite at hand, there's
a set of tests in `src/test/rustdoc` that make sure the final HTML is what we
expect in various situations. These tests also use a supplementary script,
`src/etc/htmldocck.py`, that lets them look through the final HTML using
XPath notation to get a precise look at the output. The full description of all
the commands available to rustdoc tests (e.g. [`@has`] and [`@matches`]) is in
[`htmldocck.py`].

To use multiple crates in a rustdoc test, add `// aux-build:filename.rs`
to the top of the test file. `filename.rs` should be placed in an `auxiliary`
directory relative to the test file with the comment. If you need to build
docs for the auxiliary file, use `// build-aux-docs`.
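
A test file in that directory tends to look something like the sketch below:
ordinary Rust items with `@has` and `@matches` commands in comments above
them. The item names, the crate name `foo`, and the exact XPath expressions
are assumptions made for this example; check `htmldocck.py` for the
authoritative command syntax.

```rust
// Hypothetical rustdoc test file. Assumption: the crate is named `foo`,
// which a real test would set with `#![crate_name = "foo"]` on its first
// line (left out here so the snippet compiles when pasted anywhere).

// Check that the page exists, then look inside it. `-` reuses the last path.
// @has foo/struct.Widget.html
// @has - '//*[@class="docblock"]' 'A documented widget.'
// @matches - '//*[@class="docblock"]' 'documented \w+'
/// A documented widget.
pub struct Widget;

// @has foo/fn.build.html
// @has - '//*[@class="docblock"]' 'Builds a widget.'
/// Builds a widget.
pub fn build() -> Widget {
    Widget
}
```

The commands are plain comments as far as the compiler is concerned, so the
file is both a valid crate and a script for checking the generated HTML.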

In addition, there are separate tests for the search index and rustdoc's
ability to query it. The files in `src/test/rustdoc-js` each contain a
different search query and the expected results, broken out by search tab.
These files are processed by a script in `src/tools/rustdoc-js` and the Node.js
runtime. These tests don't have as thorough a writeup, but a broad example
that features results in all tabs can be found in `basic.js`. The basic idea is
that you match a given `QUERY` with a set of `EXPECTED` results, complete with
the full item path of each item.

[`htmldocck.py`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py
[`@has`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py#L39
[`@matches`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py#L44

## Testing locally

Some features of the generated HTML documentation might require local
storage to be used across pages, which doesn't work well without an HTTP
server. To test these features locally, you can run a local HTTP server, like
this:

```bash
$ ./x.py doc library/std
# The documentation has been generated into `build/[YOUR ARCH]/doc`.
$ python3 -m http.server -d build/[YOUR ARCH]/doc
```

Now you can browse your documentation just like you would if it were hosted
on the internet. For example, the URL for `std` will be `/std/`.

## See also

- The [`rustdoc` api docs]
- [An overview of `rustdoc`](./rustdoc.md)
- [The rustdoc user guide]

[`rustdoc` api docs]: https://doc.rust-lang.org/nightly/nightly-rustc/rustdoc/
[The rustdoc user guide]: https://doc.rust-lang.org/nightly/rustdoc/