# Rustdoc internals

<!-- toc -->

This page describes rustdoc's passes and modes. For an overview of rustdoc,
see the ["Rustdoc overview" chapter](./rustdoc.md).

## From crate to clean

In `core.rs` are two central items: the `DocContext` struct, and the `run_core`
function. The latter is where rustdoc calls out to rustc to compile a crate to
the point where rustdoc can take over. The former is a state container used
when crawling through a crate to gather its documentation.

The main process of crate crawling is done in `clean/mod.rs` through several
implementations of the `Clean` trait defined within. This is a conversion
trait, which defines one method:

```rust,ignore
pub trait Clean<T> {
    fn clean(&self, cx: &DocContext) -> T;
}
```
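To make the shape of such a conversion concrete, here is a minimal,
self-contained sketch. Only the `Clean` trait and the `DocContext` name come
from rustdoc; `HirFn`, `CleanedFn`, and the `crate_name` field are hypothetical
stand-ins for the much larger real types.

```rust
// Toy stand-in for rustdoc's state container (the real one wraps a `TyCtxt`).
pub struct DocContext {
    /// Hypothetical field: name of the crate being documented.
    pub crate_name: String,
}

pub trait Clean<T> {
    fn clean(&self, cx: &DocContext) -> T;
}

/// Hypothetical stand-in for a compiler-side (HIR-like) function item.
pub struct HirFn {
    pub name: String,
    pub doc_lines: Vec<String>,
}

/// Hypothetical stand-in for the "cleaned" type that rendering consumes.
#[derive(Debug, PartialEq)]
pub struct CleanedFn {
    pub path: String,
    pub docs: String,
}

impl Clean<CleanedFn> for HirFn {
    fn clean(&self, cx: &DocContext) -> CleanedFn {
        CleanedFn {
            path: format!("{}::{}", cx.crate_name, self.name),
            docs: self.doc_lines.join("\n"),
        }
    }
}

// Collections are converted element by element.
impl<T: Clean<U>, U> Clean<Vec<U>> for Vec<T> {
    fn clean(&self, cx: &DocContext) -> Vec<U> {
        self.iter().map(|item| item.clean(cx)).collect()
    }
}
```

With these definitions, calling `.clean(&cx)` on a `HirFn` yields a
`CleanedFn`, and calling it on a `Vec<HirFn>` yields a `Vec<CleanedFn>`.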

`clean/mod.rs` also defines the types for the "cleaned" AST used later on to
render documentation pages. Each usually accompanies an implementation of
`Clean` that takes some AST or HIR type from rustc and converts it into the
appropriate "cleaned" type. "Big" items like modules or associated items may
have some extra processing in their `Clean` implementations, but for the most
part these impls are straightforward conversions. The "entry point" to this
module is the `impl Clean<Crate> for visit_ast::RustdocVisitor`, which is
called by `run_core` above.

You see, I actually lied a little earlier: There's another AST transformation
that happens before the events in `clean/mod.rs`. In `visit_ast.rs` is the
type `RustdocVisitor`, which *actually* crawls a `rustc_hir::Crate` to get the first
intermediate representation, defined in `doctree.rs`. This pass is mainly to
get a few intermediate wrappers around the HIR types and to process visibility
and inlining. This is where `#[doc(inline)]`, `#[doc(no_inline)]`, and
`#[doc(hidden)]` are processed, as well as the logic for whether a `pub use`
should get the full page or a "Reexport" line in the module page.

The other major thing that happens in `clean/mod.rs` is the collection of doc
comments and `#[doc=""]` attributes into a separate field of the `Attributes`
struct, present on anything that gets hand-written documentation. This makes it
easier to collect this documentation later in the process.

The primary output of this process is a `clean::Crate` with a tree of `Item`s
which describe the publicly-documentable items in the target crate.

### Hot potato

Before moving on to the next major step, a few important "passes" occur over
the documentation. These do things like combine the separate "attributes" into
a single string and strip leading whitespace to make the document easier on the
markdown parser, or drop items that are not public or deliberately hidden with
`#[doc(hidden)]`. These are all implemented in the `passes/` directory, one
file per pass. By default, all of these passes are run on a crate, but the ones
regarding dropping private/hidden items can be bypassed by passing
`--document-private-items` to rustdoc. Note that unlike the previous set of AST
transformations, the passes are run on the _cleaned_ crate.

(Strictly speaking, you can fine-tune the passes run and even add your own, but
[we're trying to deprecate that][44136]. If you need finer-grained control over
these passes, please let us know!)

[44136]: https://github.com/rust-lang/rust/issues/44136

Here is the list of passes as of <!-- date: 2021-02 --> February 2021:

- `calculate-doc-coverage` calculates information used for the `--show-coverage`
  flag.

- `check-code-block-syntax` validates syntax inside Rust code blocks
  (`` ```rust ``).

- `check-invalid-html-tags` detects invalid HTML (like an unclosed `<span>`)
  in doc comments.

- `check-non-autolinks` detects links that could or should be written using
  angle brackets (the code behind the nightly-only <!-- date: 2021-02 -->
  `non_autolinks` lint).

- `collapse-docs` concatenates all document attributes into one document
  attribute. This is necessary because each line of a doc comment is given as a
  separate doc attribute, and this will combine them into a single string with
  a line break between attributes.

- `collect-intra-doc-links` resolves [intra-doc links](https://doc.rust-lang.org/rustdoc/linking-to-items-by-name.html).

- `collect-trait-impls` collects trait impls for each item in the crate. For
  example, if we define a struct that implements a trait, this pass will note
  that the struct implements that trait.

- `doc-test-lints` runs various lints on the doctests.

- `propagate-doc-cfg` propagates `#[doc(cfg(...))]` to child items.

- `strip-priv-imports` strips all private import statements (`use`, `extern
  crate`) from a crate. This is necessary because rustdoc will handle *public*
  imports by either inlining the item's documentation to the module or creating
  a "Reexports" section with the import in it. The pass ensures that all of
  these imports are actually relevant to documentation.

- `strip-hidden` and `strip-private` strip all `doc(hidden)` and private items
  from the output. `strip-private` implies `strip-priv-imports`. Basically, the
  goal is to remove items that are not relevant for public documentation.

- `unindent-comments` removes excess indentation on comments in order for the
  Markdown to be parsed correctly. This is necessary because the convention for
  writing documentation is to provide a space between the `///` or `//!` marker
  and the doc text, but Markdown is whitespace-sensitive. For example, a block
  of text with four-space indentation is parsed as a code block, so if we didn't
  unindent comments, these list items

  ```rust,ignore
  /// A list:
  ///
  ///    - Foo
  ///    - Bar
  ```

  would be parsed as if they were in a code block, which is likely not what the
  user intended.
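As a rough illustration of what `collapse-docs` and `unindent-comments`
accomplish together, the sketch below joins per-line doc fragments and then
removes the indentation shared by all non-blank lines. The function names are
hypothetical and the logic is deliberately simplified; the real passes operate
on rustdoc's `Attributes` and handle more cases.

```rust
/// Sketch of `collapse-docs`: join per-line doc attributes with line breaks.
pub fn collapse_docs(fragments: &[&str]) -> String {
    fragments.join("\n")
}

/// Sketch of `unindent-comments`: strip the smallest space indentation shared
/// by all non-blank lines. Blank lines become empty lines.
pub fn unindent(docs: &str) -> String {
    let min_indent = docs
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| line.len() - line.trim_start_matches(' ').len())
        .min()
        .unwrap_or(0);

    docs.lines()
        .map(|line| {
            if line.trim().is_empty() {
                ""
            } else {
                // Safe to slice: every non-blank line starts with at least
                // `min_indent` ASCII spaces.
                &line[min_indent..]
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Feeding in the text that follows each `///` marker in the list example above
removes the single leading space from every line, so the list items end up
indented by fewer than four spaces and are no longer read as a code block.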

There is also a `stripper` module in `passes/`, but it is a collection of
utility functions for the `strip-*` passes and is not a pass itself.
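The split between the `strip-*` passes and a shared helper can be pictured
with the following sketch. The `Item` struct and all function names here are
hypothetical simplifications; the real passes work on `clean::Item` and have
more rules about what must be kept.

```rust
/// Hypothetical, heavily reduced version of a cleaned item.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub name: String,
    pub is_public: bool,
    pub is_doc_hidden: bool,
    pub children: Vec<Item>,
}

/// Shared helper in the spirit of the `stripper` module: keep only the items
/// accepted by `keep`, then recurse into the children of the kept items.
pub fn strip_items(items: Vec<Item>, keep: &dyn Fn(&Item) -> bool) -> Vec<Item> {
    items
        .into_iter()
        .filter(|item| keep(item))
        .map(|mut item| {
            let children = std::mem::take(&mut item.children);
            item.children = strip_items(children, keep);
            item
        })
        .collect()
}

/// Sketch of `strip-hidden`: drop everything marked `#[doc(hidden)]`.
pub fn strip_hidden(items: Vec<Item>) -> Vec<Item> {
    strip_items(items, &|item: &Item| !item.is_doc_hidden)
}

/// Sketch of `strip-private`: drop everything that is not public.
pub fn strip_private(items: Vec<Item>) -> Vec<Item> {
    strip_items(items, &|item: &Item| item.is_public)
}
```

Each pass is then a thin wrapper that supplies its own predicate, which is one
plausible reason to keep the traversal logic in a separate utility module.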

## From clean to crate

This is where the "second phase" in rustdoc begins. This phase primarily lives
in the `html/` folder, and it all starts with `run()` in `html/render.rs`. This
code is responsible for setting up the `Context`, `SharedContext`, and `Cache`
which are used during rendering, copying out the static files which live in
every rendered set of documentation (things like the fonts, CSS, and JavaScript
that live in `html/static/`), creating the search index, and printing out the
source code rendering, before beginning the process of rendering all the
documentation for the crate.

Several functions implemented directly on `Context` take the `clean::Crate` and
set up some state between rendering items or recursing on a module's child
items. From here the "page rendering" begins, via an enormous `write!()` call
in `html/layout.rs`. The parts that actually generate HTML from the items and
documentation occur within a series of `std::fmt::Display` implementations and
functions that pass around a `&mut std::fmt::Formatter`. The top-level
implementation that writes out the page body is the `impl<'a> fmt::Display for
Item<'a>` in `html/render.rs`, which switches out to one of several `item_*`
functions based on the kind of `Item` being rendered.
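The dispatch pattern described here can be sketched in a few lines. This is a
toy model only: the `ItemKind` variants, the fields of `Item`, and the emitted
HTML are all hypothetical, and HTML escaping is omitted for brevity.

```rust
use std::fmt;

/// Hypothetical, heavily reduced set of item kinds.
pub enum ItemKind {
    Struct { fields: Vec<String> },
    Function { signature: String },
}

/// Wrapper that knows how to print one item as a page body.
pub struct Item<'a> {
    pub name: &'a str,
    pub kind: &'a ItemKind,
}

fn item_struct(f: &mut fmt::Formatter<'_>, name: &str, fields: &[String]) -> fmt::Result {
    write!(f, "<h1>Struct {}</h1><ul>", name)?;
    for field in fields {
        write!(f, "<li>{}</li>", field)?;
    }
    write!(f, "</ul>")
}

fn item_function(f: &mut fmt::Formatter<'_>, name: &str, signature: &str) -> fmt::Result {
    write!(f, "<h1>Function {}</h1><pre>{}</pre>", name, signature)
}

impl<'a> fmt::Display for Item<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Switch out to a per-kind function, passing the formatter along.
        match self.kind {
            ItemKind::Struct { fields } => item_struct(f, self.name, fields),
            ItemKind::Function { signature } => item_function(f, self.name, signature),
        }
    }
}
```

Because everything goes through `fmt::Display`, a caller can embed an item in
a larger page with an ordinary `write!(out, "{}", item)`.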

Depending on what kind of rendering code you're looking for, you'll probably
find it either in `html/render.rs` for major items like "what sections should I
print for a struct page" or `html/format.rs` for smaller component pieces like
"how should I print a where clause as part of some other item".

Whenever rustdoc comes across an item that should print hand-written
documentation alongside, it calls out to `html/markdown.rs` which interfaces
with the Markdown parser. This is exposed as a series of types that wrap a
string of Markdown, and implement `fmt::Display` to emit HTML text. It takes
special care to enable certain features like footnotes and tables and add
syntax highlighting to Rust code blocks (via `html/highlight.rs`) before
running the Markdown parser. There's also a function in here
(`find_testable_code`) that specifically scans for Rust code blocks so the
test-runner code can find all the doctests in the crate.
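To give a feel for what scanning for Rust code blocks involves, here is a
simplified sketch that walks Markdown line by line. It is not how
`find_testable_code` is implemented (that goes through the Markdown parser),
and both function names and the short list of recognized fence tokens are
assumptions made for illustration.

```rust
/// Collect the bodies of fenced code blocks that look like Rust.
pub fn find_rust_blocks(markdown: &str) -> Vec<String> {
    let mut blocks = Vec::new();
    // `Some((is_rust, body))` while inside a fenced block, `None` outside.
    let mut current: Option<(bool, String)> = None;

    for line in markdown.lines() {
        let trimmed = line.trim_start();
        match current.take() {
            None => {
                if let Some(info) = trimmed.strip_prefix("```") {
                    current = Some((looks_like_rust(info), String::new()));
                }
            }
            Some((is_rust, mut body)) => {
                if trimmed.starts_with("```") {
                    // Closing fence: keep the block only if it was Rust.
                    if is_rust {
                        blocks.push(body);
                    }
                } else {
                    body.push_str(line);
                    body.push('\n');
                    current = Some((is_rust, body));
                }
            }
        }
    }
    blocks
}

/// A fence counts as Rust here if it has no info string or if every
/// comma-separated token is one of a few doctest-related words.
fn looks_like_rust(info: &str) -> bool {
    info.split(',')
        .map(str::trim)
        .filter(|token| !token.is_empty())
        .all(|token| {
            matches!(
                token,
                "rust" | "ignore" | "no_run" | "should_panic" | "compile_fail"
            )
        })
}
```

A block fenced as `text` is skipped, while an unlabelled block or one marked
`rust,no_run` is collected.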

### From soup to nuts

(alternate title: ["An unbroken thread that stretches from those first `Cell`s
to us"][video])

[video]: https://www.youtube.com/watch?v=hOLAGYmUQV0

It's important to note that the AST cleaning can ask the compiler for
information (crucially, `DocContext` contains a `TyCtxt`), but page rendering
cannot. The `clean::Crate` created within `run_core` is passed outside the
compiler context before being handed to `html::render::run`. This means that a
lot of the "supplementary data" that isn't immediately available inside an
item's definition, like which trait is the `Deref` trait used by the language,
needs to be collected during cleaning, stored in the `DocContext`, and passed
along to the `SharedContext` during HTML rendering. This manifests as a bunch
of shared state, context variables, and `RefCell`s.
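A toy model of that hand-off might look like the following. The struct names
mirror the ones in the text, but the `deref_trait_did` fields, the `DefId`
stand-in, and both methods are hypothetical; the real contexts carry far more
state.

```rust
use std::cell::RefCell;

/// Stand-in for the compiler's definition IDs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DefId(pub u32);

/// Cleaning-time state. In rustdoc this is where compiler access lives.
pub struct DocContext {
    /// Hypothetical field, filled in while cleaning.
    pub deref_trait_did: RefCell<Option<DefId>>,
}

/// Rendering-time state: plain data only, no compiler access.
pub struct SharedContext {
    pub deref_trait_did: Option<DefId>,
}

impl DocContext {
    pub fn new() -> Self {
        DocContext { deref_trait_did: RefCell::new(None) }
    }

    /// Cleaning code only holds `&DocContext`, hence the `RefCell`.
    pub fn record_deref_trait(&self, did: DefId) {
        *self.deref_trait_did.borrow_mut() = Some(did);
    }

    /// Leave the "compiler context", keeping only what rendering needs.
    pub fn into_shared(self) -> SharedContext {
        SharedContext { deref_trait_did: self.deref_trait_did.into_inner() }
    }
}
```

The interior mutability is what lets many `clean` calls that share one
context reference still record facts for the renderer to use later.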

Also of note is that some items that come from "asking the compiler" don't go
directly into the `DocContext`. For example, when loading items from a foreign
crate, rustdoc will ask about trait implementations and generate new `Item`s
for the impls based on that information. This goes directly into the returned
`Crate` rather than roundabout through the `DocContext`. This way, these
implementations can be collected alongside the others, right before rendering
the HTML.

## Other tricks up its sleeve

All this describes the process for generating HTML documentation from a Rust
crate, but there are a couple of other major modes that rustdoc runs in. It can
also be run on a standalone Markdown file, or it can run doctests on Rust code
or standalone Markdown files. For the former, it shortcuts straight to
`html/markdown.rs`, optionally including a mode which inserts a Table of
Contents into the output HTML.

For the latter, rustdoc runs a similar partial-compilation to get relevant
documentation in `test.rs`, but instead of going through the full clean and
render process, it runs a much simpler crate walk to grab *just* the
hand-written documentation. Combined with the aforementioned
"`find_testable_code`" in `html/markdown.rs`, it builds up a collection of
tests to run before handing them off to the test runner. One notable
location in `test.rs` is the function `make_test`, which is where hand-written
doctests get transformed into something that can be executed.
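The kind of transformation `make_test` performs can be approximated as
follows. This is a sketch under simplifying assumptions: the function name
and signature are made up, and detection of an existing `fn main` or
`extern crate` is done by substring search here rather than by anything more
careful.

```rust
/// Turn a doctest body into a complete program (simplified sketch).
pub fn make_test_sketch(doctest: &str, crate_name: Option<&str>) -> String {
    // Doctests tend to leave bindings unused, so silence those lints.
    let mut program = String::from("#![allow(unused)]\n");

    // Make the documented crate available unless the test already does so.
    if let Some(name) = crate_name {
        if !doctest.contains("extern crate") {
            program.push_str(&format!("extern crate {};\n", name));
        }
    }

    if doctest.contains("fn main") {
        // The author wrote a full program; use it as-is.
        program.push_str(doctest);
        program.push('\n');
    } else {
        // Otherwise wrap the statements in a `main` function.
        program.push_str("fn main() {\n");
        for line in doctest.lines() {
            program.push_str("    ");
            program.push_str(line);
            program.push('\n');
        }
        program.push_str("}\n");
    }
    program
}
```

For instance, a one-line doctest such as `assert_eq!(2 + 2, 4);` comes out
wrapped in `fn main() { ... }` with the lint attribute and `extern crate`
line in front of it.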

Some extra reading about `make_test` can be found
[here](https://quietmisdreavus.net/code/2018/02/23/how-the-doctests-get-made/).

## Dotting i's and crossing t's

So that's rustdoc's code in a nutshell, but there are more things in the repo
that deal with it. Since we have the full `compiletest` suite at hand, there's
a set of tests in `src/test/rustdoc` that make sure the final HTML is what we
expect in various situations. These tests also use a supplementary script,
`src/etc/htmldocck.py`, that allows them to look through the final HTML using
XPath notation to get a precise look at the output. The full description of all
the commands available to rustdoc tests (e.g. [`@has`] and [`@matches`]) is in
[`htmldocck.py`].

To use multiple crates in a rustdoc test, add `// aux-build:filename.rs`
to the top of the test file. `filename.rs` should be placed in an `auxiliary`
directory relative to the test file with the comment. If you need to build
docs for the auxiliary file, use `// build-aux-docs`.
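As an illustration, a small test file might look like the sketch below. It
assumes the file is saved as `src/test/rustdoc/foo.rs`, so that the crate is
documented under the name `foo`; the items and the XPath expressions are
made-up examples, so check [`htmldocck.py`] for the exact command semantics
before relying on them.

```rust
// Assumption: this file is `src/test/rustdoc/foo.rs`, so output lands in `foo/`.

// Check that a page is generated for the struct and that its heading
// mentions the struct's name. `-` repeats the most recently used path.
// @has foo/struct.Foo.html
// @has - '//h1' 'Foo'
pub struct Foo;

impl Foo {
    // Check that the method shows up on the struct's page.
    // @has foo/struct.Foo.html '//*[@id="method.bar"]' 'bar'
    pub fn bar(&self) {}
}

// A leading `!` negates a command: no page should exist for a hidden item.
// @!has foo/struct.Hidden.html
#[doc(hidden)]
pub struct Hidden;
```

The commands live in ordinary comments, so the file remains a valid Rust
crate that rustdoc can document before the script inspects the output.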

In addition, there are separate tests for the search index and rustdoc's
ability to query it. The files in `src/test/rustdoc-js` each contain a
different search query and the expected results, broken out by search tab.
These files are processed by a script in `src/tools/rustdoc-js` and the Node.js
runtime. These tests don't have as thorough a writeup, but a broad example
that features results in all tabs can be found in `basic.js`. The basic idea is
that you match a given `QUERY` with a set of `EXPECTED` results, complete with
the full item path of each item.

[`htmldocck.py`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py
[`@has`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py#L39
[`@matches`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py#L44

## Testing locally

Some features of the generated HTML documentation might require local
storage to be used across pages, which doesn't work well without an HTTP
server. To test these features locally, you can run a local HTTP server, like
this:

```bash
$ ./x.py doc library/std
# The documentation has been generated into `build/[YOUR ARCH]/doc`.
$ python3 -m http.server -d build/[YOUR ARCH]/doc
```

Now you can browse your documentation just like you would if it were hosted
on the internet. For example, the URL for `std` will be `/std/`.

## See also

- The [`rustdoc` api docs]
- [An overview of `rustdoc`](./rustdoc.md)
- [The rustdoc user guide]

[`rustdoc` api docs]: https://doc.rust-lang.org/nightly/nightly-rustc/rustdoc/
[The rustdoc user guide]: https://doc.rust-lang.org/nightly/rustdoc/