Merge tag 'debian/1.52.1+dfsg1-1_exp2' into proxmox/buster

diff --git a/src/doc/rustc-dev-guide/src/macro-expansion.md b/src/doc/rustc-dev-guide/src/macro-expansion.md
index 279598270551d19bc2a6a5879422bb59cae61a34..7385cefb38449f2c897101709b044b397b7dcf3b 100644
--- a/src/doc/rustc-dev-guide/src/macro-expansion.md
+++ b/src/doc/rustc-dev-guide/src/macro-expansion.md
@@ -1,149 +1,183 @@
  # Macro expansion
  
-> `librustc_ast`, `librustc_expand`, and `librustc_builtin_macros` are all undergoing
+<!-- toc -->
+
+> `rustc_ast`, `rustc_expand`, and `rustc_builtin_macros` are all undergoing
  > refactoring, so some of the links in this chapter may be broken.
  
-Macro expansion happens during parsing. `rustc` has two parsers, in fact: the
-normal Rust parser, and the macro parser. During the parsing phase, the normal
-Rust parser will set aside the contents of macros and their invocations. Later,
-before name resolution, macros are expanded using these portions of the code.
-The macro parser, in turn, may call the normal Rust parser when it needs to
-bind a metavariable (e.g.  `$my_expr`) while parsing the contents of a macro
-invocation. The code for macro expansion is in
-[`src/librustc_expand/mbe/`][code_dir]. This chapter aims to explain how macro
-expansion works.
-
-### Example
-
-It's helpful to have an example to refer to. For the remainder of this chapter,
-whenever we refer to the "example _definition_", we mean the following:
+Rust has a very powerful macro system. In the previous chapter, we saw how the
+parser sets aside macros to be expanded (it temporarily uses [placeholders]).
+This chapter is about the process of expanding those macros iteratively until
+we have a complete AST for our crate with no unexpanded macros (or a compile
+error).
+
+[placeholders]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/placeholders/index.html
+
+First, we will discuss the algorithm that expands and integrates macro output
+into ASTs. Next, we will take a look at how hygiene data is collected. Finally,
+we will look at the specifics of expanding different types of macros.
+
+Many of the algorithms and data structures described below are in [`rustc_expand`],
+with basic data structures in [`rustc_expand::base`][base].
+
+Also of note, `cfg` and `cfg_attr` are treated differently from other macros, and are
+handled in [`rustc_expand::config`][cfg].
+
+[`rustc_expand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/index.html
+[base]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/index.html
+[cfg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/config/index.html
+
+## Expansion and AST Integration
+
+First of all, expansion happens at the crate level. Given the raw source code for
+a crate, the compiler will produce a massive AST with all macros expanded, all
+modules inlined, etc. The primary entry point for this process is the
+[`MacroExpander::fully_expand_fragment`][fef] method. With few exceptions, we
+use this method on the whole crate (see ["Eager Expansion"](#eager-expansion)
+below for more detailed discussion of edge case expansion issues).
+
+[`rustc_builtin_macros`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_builtin_macros/index.html
+[reb]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/build/index.html
+
+At a high level, [`fully_expand_fragment`][fef] works in iterations. We keep a
+queue of unresolved macro invocations (that is, macros we haven't found the
+definition of yet). We repeatedly try to pick a macro from the queue, resolve
+it, expand it, and integrate it back. If we can't make progress in an
+iteration, this represents a compile error.  Here is the [algorithm][original]:
+
+[fef]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.MacroExpander.html#method.fully_expand_fragment
+[original]: https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049
+
+0. Initialize a `queue` of unresolved macros.
+1. Repeat until `queue` is empty (or we make no progress, which is an error):
+    0. [Resolve](./name-resolution.md) imports in our partially built crate as
+       much as possible.
+    1. Collect as many macro [`Invocation`s][inv] as possible from our
+       partially built crate (fn-like, attributes, derives) and add them to the
+       queue.
+    2. Dequeue the first element, and attempt to resolve it.
+    3. If it's resolved:
+        0. Run the macro's expander function that consumes a [`TokenStream`] or
+           AST and produces a [`TokenStream`] or [`AstFragment`] (depending on
+           the macro kind). (A `TokenStream` is a collection of [`TokenTree`s][tt],
+           each of which is a token (punctuation, identifier, or literal) or a
+           delimited group (anything inside `()`/`[]`/`{}`)).
+            - At this point, we know everything about the macro itself and can
+              call `set_expn_data` to fill in its properties in the global data;
+              that is the hygiene data associated with `ExpnId`. (See [the
+              "Hygiene" section below][hybelow]).
+        1. Integrate that piece of AST into the big existing partially built
+           AST. This is essentially where the "token-like mass" becomes a
+           proper set-in-stone AST with side-tables. It happens as follows:
+            - If the macro produces tokens (e.g. a proc macro), we parse into
+              an AST, which may produce parse errors.
+            - During expansion, we create `SyntaxContext`s (hierarchy 2). (See
+              [the "Hygiene" section below][hybelow])
+            - These three passes happen one after another on every AST fragment
+              freshly expanded from a macro:
+                - [`NodeId`]s are assigned by [`InvocationCollector`]. This
+                  also collects new macro calls from this new AST piece and
+                  adds them to the queue.
+                - ["Def paths"][defpath] are created and [`DefId`]s are
+                  assigned to them by [`DefCollector`].
+                - Names are put into modules (from the resolver's point of
+                  view) by [`BuildReducedGraphVisitor`].
+        2. After expanding a single macro and integrating its output, continue
+           to the next iteration of [`fully_expand_fragment`][fef].
+    4. If it's not resolved:
+        0. Put the macro back in the queue.
+        1. Continue to the next iteration.
+
+[defpath]: https://rustc-dev-guide.rust-lang.org/hir.html?highlight=def,path#identifiers-in-the-hir
+[`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html
+[`InvocationCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.InvocationCollector.html
+[`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
+[`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html
+[`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html
+[hybelow]: #hygiene-and-hierarchies
+[tt]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/enum.TokenTree.html
+[`TokenStream`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
+[inv]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.Invocation.html
+[`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
+
+### Error Recovery
+
+If we make no progress in an iteration, then we have reached a compilation
+error (e.g. an undefined macro). We attempt to recover from failures
+(unresolved macros or imports) for the sake of diagnostics. This allows
+compilation to continue past the first error, so that we can report more errors
+at a time. Recovery can't cause compilation to succeed. We know that it will
+fail at this point. The recovery happens by expanding unresolved macros into
+[`ExprKind::Err`][err].
+
+[err]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/enum.ExprKind.html#variant.Err
+
+### Name Resolution
+
+Notice that name resolution is involved here: we need to resolve imports and
+macro names in the above algorithm. This is done in
+[`rustc_resolve::macros`][mresolve], which resolves macro paths, validates
+those resolutions, and reports various errors (e.g. "not found" or "found, but
+it's unstable" or "expected x, found y"). However, we don't try to resolve
+other names yet. This happens later, as we will see in the [next
+chapter](./name-resolution.md).
+
+[mresolve]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/macros/index.html
+
+### Eager Expansion
+
+_Eager expansion_ means that we expand the arguments of a macro invocation
+before the macro invocation itself. This is implemented only for a few special
+built-in macros that expect literals; expanding arguments first for some of
+these macros results in a smoother user experience.  As an example, consider the
+following:
  
  ```rust,ignore
-macro_rules! printer {
-    (print $mvar:ident) => {
-        println!("{}", $mvar);
-    };
-    (print twice $mvar:ident) => {
-        println!("{}", $mvar);
-        println!("{}", $mvar);
-    };
-}
-```
+macro bar($i: ident) { $i }
+macro foo($i: ident) { $i }
  
-`$mvar` is called a _metavariable_. Unlike normal variables, rather than
-binding to a value in a computation, a metavariable binds _at compile time_ to
-a tree of _tokens_.  A _token_ is a single "unit" of the grammar, such as an
-identifier (e.g. `foo`) or punctuation (e.g. `=>`). There are also other
-special tokens, such as `EOF`, which indicates that there are no more tokens.
-Token trees resulting from paired parentheses-like characters (`(`...`)`,
-`[`...`]`, and `{`...`}`) – they include the open and close and all the tokens
-in between (we do require that parentheses-like characters be balanced). Having
-macro expansion operate on token streams rather than the raw bytes of a source
-file abstracts away a lot of complexity. The macro expander (and much of the
-rest of the compiler) doesn't really care that much about the exact line and
-column of some syntactic construct in the code; it cares about what constructs
-are used in the code. Using tokens allows us to care about _what_ without
-worrying about _where_. For more information about tokens, see the
-[Parsing][parsing] chapter of this book.
-
-Whenever we refer to the "example _invocation_", we mean the following snippet:
-
-```rust,ignore
-printer!(print foo); // Assume `foo` is a variable defined somewhere else...
+foo!(bar!(baz));
  ```
  
-The process of expanding the macro invocation into the syntax tree
-`println!("{}", foo)` and then expanding that into a call to `Display::fmt` is
-called _macro expansion_, and it is the topic of this chapter.
-
-### The macro parser
-
-There are two parts to macro expansion: parsing the definition and parsing the
-invocations. Interestingly, both are done by the macro parser.
-
-Basically, the macro parser is like an NFA-based regex parser. It uses an
-algorithm similar in spirit to the [Earley parsing
-algorithm](https://en.wikipedia.org/wiki/Earley_parser). The macro parser is
-defined in [`src/librustc_expand/mbe/macro_parser.rs`][code_mp].
-
-The interface of the macro parser is as follows (this is slightly simplified):
-
-```rust,ignore
-fn parse_tt(
-    parser: &mut Cow<Parser>, 
-    ms: &[TokenTree],
-) -> NamedParseResult
-```
-
-We use these items in macro parser:
-
-- `sess` is a "parsing session", which keeps track of some metadata. Most
-  notably, this is used to keep track of errors that are generated so they can
-  be reported to the user.
-- `tts` is a stream of tokens. The macro parser's job is to consume the raw
-  stream of tokens and output a binding of metavariables to corresponding token
-  trees.
-- `ms` a _matcher_. This is a sequence of token trees that we want to match
-  `tts` against.
-
-In the analogy of a regex parser, `tts` is the input and we are matching it
-against the pattern `ms`. Using our examples, `tts` could be the stream of
-tokens containing the inside of the example invocation `print foo`, while `ms`
-might be the sequence of token (trees) `print $mvar:ident`.
-
-The output of the parser is a `NamedParseResult`, which indicates which of
-three cases has occurred:
-
-- Success: `tts` matches the given matcher `ms`, and we have produced a binding
-  from metavariables to the corresponding token trees.
-- Failure: `tts` does not match `ms`. This results in an error message such as
-  "No rule expected token _blah_".
-- Error: some fatal error has occurred _in the parser_. For example, this
-  happens if there are more than one pattern match, since that indicates
-  the macro is ambiguous.
-
-The full interface is defined [here][code_parse_int].
-
-The macro parser does pretty much exactly the same as a normal regex parser with
-one exception: in order to parse different types of metavariables, such as
-`ident`, `block`, `expr`, etc., the macro parser must sometimes call back to the
-normal Rust parser.
-
-As mentioned above, both definitions and invocations of macros are parsed using
-the macro parser. This is extremely non-intuitive and self-referential. The code
-to parse macro _definitions_ is in
-[`src/librustc_expand/mbe/macro_rules.rs`][code_mr]. It defines the pattern for
-matching for a macro definition as `$( $lhs:tt => $rhs:tt );+`. In other words,
-a `macro_rules` definition should have in its body at least one occurrence of a
-token tree followed by `=>` followed by another token tree. When the compiler
-comes to a `macro_rules` definition, it uses this pattern to match the two token
-trees per rule in the definition of the macro _using the macro parser itself_.
-In our example definition, the metavariable `$lhs` would match the patterns of
-both arms: `(print $mvar:ident)` and `(print twice $mvar:ident)`.  And `$rhs`
-would match the bodies of both arms: `{ println!("{}", $mvar); }` and `{
-println!("{}", $mvar); println!("{}", $mvar); }`. The parser would keep this
-knowledge around for when it needs to expand a macro invocation.
-
-When the compiler comes to a macro invocation, it parses that invocation using
-the same NFA-based macro parser that is described above. However, the matcher
-used is the first token tree (`$lhs`) extracted from the arms of the macro
-_definition_. Using our example, we would try to match the token stream `print
-foo` from the invocation against the matchers `print $mvar:ident` and `print
-twice $mvar:ident` that we previously extracted from the definition.  The
-algorithm is exactly the same, but when the macro parser comes to a place in the
-current matcher where it needs to match a _non-terminal_ (e.g. `$mvar:ident`),
-it calls back to the normal Rust parser to get the contents of that
-non-terminal. In this case, the Rust parser would look for an `ident` token,
-which it finds (`foo`) and returns to the macro parser. Then, the macro parser
-proceeds in parsing as normal. Also, note that exactly one of the matchers from
-the various arms should match the invocation; if there is more than one match,
-the parse is ambiguous, while if there are no matches at all, there is a syntax
-error.
-
-For more information about the macro parser's implementation, see the comments
-in [`src/librustc_expand/mbe/macro_parser.rs`][code_mp].
-
-### Hygiene
+A lazy expansion would expand `foo!` first. An eager expansion would expand
+`bar!` first.
+
+Eager expansion is not a generally available feature of Rust.  Implementing
+eager expansion more generally would be challenging, but we implement it for a
+few special built-in macros for the sake of user experience.  The built-in
+macros are implemented in [`rustc_builtin_macros`], along with some other early
+code generation facilities like injection of standard library imports or
+generation of the test harness. There are some additional helpers for building
+their AST fragments in [`rustc_expand::build`][reb]. Eager expansion generally
+performs a subset of the things that lazy (normal) expansion does. It is done by
+invoking [`fully_expand_fragment`][fef] on only part of a crate (as opposed to
+the whole crate, like we normally do).
+
+### Other Data Structures
+
+Here are some other notable data structures involved in expansion and integration:
+- [`ResolverExpand`] - a trait used to break crate dependencies. This allows the
+  resolver services to be used in [`rustc_ast`], despite [`rustc_resolve`] and
+  pretty much everything else depending on [`rustc_ast`].
+- [`ExtCtxt`]/[`ExpansionData`] - various intermediate data kept and used by the expansion
+  infrastructure in the process of its work.
+- [`Annotatable`] - a piece of AST that can be an attribute target, almost the same
+  thing as `AstFragment` except for types and patterns that can be produced by
+  macros but cannot be annotated with attributes.
+- [`MacResult`] - a "polymorphic" AST fragment, something that can turn into a
+  different `AstFragment` depending on its [`AstFragmentKind`] - item,
+  or expression, or pattern etc.
+
+[`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
+[`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
+[`ResolverExpand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.ResolverExpand.html
+[`ExtCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExtCtxt.html
+[`ExpansionData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExpansionData.html
+[`Annotatable`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.Annotatable.html
+[`MacResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MacResult.html
+[`AstFragmentKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragmentKind.html
+
+## Hygiene and Hierarchies
  
  If you have ever used C/C++ preprocessor macros, you know that there are some
  annoying and hard-to-debug gotchas! For example, consider the following C code:
@@ -190,728 +224,394 @@ a macro author may want to introduce a new name to the context where the macro
  was called. Alternately, the macro author may be defining a variable for use
  only within the macro (i.e. it should not be visible outside the macro).
  
-In rustc, this "context" is tracked via `Span`s.
-
-TODO: what is call-site hygiene? what is def-site hygiene?
-
-TODO
-
-### Procedural Macros
-
-TODO
-
-### Custom Derive
-
-TODO
-
-TODO: maybe something about macros 2.0?
-
-
-[code_dir]: https://github.com/rust-lang/rust/tree/master/src/librustc_expand/mbe
+[code_dir]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_expand/src/mbe
  [code_mp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser
  [code_mr]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_rules
  [code_parse_int]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser/fn.parse_tt.html
  [parsing]: ./the-parser.html
  
+The context is attached to AST nodes. All AST nodes generated by macros have
+context attached. Additionally, there may be other nodes that have context
+attached, such as some desugared syntax (non-macro-expanded nodes are
+considered to just have the "root" context, as described below).
+Throughout the compiler, we use [`rustc_span::Span`s][span] to refer to code locations.
+This struct also has hygiene information attached to it, as we will see later.
  
-# Discussion about hygiene
-
-The rest of this chapter is a dump of a discussion between `mark-i-m` and
-`petrochenkov` about Macro Expansion and Hygiene. I am pasting it here so that
-it never gets lost until we can make it into a proper chapter.
-
-```txt
-mark-i-m: @Vadim Petrochenkov Hi :wave:
-I was wondering if you would have a chance sometime in the next month or so to
-just have a zulip discussion where you tell us (WG-learning) everything you
-know about macros/expansion/hygiene. We were thinking this could be less formal
-(and less work for you) than compiler lecture series lecture... thoughts?
-
-mark-i-m: The goal is to fill out that long-standing gap in the rustc-dev-guide
-
-Vadim Petrochenkov: Ok, I'm at UTC+03:00 and generally available in the
-evenings (or weekends).
-
-mark-i-m: @Vadim Petrochenkov Either of those works for me (your evenings are
-about lunch time for me :) ) Is there a particular date that would work best
-for you?
-
-mark-i-m: @WG-learning Does anyone else have a preferred date?
-
-    Vadim Petrochenkov:
-
-    Is there a particular date that would work best for you?
-
-Nah, not much difference.  (If something changes for a specific day, I'll
-notify.)
-
-Santiago Pastorino: week days are better, but I'd say let's wait for @Vadim
-Petrochenkov to say when they are ready for it and we can set a date
-
-Santiago Pastorino: also, we should record this so ... I guess it doesn't
-matter that much when :)
-
-    mark-i-m:
-
-    also, we should record this so ... I guess it doesn't matter that much when
-    :)
-
-@Santiago Pastorino My thinking was to just use zulip, so we would have the log
-
-mark-i-m: @Vadim Petrochenkov @WG-learning How about 2 weeks from now: July 24
-at 5pm UTC time (if I did the math right, that should be evening for Vadim)
-
-Amanjeev Sethi: i can try and do this but I am starting a new job that week so
-cannot promise.
-
-    Santiago Pastorino:
-
-    Vadim Petrochenkov @WG-learning How about 2 weeks from now: July 24 at 5pm
-    UTC time (if I did the math right, that should be evening for Vadim)
-
-works perfect for me
-
-Santiago Pastorino: @mark-i-m I have access to the compiler calendar so I can
-add something there
-
-Santiago Pastorino: let me know if you want to add an event to the calendar, I
-can do that
-
-Santiago Pastorino: how long it would be?
-
-    mark-i-m:
-
-    let me know if you want to add an event to the calendar, I can do that
-
-mark-i-m: That could be good :+1:
-
-    mark-i-m:
-
-    how long it would be?
-
-Let's start with 30 minutes, and if we need to schedule another we cna
-
-    Vadim Petrochenkov:
-
-    5pm UTC
-
-1-2 hours later would be better, 5pm UTC is not evening enough.
-
-Vadim Petrochenkov: How exactly do you plan the meeting to go (aka how much do
-I need to prepare)?
-
-    Santiago Pastorino:
-
-        5pm UTC
-
-    1-2 hours later would be better, 5pm UTC is not evening enough.
-
-Scheduled for 7pm UTC then
-
-    Santiago Pastorino:
-
-    How exactly do you plan the meeting to go (aka how much do I need to
-    prepare)?
-
-/cc @mark-i-m
-
-mark-i-m: @Vadim Petrochenkov
-
-    How exactly do you plan the meeting to go (aka how much do I need to
-    prepare)?
-
-My hope was that this could be less formal than for a compiler lecture series,
-but it would be nice if you could have in your mind a tour of the design and
-the code
-
-That is, imagine that a new person was joining the compiler team and needed to
-get up to speed about macros/expansion/hygiene. What would you tell such a
-person?
-
-mark-i-m: @Vadim Petrochenkov Are we still on for tomorrow at 7pm UTC?
-
-Vadim Petrochenkov: Yes.
-
-Santiago Pastorino: @Vadim Petrochenkov @mark-i-m I've added an event on rust
-compiler team calendar
-
-mark-i-m: @WG-learning @Vadim Petrochenkov Hello!
-
-mark-i-m: We will be starting in ~7 minutes
-
-mark-i-m: :wave:
-
-Vadim Petrochenkov: I'm here.
-
-mark-i-m: Cool :)
-
-Santiago Pastorino: hello @Vadim Petrochenkov
-
-mark-i-m: Shall we start?
-
-mark-i-m: First off, @Vadim Petrochenkov Thanks for doing this!
-
-Vadim Petrochenkov: Here's some preliminary data I prepared.
-
-Vadim Petrochenkov: Below I'll assume #62771 and #62086 has landed.
-
-Vadim Petrochenkov: Where to find the code: librustc_span/hygiene.rs -
-structures related to hygiene and expansion that are kept in global data (can
-be accessed from any Ident without any context) librustc_span/lib.rs - some
-secondary methods like macro backtrace using primary methods from hygiene.rs
-librustc_builtin_macros - implementations of built-in macros (including macro attributes
-and derives) and some other early code generation facilities like injection of
-standard library imports or generation of test harness.  librustc_ast/config.rs -
-implementation of cfg/cfg_attr (they treated specially from other macros),
-should probably be moved into librustc_ast/ext.  librustc_ast/tokenstream.rs +
-librustc_ast/parse/token.rs - structures for compiler-side tokens, token trees,
-and token streams.  librustc_ast/ext - various expansion-related stuff
-librustc_ast/ext/base.rs - basic structures used by expansion
-librustc_ast/ext/expand.rs - some expansion structures and the bulk of expansion
-infrastructure code - collecting macro invocations, calling into resolve for
-them, calling their expanding functions, and integrating the results back into
-AST librustc_ast/ext/placeholder.rs - the part of expand.rs responsible for
-"integrating the results back into AST" basicallly, "placeholder" is a
-temporary AST node replaced with macro expansion result nodes
-librustc_ast/ext/builer.rs - helper functions for building AST for built-in macros
-in librustc_builtin_macros (and user-defined syntactic plugins previously), can probably
-be moved into librustc_builtin_macros these days librustc_ast/ext/proc_macro.rs +
-librustc_ast/ext/proc_macro_server.rs - interfaces between the compiler and the
-stable proc_macro library, converting tokens and token streams between the two
-representations and sending them through C ABI librustc_ast/ext/tt -
-implementation of macro_rules, turns macro_rules DSL into something with
-signature Fn(TokenStream) -> TokenStream that can eat and produce tokens,
-@mark-i-m knows more about this librustc_resolve/macros.rs - resolving macro
-paths, validating those resolutions, reporting various "not found"/"found, but
-it's unstable"/"expected x, found y" errors librustc_middle/hir/map/def_collector.rs +
-librustc_resolve/build_reduced_graph.rs - integrate an AST fragment freshly
-expanded from a macro into various parent/child structures like module
-hierarchy or "definition paths"
-
-Primary structures: HygieneData - global piece of data containing hygiene and
-expansion info that can be accessed from any Ident without any context ExpnId -
-ID of a macro call or desugaring (and also expansion of that call/desugaring,
-depending on context) ExpnInfo/InternalExpnData - a subset of properties from
-both macro definition and macro call available through global data
-SyntaxContext - ID of a chain of nested macro definitions (identified by
-ExpnIds) SyntaxContextData - data associated with the given SyntaxContext,
-mostly a cache for results of filtering that chain in different ways Span - a
-code location + SyntaxContext Ident - interned string (Symbol) + Span, i.e. a
-string with attached hygiene data TokenStream - a collection of TokenTrees
-TokenTree - a token (punctuation, identifier, or literal) or a delimited group
-(anything inside ()/[]/{}) SyntaxExtension - a lowered macro representation,
-contains its expander function transforming a tokenstream or AST into
-tokenstream or AST + some additional data like stability, or a list of unstable
-features allowed inside the macro.  SyntaxExtensionKind - expander functions
-may have several different signatures (take one token stream, or two, or a
-piece of AST, etc), this is an enum that lists them
-ProcMacro/TTMacroExpander/AttrProcMacro/MultiItemModifier - traits representing
-the expander signatures (TODO: change and rename the signatures into something
-more consistent) trait Resolver - a trait used to break crate dependencies (so
-resolver services can be used in librustc_ast, despite librustc_resolve and pretty
-much everything else depending on librustc_ast) ExtCtxt/ExpansionData - various
-intermediate data kept and used by expansion infra in the process of its work
-AstFragment - a piece of AST that can be produced by a macro (may include
-multiple homogeneous AST nodes, like e.g. a list of items) Annotatable - a
-piece of AST that can be an attribute target, almost same thing as AstFragment
-except for types and patterns that can be produced by macros but cannot be
-annotated with attributes (TODO: Merge into AstFragment) trait MacResult - a
-"polymorphic" AST fragment, something that can turn into a different
-AstFragment depending on its context (aka AstFragmentKind - item, or
-expression, or pattern etc.) Invocation/InvocationKind - a structure describing
-a macro call, these structures are collected by the expansion infra
-(InvocationCollector), queued, resolved, expanded when resolved, etc.
-
-Primary algorithms / actions: TODO
-
-mark-i-m: Very useful :+1:
-
-mark-i-m: @Vadim Petrochenkov Zulip doesn't have an indication of typing, so
-I'm not sure if you are waiting for me or not
-
-Vadim Petrochenkov: The TODO part should be about how a crate transitions from
-the state "macros exist as written in source" to "all macros are expanded", but
-I didn't write it yet.
-
-Vadim Petrochenkov: (That should probably better happen off-line.)
-
-Vadim Petrochenkov: Now, if you have any questions?
-
-mark-i-m: Thanks :)
-
-mark-i-m: /me is still reading :P
-
-mark-i-m: Ok
-
-mark-i-m: So I guess my first question is about hygiene, since that remains the
-most mysterious to me... My understanding is that the parser outputs AST nodes,
-where each node has a Span
-
-mark-i-m: In the absence of macros and desugaring, what does the syntax context
-of an AST node look like?
-
-mark-i-m: @Vadim Petrochenkov
-
-Vadim Petrochenkov: Not each node, but many of them.  When a node is not
-macro-expanded, its context is 0.
-
-Vadim Petrochenkov: aka SyntaxContext::empty()
-
-Vadim Petrochenkov: it's a chain that consists of one expansion - expansion 0
-aka ExpnId::root.
-
-mark-i-m: Do all expansions start at root?
-
-Vadim Petrochenkov: Also, SyntaxContext:empty() is its own father.
-
-mark-i-m: Is this actually stored somewhere or is it a logical value?
-
-Vadim Petrochenkov: All expansion hyerarchies (there are several of them) start
-at ExpnId::root.
-
-Vadim Petrochenkov: Vectors in HygieneData has entries for both ctxt == 0 and
-expn_id == 0.
-
-Vadim Petrochenkov: I don't think anyone looks into them much though.
-
-mark-i-m: Ok
-
-Vadim Petrochenkov: Speaking of multiple hierarchies...
+[span]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/struct.Span.html
  
-mark-i-m: Go ahead :)
+Because macro invocations and definitions can be nested, the syntax context of
+a node must be a hierarchy. For example, if we expand a macro and there is
+another macro invocation or definition in the generated output, then the syntax
+context should reflect the nesting.
  
-Vadim Petrochenkov: One is parent (expn_id1) -> parent(expn_id2) -> ...
+However, it turns out that there are actually a few types of context we may
+want to track for different purposes. Thus, there are not just one but _three_
+expansion hierarchies that together comprise the hygiene information for a
+crate.
  
-Vadim Petrochenkov: This is the order in which macros are expanded.
+All of these hierarchies need some sort of "macro ID" to identify individual
+elements in the chain of expansions. This ID is [`ExpnId`].  All macros receive
+an integer ID, assigned continuously starting from 0 as we discover new macro
+calls.  All hierarchies start at [`ExpnId::root()`][rootid], which is its own
+parent.
  
-Vadim Petrochenkov: Well.
+[`rustc_span::hygiene`][hy] contains all of the hygiene-related algorithms
+(with the exception of some hacks in [`Resolver::resolve_crate_root`][hacks])
+and structures related to hygiene and expansion that are kept in global data.
  
-Vadim Petrochenkov: When we are expanding one macro another macro is revealed
-in its output.
+The actual hierarchies are stored in [`HygieneData`][hd]. This is a global
+piece of data containing hygiene and expansion info that can be accessed from
+any [`Ident`] without any context.
  
-Vadim Petrochenkov: That's the parent-child relation in this hierarchy.
  
-Vadim Petrochenkov: InternalExpnData::parent is the child->parent link.
+[`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
+[rootid]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html#method.root
+[hd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.HygieneData.html
+[hy]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
+[hacks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate_root
+[`Ident`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Ident.html
  
-mark-i-m: So in the above chain expn_id1 is the child?
+### The Expansion Order Hierarchy
  
-Vadim Petrochenkov: Yes.
+The first hierarchy tracks the order of expansions, i.e., when a macro
+invocation is in the output of another macro.
  
-Vadim Petrochenkov: The second one is parent (SyntaxContext1) ->
-parent(SyntaxContext2) -> ...
+Here, the children in the hierarchy will be the "innermost" tokens.  The
+[`ExpnData`] struct itself contains a subset of properties from both macro
+definition and macro call available through global data.
+[`ExpnData::parent`][edp] tracks the child -> parent link in this hierarchy.
  
-Vadim Petrochenkov: This is about nested macro definitions.  When we are
-expanding one macro another macro definition is revealed in its output.
+[`ExpnData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html
+[edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent
  
-Vadim Petrochenkov: SyntaxContextData::parent is the child->parent link here.
-
-Vadim Petrochenkov: So, SyntaxContext is the whole chain in this hierarchy, and
-outer_expns are individual elements in the chain.
-
-mark-i-m: So for example, suppose I have the following:
+For example,
  
+```rust,ignore
  macro_rules! foo { () => { println!(); } }
  
  fn main() { foo!(); }
+```
  
-Then AST nodes that are finally generated would have parent(expn_id_println) ->
-parent(expn_id_foo), right?
-
-Vadim Petrochenkov: Pretty common construction (at least it was, before
-refactorings) is SyntaxContext::empty().apply_mark(expn_id), which means...
-
-    Vadim Petrochenkov:
-
-    Then AST nodes that are finally generated would have
-    parent(expn_id_println) -> parent(expn_id_foo), right?
-
-Yes.
-
-    mark-i-m:
-
-    and outer_expns are individual elements in the chain.
-
-Sorry, what is outer_expns?
-
-Vadim Petrochenkov: SyntaxContextData::outer_expn
-
-mark-i-m: Thanks :) Please continue
-
-Vadim Petrochenkov: ...which means a token produced by a built-in macro (which
-is defined in the root effectively).
-
-mark-i-m: Where does the expn_id come from?
-
-Vadim Petrochenkov: Or a stable proc macro, which are always considered to be
-defined in the root because they are always cross-crate, and we don't have the
-cross-crate hygiene implemented, ha-ha.
-
-    Vadim Petrochenkov:
-
-    Where does the expn_id come from?
-
-Vadim Petrochenkov: ID of the built-in macro call like line!().
-
-Vadim Petrochenkov: Assigned continuously from 0 to N as soon as we discover
-new macro calls.
-
-mark-i-m: Sorry, I didn't quite understand. Do you mean that only built-in
-macros receive continuous IDs?
-
-Vadim Petrochenkov: So, the second hierarchy has a catch - the context
-transplantation hack -
-https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732.
-
-    Vadim Petrochenkov:
-
-    Do you mean that only built-in macros receive continuous IDs?
-
-Vadim Petrochenkov: No, all macro calls receive ID.
-
-Vadim Petrochenkov: Built-ins have the typical pattern
-SyntaxContext::empty().apply_mark(expn_id) for syntax contexts produced by
-them.
-
-mark-i-m: I see, but this pattern is only used for built-ins, right?
-
-Vadim Petrochenkov: And also all stable proc macros, see the comments above.
-
-mark-i-m: Got it
-
-Vadim Petrochenkov: The third hierarchy is call-site hierarchy.
-
-Vadim Petrochenkov: If foo!(bar!(ident)) expands into ident
-
-Vadim Petrochenkov: then hierarchy 1 is root -> foo -> bar -> ident
-
-Vadim Petrochenkov: but hierarchy 3 is root -> ident
-
-Vadim Petrochenkov: ExpnInfo::call_site is the child-parent link in this case.
-
-mark-i-m: When we expand, do we expand foo first or bar? Why is there a
-hierarchy 1 here? Is that foo expands first and it expands to something that
-contains bar!(ident)?
-
-Vadim Petrochenkov: Ah, yes, let's assume both foo and bar are identity macros.
-
-Vadim Petrochenkov: Then foo!(bar!(ident)) -> expand -> bar!(ident) -> expand
--> ident
-
-Vadim Petrochenkov: If bar were expanded first, that would be eager expansion -
-https://github.com/rust-lang/rfcs/pull/2320.
-
-mark-i-m: And after we expand only foo! presumably whatever intermediate state
-has heirarchy 1 of root->foo->(bar_ident), right?
-
-Vadim Petrochenkov: (We have it hacked into some built-in macros, but not
-generally.)
-
-    Vadim Petrochenkov:
-
-    And after we expand only foo! presumably whatever intermediate state has
-    heirarchy 1 of root->foo->(bar_ident), right?
-
-Vadim Petrochenkov: Yes.
-
-mark-i-m: Got it :)
-
-mark-i-m: It looks like we have ~5 minutes left. This has been very helpful
-already, but I also have more questions. Shall we try to schedule another
-meeting in the future?
-
-Vadim Petrochenkov: Sure, why not.
-
-Vadim Petrochenkov: A thread for offline questions-answers would be good too.
-
-    mark-i-m:
-
-    A thread for offline questions-answers would be good too.
-
-I don't mind using this thread, since it already has a lot of info in it. We
-also plan to summarize the info from this thread into the rustc-dev-guide.
-
-    Sure, why not.
-
-Unfortunately, I'm unavailable for a few weeks. Would August 21-ish work for
-you (and @WG-learning )?
-
-mark-i-m: @Vadim Petrochenkov Thanks very much for your time and knowledge!
-
-mark-i-m: One last question: are there more hierarchies?
-
-Vadim Petrochenkov: Not that I know of.  Three + the context transplantation
-hack is already more complex than I'd like.
-
-mark-i-m: Yes, one wonders what it would be like if one also had to think about
-eager expansion...
-
-Santiago Pastorino: sorry but I couldn't follow that much today, will read it
-when I have some time later
-
-Santiago Pastorino: btw https://github.com/rust-lang/rustc-dev-guide/issues/398
-
-mark-i-m: @Vadim Petrochenkov Would 7pm UTC on August 21 work for a followup?
-
-Vadim Petrochenkov: Tentatively yes.
-
-mark-i-m: @Vadim Petrochenkov @WG-learning Does this still work for everyone?
-
-Vadim Petrochenkov: August 21 is still ok.
-
-mark-i-m: @WG-learning @Vadim Petrochenkov We will start in ~30min
-
-Vadim Petrochenkov: Oh.  Thanks for the reminder, I forgot about this entirely.
-
-mark-i-m: Hello!
-
-Vadim Petrochenkov: (I'll be here in a couple of minutes.)
-
-Vadim Petrochenkov: Ok, I'm here.
-
-mark-i-m: Hi :)
-
-Vadim Petrochenkov: Hi.
-
-mark-i-m: so last time, we talked about the 3 context heirarchies
-
-Vadim Petrochenkov: Right.
-
-mark-i-m: Was there anything you wanted to add to that? If not, I think it
-would be good to get a big-picture... Given some piece of rust code, how do we
-get to the point where things are expanded and hygiene context is computed?
-
-mark-i-m: (I'm assuming that hygiene info is computed as we expand stuff, since
-I don't think you can discover it beforehand)
-
-Vadim Petrochenkov: Ok, let's move from hygiene to expansion.
-
-Vadim Petrochenkov: Especially given that I don't remember the specific hygiene
-algorithms like adjust in detail.
-
-    Vadim Petrochenkov:
-
-    Given some piece of rust code, how do we get to the point where things are
-    expanded
-
-So, first of all, the "some piece of rust code" is the whole crate.
-
-mark-i-m: Just to confirm, the algorithms are well-encapsulated, right? Like a
-function or a struct as opposed to a bunch of conventions distributed across
-the codebase?
-
-Vadim Petrochenkov: We run fully_expand_fragment in it.
-
-    Vadim Petrochenkov:
-
-    Just to confirm, the algorithms are well-encapsulated, right?
-
-Yes, the algorithmic parts are entirely inside hygiene.rs.
-
-Vadim Petrochenkov: Ok, some are in fn resolve_crate_root, but those are hacks.
-
-Vadim Petrochenkov: (Continuing about expansion.) If fully_expand_fragment is
-run not on a whole crate, it means that we are performing eager expansion.
-
-Vadim Petrochenkov: Eager expansion is done for arguments of some built-in
-macros that expect literals.
-
-Vadim Petrochenkov: It generally performs a subset of actions performed by the
-non-eager expansion.
-
-Vadim Petrochenkov: So, I'll talk about non-eager expansion for now.
-
-mark-i-m: Eager expansion is not exposed as a language feature, right? i.e. it
-is not possible for me to write an eager macro?
+In this code, the AST nodes that are finally generated would have hierarchy:
  
-Vadim Petrochenkov:
-https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049 (vvv The
-link is explained below vvv )
+```
+root
+    expn_id_foo
+        expn_id_println
+```
  
-    Vadim Petrochenkov:
+### The Macro Definition Hierarchy
  
-    Eager expansion is not exposed as a language feature, right? i.e. it is not
-    possible for me to write an eager macro?
+The second hierarchy tracks the order of macro definitions, i.e., when
+expanding one macro reveals another macro definition in its output.  This
+one is a bit tricky and more complex than the other two hierarchies.
  
-Yes, it's entirely an ability of some built-in macros.
+[`SyntaxContext`][sc] represents a whole chain in this hierarchy via an ID.
+[`SyntaxContextData`][scd] contains data associated with the given
+`SyntaxContext`; mostly it is a cache for results of filtering that chain in
+different ways.  [`SyntaxContextData::parent`][scdp] is the child -> parent
+link here, and each [`SyntaxContextData::outer_expn`][scdoe] is an individual
+element in the chain.  The "chaining operator" is
+[`SyntaxContext::apply_mark`][am] in compiler code.
  
-Vadim Petrochenkov: Not exposed for general use.
+A [`Span`][span], mentioned above, is actually just a compact representation of
+a code location and `SyntaxContext`. Likewise, an [`Ident`] is just an interned
+[`Symbol`] + `Span` (i.e. an interned string + hygiene data).
  
-Vadim Petrochenkov: fully_expand_fragment works in iterations.
+[`Symbol`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Symbol.html
+[scd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html
+[scdp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.parent
+[sc]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
+[scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
+[am]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.apply_mark
  
-Vadim Petrochenkov: Iterations looks roughly like this:
-- Resolve imports in our partially built crate as much as possible.
-- Collect as many macro invocations as possible from our partially built crate
-  (fn-like, attributes, derives) from the crate and add them to the queue.
+For built-in macros, we use the context:
+`SyntaxContext::empty().apply_mark(expn_id)`, and such macros are considered to
+be defined at the hierarchy root. We do the same for proc-macros because we
+haven't implemented cross-crate hygiene yet.
  
-    Vadim Petrochenkov: Take a macro from the queue, and attempt to resolve it.
+If the token had context `X` before being produced by a macro then after being
+produced by the macro it has context `X -> macro_id`. Here are some examples:
  
-    Vadim Petrochenkov: If it's resolved - run its expander function that
-    consumes tokens or AST and produces tokens or AST (depending on the macro
-    kind).
+Example 0:
  
-    Vadim Petrochenkov: (If it's not resolved, then put it back into the
-    queue.)
+```rust,ignore
+macro m() { ident }
  
-Vadim Petrochenkov: ^^^ That's where we fill in the hygiene data associated
-with ExpnIds.
+m!();
+```
  
-mark-i-m: When we put it back in the queue?
+Here `ident` originally has context [`SyntaxContext::root()`][scr]. `ident` has
+context `ROOT -> id(m)` after it's produced by `m`.
  
-mark-i-m: or do you mean the collect step in general?
+[scr]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.root
  
-Vadim Petrochenkov: Once we resolved the macro call to the macro definition we
-know everything about the macro and can call set_expn_data to fill in its
-properties in the global data.
  
-Vadim Petrochenkov: I mean, immediately after successful resolution.
+Example 1:
  
-Vadim Petrochenkov: That's the first part of hygiene data, the second one is
-associated with SyntaxContext rather than with ExpnId, it's filled in later
-during expansion.
+```rust,ignore
+macro m() { macro n() { ident } }
  
-Vadim Petrochenkov: So, after we run the macro's expander function and got a
-piece of AST (or got tokens and parsed them into a piece of AST) we need to
-integrate that piece of AST into the big existing partially built AST.
+m!();
+n!();
+```
+In this example the `ident` has context `ROOT` originally, then `ROOT -> id(m)`
+after the first expansion, then `ROOT -> id(m) -> id(n)`.
  
-Vadim Petrochenkov: This integration is a really important step where the next
-things happen:
-- NodeIds are assigned.
+Example 2:
  
-    Vadim Petrochenkov: "def paths"s and their IDs (DefIds) are created
+Note that these chains are not entirely determined by their last element, in
+other words `ExpnId` is not isomorphic to `SyntaxContext`.
  
-    Vadim Petrochenkov: Names are put into modules from the resolver point of
-    view.
+```rust,ignore
+macro m($i: ident) { macro n() { ($i, bar) } }
  
-Vadim Petrochenkov: So, we are basically turning some vague token-like mass
-into proper set in stone hierarhical AST and side tables.
+m!(foo);
+```
  
-Vadim Petrochenkov: Where exactly this happens - NodeIds are assigned by
-InvocationCollector (which also collects new macro calls from this new AST
-piece and adds them to the queue), DefIds are created by DefCollector, and
-modules are filled by BuildReducedGraphVisitor.
+After all expansions, `foo` has context `ROOT -> id(n)` and `bar` has context
+`ROOT -> id(m) -> id(n)`.
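+
+To make the chain notation concrete, here is a minimal toy model of Example 2.
+It is only a sketch: `Table`, `ContextData`, `ROOT`, and the integer IDs are
+hypothetical stand-ins invented for illustration, not the real `rustc_span`
+types, and the real `apply_mark` does considerably more work.
+
+```rust
+// Toy model only: hypothetical stand-ins, not the real rustc_span types.
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
+struct ExpnId(u32);
+
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
+struct SyntaxContext(usize);
+
+const ROOT: SyntaxContext = SyntaxContext(0);
+
+struct ContextData {
+    parent: SyntaxContext,
+    outer_expn: ExpnId,
+}
+
+struct Table {
+    data: Vec<ContextData>,
+}
+
+impl Table {
+    fn new() -> Self {
+        // Entry 0 is the root context, which is its own parent.
+        Table { data: vec![ContextData { parent: ROOT, outer_expn: ExpnId(0) }] }
+    }
+
+    /// The "chaining operator": returns the context `ctxt -> expn`,
+    /// reusing an existing entry if this exact chain was built before.
+    fn apply_mark(&mut self, ctxt: SyntaxContext, expn: ExpnId) -> SyntaxContext {
+        let existing = self
+            .data
+            .iter()
+            .enumerate()
+            .skip(1) // never hand out the root entry
+            .find(|(_, d)| d.parent == ctxt && d.outer_expn == expn);
+        if let Some((i, _)) = existing {
+            return SyntaxContext(i);
+        }
+        self.data.push(ContextData { parent: ctxt, outer_expn: expn });
+        SyntaxContext(self.data.len() - 1)
+    }
+
+    /// Walks child -> parent links and returns the chain in root-first order.
+    fn chain(&self, mut ctxt: SyntaxContext) -> Vec<ExpnId> {
+        let mut out = Vec::new();
+        while ctxt != ROOT {
+            let d = &self.data[ctxt.0];
+            out.push(d.outer_expn);
+            ctxt = d.parent;
+        }
+        out.reverse();
+        out
+    }
+}
+
+fn main() {
+    let mut table = Table::new();
+    let (m, n) = (ExpnId(1), ExpnId(2));
+
+    // Example 2: `foo` is ROOT -> id(n), `bar` is ROOT -> id(m) -> id(n).
+    let foo = table.apply_mark(ROOT, n);
+    let root_m = table.apply_mark(ROOT, m);
+    let bar = table.apply_mark(root_m, n);
+
+    assert_eq!(table.chain(foo), vec![n]);
+    assert_eq!(table.chain(bar), vec![m, n]);
+    // Same last element, different contexts.
+    assert_ne!(foo, bar);
+}
+```
+
+In this model both contexts end in `id(n)`, yet they are distinct entries,
+which is the sense in which a chain is not determined by its last element.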
  
-Vadim Petrochenkov: These three passes run one after another on every AST
-fragment freshly expanded from a macro.
+Finally, one last thing to mention is that currently, this hierarchy is subject
+to the ["context transplantation hack"][hack]. Basically, the more modern (and
+experimental) `macro` macros have stronger hygiene than the older MBE system,
+but this can result in weird interactions between the two. The hack is intended
+to make things "just work" for now.
  
-Vadim Petrochenkov: After expanding a single macro and integrating its output
-we again try to resolve all imports in the crate, and then return to the big
-queue processing loop and pick up the next macro.
+[hack]: https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732
  
-Vadim Petrochenkov: Repeat until there's no more macros.  Vadim Petrochenkov:
+### The Call-site Hierarchy
  
-mark-i-m: The integration step is where we would get parser errors too right?
+The third and final hierarchy tracks the location of macro invocations.
  
-mark-i-m: Also, when do we know definitively that resolution has failed for
-particular ident?
+In this hierarchy [`ExpnData::call_site`][callsite] is the child -> parent link.
  
-    Vadim Petrochenkov:
+[callsite]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.call_site
  
-    The integration step is where we would get parser errors too right?
+Here is an example:
  
-Yes, if the macro produced tokens (rather than AST directly) and we had to
-parse them.
+```rust,ignore
+macro bar($i: ident) { $i }
+macro foo($i: ident) { $i }
  
-    Vadim Petrochenkov:
+foo!(bar!(baz));
+```
  
-    when do we know definitively that resolution has failed for particular
-    ident?
+For the `baz` AST node in the final output, the first hierarchy is `ROOT ->
+id(foo) -> id(bar) -> baz`, while the third hierarchy is `ROOT -> baz`.
  
-So, ident is looked up in a number of scopes during resolution.  From closest
-like the current block or module, to far away like preludes or built-in types.
+### Macro Backtraces
  
-Vadim Petrochenkov: If lookup is certainly failed in all of the scopes, then
-it's certainly failed.
+Macro backtraces are implemented in [`rustc_span`] using the hygiene machinery
+in [`rustc_span::hygiene`][hy].
  
-mark-i-m: This is after all expansions and integrations are done, right?
+[`rustc_span`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/index.html
  
-Vadim Petrochenkov: "Certainly" is determined differently for different scopes,
-e.g. for a module scope it means no unexpanded macros and no unresolved glob
-imports in that module.
+## Producing Macro Output
  
-    Vadim Petrochenkov:
+Above, we saw how the output of a macro is integrated into the AST for a crate,
+and we also saw how the hygiene data for a crate is generated. But how do we
+actually produce the output of a macro? It depends on the type of macro.
  
-    This is after all expansions and integrations are done, right?
+There are two types of macros in Rust:
+`macro_rules!` macros (a.k.a. "Macros By Example" (MBE)) and procedural macros
+(or "proc macros"; including custom derives). During the parsing phase, the normal
+Rust parser will set aside the contents of macros and their invocations. Later,
+macros are expanded using these portions of the code.
+
+Some important data structures/interfaces here:
+- [`SyntaxExtension`] - a lowered macro representation, contains its expander
+  function, which transforms a `TokenStream` or AST into another `TokenStream`
+  or AST + some additional data like stability, or a list of unstable features
+  allowed inside the macro.
+- [`SyntaxExtensionKind`] - expander functions may have several different
+  signatures (take one token stream, or two, or a piece of AST, etc). This is
+  an enum that lists them.
+- [`ProcMacro`]/[`TTMacroExpander`]/[`AttrProcMacro`]/[`MultiItemModifier`] -
+  traits representing the expander function signatures.
+
+[`SyntaxExtension`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.SyntaxExtension.html
+[`SyntaxExtensionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.SyntaxExtensionKind.html
+[`ProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.ProcMacro.html
+[`TTMacroExpander`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.TTMacroExpander.html
+[`AttrProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.AttrProcMacro.html
+[`MultiItemModifier`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MultiItemModifier.html
+
+## Macros By Example
+
+MBEs have their own parser distinct from the normal Rust parser. When macros
+are expanded, we may invoke the MBE parser to parse and expand a macro.  The
+MBE parser, in turn, may call the normal Rust parser when it needs to bind a
+metavariable (e.g.  `$my_expr`) while parsing the contents of a macro
+invocation. The code for macro expansion is in
+[`compiler/rustc_expand/src/mbe/`][code_dir].
  
-For macro and import names this happens during expansions and integrations.
+### Example
  
-mark-i-m: Makes sense
+It's helpful to have an example to refer to. For the remainder of this chapter,
+whenever we refer to the "example _definition_", we mean the following:
  
-Vadim Petrochenkov: For all other names we certainly know whether a name is
-resolved successfully or not on the first attempt, because no new names can
-appear.
+```rust,ignore
+macro_rules! printer {
+    (print $mvar:ident) => {
+        println!("{}", $mvar);
+    };
+    (print twice $mvar:ident) => {
+        println!("{}", $mvar);
+        println!("{}", $mvar);
+    };
+}
+```
  
-Vadim Petrochenkov: (They are resolved in a later pass, see
-librustc_resolve/late.rs.)
+`$mvar` is called a _metavariable_. Unlike normal variables, rather than
+binding to a value in a computation, a metavariable binds _at compile time_ to
+a tree of _tokens_.  A _token_ is a single "unit" of the grammar, such as an
+identifier (e.g. `foo`) or punctuation (e.g. `=>`). There are also other
+special tokens, such as `EOF`, which indicates that there are no more tokens.
+Token trees result from paired parentheses-like characters (`(`...`)`,
+`[`...`]`, and `{`...`}`); they include the open and close and all the tokens
+in between (we do require that parentheses-like characters be balanced). Having
+macro expansion operate on token streams rather than the raw bytes of a source
+file abstracts away a lot of complexity. The macro expander (and much of the
+rest of the compiler) doesn't really care that much about the exact line and
+column of some syntactic construct in the code; it cares about what constructs
+are used in the code. Using tokens allows us to care about _what_ without
+worrying about _where_. For more information about tokens, see the
+[Parsing][parsing] chapter of this book.
  
-mark-i-m: And if at the end of the iteration, there are still things in the
-queue that can't be resolve, this represents an error, right?
+Whenever we refer to the "example _invocation_", we mean the following snippet:
  
-mark-i-m: i.e. an undefined macro?
+```rust,ignore
+printer!(print foo); // Assume `foo` is a variable defined somewhere else...
+```
  
-Vadim Petrochenkov: Yes, if we make no progress during an iteration, then we
-are stuck and that state represent an error.
+The process of expanding the macro invocation into the syntax tree
+`println!("{}", foo)` and then expanding that into a call to `Display::fmt` is
+called _macro expansion_, and it is the topic of this chapter.
  
-Vadim Petrochenkov: We attempt to recover though, using dummies expanding into
-nothing or ExprKind::Err or something like that for unresolved macros.
+### The MBE parser
  
-mark-i-m: This is for the purposes of diagnostics, though, right?
+There are two parts to MBE expansion: parsing the definition and parsing the
+invocations. Interestingly, both are done by the macro parser.
  
-Vadim Petrochenkov: But if we are going through recovery, then compilation must
-result in an error anyway.
+Basically, the MBE parser is like an NFA-based regex parser. It uses an
+algorithm similar in spirit to the [Earley parsing
+algorithm](https://en.wikipedia.org/wiki/Earley_parser). The macro parser is
+defined in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
  
-Vadim Petrochenkov: Yes, that's for diagnostics, without recovery we would
-stuck at the first unresolved macro or import.  Vadim Petrochenkov:
+The interface of the macro parser is as follows (this is slightly simplified):
  
-So, about the SyntaxContext hygiene...
+```rust,ignore
+fn parse_tt(
+    parser: &mut Cow<Parser>,
+    ms: &[TokenTree],
+) -> NamedParseResult
+```
  
-Vadim Petrochenkov: New syntax contexts are created during macro expansion.
+We use these items in the macro parser:
  
-Vadim Petrochenkov: If the token had context X before being produced by a
-macro, e.g. here ident has context SyntaxContext::root(): Vadim Petrochenkov:
+- `parser` is a reference to the state of a normal Rust parser, including the
+  token stream and parsing session. The token stream is what we are about to
+  ask the MBE parser to parse. We will consume the raw stream of tokens and
+  output a binding of metavariables to corresponding token trees. The parsing
+  session can be used to report parser errors.
+- `ms` is a _matcher_. This is a sequence of token trees that we want to match
+  the token stream against.
  
-macro m() { ident }
+In the analogy of a regex parser, the token stream is the input and we are matching it
+against the pattern `ms`. Using our examples, the token stream could be the stream of
+tokens containing the inside of the example invocation `print foo`, while `ms`
+might be the sequence of token (trees) `print $mvar:ident`.
  
-Vadim Petrochenkov: , then after being produced by the macro it has context X
--> macro_id.
+The output of the parser is a `NamedParseResult`, which indicates which of
+three cases has occurred:
  
-Vadim Petrochenkov: I.e. our ident has context ROOT -> id(m) after it's
-produced by m.
+- Success: the token stream matches the given matcher `ms`, and we have produced a binding
+  from metavariables to the corresponding token trees.
+- Failure: the token stream does not match `ms`. This results in an error message such as
+  "No rule expected token _blah_".
+- Error: some fatal error has occurred _in the parser_. For example, this
+  happens if there is more than one pattern match, since that indicates
+  the macro is ambiguous.
  
-Vadim Petrochenkov: The "chaining operator" -> is apply_mark in compiler code.
-Vadim Petrochenkov:
+The full interface is defined [here][code_parse_int].
  
-macro m() { macro n() { ident } }
+The macro parser does pretty much exactly the same as a normal regex parser with
+one exception: in order to parse different types of metavariables, such as
+`ident`, `block`, `expr`, etc., the macro parser must sometimes call back to the
+normal Rust parser.
  
-Vadim Petrochenkov: In this example the ident has context ROOT originally, then
-ROOT -> id(m), then ROOT -> id(m) -> id(n).
+As mentioned above, both definitions and invocations of macros are parsed using
+the macro parser. This is extremely non-intuitive and self-referential. The code
+to parse macro _definitions_ is in
+[`compiler/rustc_expand/src/mbe/macro_rules.rs`][code_mr]. It defines the pattern for
+matching for a macro definition as `$( $lhs:tt => $rhs:tt );+`. In other words,
+a `macro_rules` definition should have in its body at least one occurrence of a
+token tree followed by `=>` followed by another token tree. When the compiler
+comes to a `macro_rules` definition, it uses this pattern to match the two token
+trees per rule in the definition of the macro _using the macro parser itself_.
+In our example definition, the metavariable `$lhs` would match the patterns of
+both arms: `(print $mvar:ident)` and `(print twice $mvar:ident)`.  And `$rhs`
+would match the bodies of both arms: `{ println!("{}", $mvar); }` and `{
+println!("{}", $mvar); println!("{}", $mvar); }`. The parser would keep this
+knowledge around for when it needs to expand a macro invocation.
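+
+The shape of that pattern can be tried out in ordinary user code. The sketch
+below is only an illustration: `count_rules!` is a hypothetical macro invented
+here, not anything in the compiler. It matches its input against the same
+`$( $lhs:tt => $rhs:tt );+` shape and counts how many rules were found.
+
+```rust
+// Hypothetical helper macro: counts rules using the same shape of pattern
+// that is used to match a `macro_rules` body.
+macro_rules! count_rules {
+    ($( $lhs:tt => $rhs:tt );+) => {
+        // One array element per matched `$lhs`; `$rhs` is matched but unused.
+        [$( stringify!($lhs) ),+].len()
+    };
+}
+
+fn main() {
+    // The two arms of the example definition, passed as plain token trees.
+    let n = count_rules!(
+        (print $mvar:ident) => { println!("{}", $mvar); };
+        (print twice $mvar:ident) => { println!("{}", $mvar); println!("{}", $mvar); }
+    );
+    assert_eq!(n, 2);
+}
+```
+
+Each parenthesized pattern and each braced body is a single token tree, which
+is why `tt` is enough to capture a whole arm without inspecting its contents.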
  
-Vadim Petrochenkov: Note that these chains are not entirely determined by their
-last element, in other words ExpnId is not isomorphic to SyntaxCtxt.
+When the compiler comes to a macro invocation, it parses that invocation using
+the same NFA-based macro parser that is described above. However, the matcher
+used is the first token tree (`$lhs`) extracted from the arms of the macro
+_definition_. Using our example, we would try to match the token stream `print
+foo` from the invocation against the matchers `print $mvar:ident` and `print
+twice $mvar:ident` that we previously extracted from the definition.  The
+algorithm is exactly the same, but when the macro parser comes to a place in the
+current matcher where it needs to match a _non-terminal_ (e.g. `$mvar:ident`),
+it calls back to the normal Rust parser to get the contents of that
+non-terminal. In this case, the Rust parser would look for an `ident` token,
+which it finds (`foo`) and returns to the macro parser. Then, the macro parser
+proceeds in parsing as normal. Also, note that exactly one of the matchers from
+the various arms should match the invocation; if there is more than one match,
+the parse is ambiguous, while if there are no matches at all, there is a syntax
+error.
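+
+The matching step can be sketched with a deliberately simplified model. This
+is an assumption-laden toy, not the real algorithm: `Matcher`, `is_ident`, and
+`match_arm` are hypothetical names, tokens are plain strings, and the matcher
+is a flat sequence with no repetitions, so no NFA is needed. `is_ident` stands
+in for the call back into the normal Rust parser.
+
+```rust
+use std::collections::HashMap;
+
+// Toy matcher element: a literal token or a `$name:ident` metavariable.
+enum Matcher {
+    Lit(&'static str),
+    Ident(&'static str),
+}
+
+// Stand-in for calling back into the Rust parser for an `ident` fragment.
+fn is_ident(tok: &str) -> bool {
+    let mut chars = tok.chars();
+    match chars.next() {
+        Some(c) if c.is_alphabetic() || c == '_' => {
+            chars.all(|c| c.is_alphanumeric() || c == '_')
+        }
+        _ => false,
+    }
+}
+
+// Returns the metavariable bindings on success, or `None` if `tokens`
+// does not match the matcher `ms`.
+fn match_arm<'a>(
+    ms: &[Matcher],
+    tokens: &[&'a str],
+) -> Option<HashMap<&'static str, &'a str>> {
+    if ms.len() != tokens.len() {
+        return None;
+    }
+    let mut bindings = HashMap::new();
+    for (m, tok) in ms.iter().zip(tokens) {
+        match m {
+            Matcher::Lit(l) if *l == *tok => {}
+            Matcher::Ident(name) if is_ident(tok) => {
+                bindings.insert(*name, *tok);
+            }
+            _ => return None,
+        }
+    }
+    Some(bindings)
+}
+
+fn main() {
+    use Matcher::{Ident, Lit};
+    // Matchers extracted from the two arms of the example definition.
+    let arms = vec![
+        vec![Lit("print"), Ident("mvar")],
+        vec![Lit("print"), Lit("twice"), Ident("mvar")],
+    ];
+    // Token stream of the example invocation.
+    let tokens = ["print", "foo"];
+
+    let matches: Vec<(usize, HashMap<&str, &str>)> = arms
+        .iter()
+        .enumerate()
+        .filter_map(|(i, ms)| match_arm(ms, &tokens).map(|b| (i, b)))
+        .collect();
+
+    // Only the first arm matches, binding `mvar` to `foo`.
+    assert_eq!(matches.len(), 1);
+    assert_eq!(matches[0].0, 0);
+    assert_eq!(matches[0].1["mvar"], "foo");
+}
+```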
  
-Vadim Petrochenkov: Couterexample: Vadim Petrochenkov:
+For more information about the macro parser's implementation, see the comments
+in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].
  
-macro m($i: ident) { macro n() { ($i, bar) } }
+### `macro`s and Macros 2.0
  
-m!(foo);
+There is an old and mostly undocumented effort to improve the MBE system, give
+it more hygiene-related features, better scoping and visibility rules, etc. There
+hasn't been a lot of work on this recently, unfortunately. Internally, `macro`
+macros use the same machinery as today's MBEs; they just have additional
+syntactic sugar and are allowed to be in namespaces.
  
-Vadim Petrochenkov: foo has context ROOT -> id(n) and bar has context ROOT ->
-id(m) -> id(n) after all the expansions.
+## Procedural Macros
  
-mark-i-m: Cool :)
+Procedural macros are also expanded during parsing, as mentioned above.
+However, they use a rather different mechanism. Rather than having a parser in
+the compiler, procedural macros are implemented as custom, third-party crates.
+The compiler will compile the proc macro crate and call the specially annotated
+functions in it (i.e. the proc macros themselves), passing them a stream of tokens.
  
-mark-i-m: It looks like we are out of time
+The proc macro can then transform the token stream and output a new token
+stream, which is synthesized into the AST.
  
-mark-i-m: Is there anything you wanted to add?
+It's worth noting that the token stream type used by proc macros is _stable_,
+so `rustc` does not use it internally (since our internal data structures are
+unstable). The compiler's token stream is
+[`rustc_ast::tokenstream::TokenStream`][rustcts], as previously. This is
+converted into the stable [`proc_macro::TokenStream`][stablets] and back in
+[`rustc_expand::proc_macro`][pm] and [`rustc_expand::proc_macro_server`][pms].
+Because the Rust ABI is unstable, we use the C ABI for this conversion.
  
-mark-i-m: We can schedule another meeting if you would like
+[tsmod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/index.html
+[rustcts]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
+[stablets]: https://doc.rust-lang.org/proc_macro/struct.TokenStream.html
+[pm]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/proc_macro/index.html
+[pms]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/proc_macro_server/index.html
  
-Vadim Petrochenkov: Yep, 23.06 already.  No, I think this is an ok point to
-stop.
+TODO: more here.
  
-mark-i-m: :+1:
+### Custom Derive
  
-mark-i-m: Thanks @Vadim Petrochenkov ! This was very helpful
+Custom derives are a special type of proc macro.
  
-Vadim Petrochenkov: Yeah, we can schedule another one.  So far it's been like 1
-hour of meetings per month? Certainly not a big burden.
-```
+TODO: more?