]>
Commit | Line | Data |
---|---|---|
60c5eb7d XL |
1 | # How Salsa works |
2 | ||
6a06907d XL |
3 | <!-- toc --> |
4 | ||
60c5eb7d XL |
5 | This chapter is based on the explanation given by Niko Matsakis in this |
6 | [video](https://www.youtube.com/watch?v=_muY4HjSqVw) about | |
6a06907d XL |
7 | [Salsa](https://github.com/salsa-rs/salsa). To find out more you may |
8 | want to watch [Salsa In More | |
9 | Depth](https://www.youtube.com/watch?v=i_IhACacPRY), also by Niko | |
10 | Matsakis. | |
60c5eb7d | 11 | |
487cf647 | 12 | > As of <!-- date-check --> November 2022, although Salsa is inspired by |
6a06907d XL |
13 | > (among other things) rustc's query system, it is not used directly in rustc. |
14 | > It _is_ used in chalk and extensively in `rust-analyzer`, but there are no | |
15 | > medium or long-term concrete plans to integrate it into the compiler. | |
60c5eb7d XL |
16 | |
17 | ## What is Salsa? | |
18 | ||
dfeec247 XL |
19 | Salsa is a library for incremental recomputation. This means it allows reusing |
20 | computations that were already done in the past to increase the efficiency | |
60c5eb7d XL |
21 | of future computations. |
22 | ||
23 | The objectives of Salsa are: | |
24 | * Provide that functionality in an automatic way, so reusing old computations | |
25 | is done automatically by the library | |
26 | * Doing so in a "sound", or "correct", way, therefore leading to the same | |
27 | results as if it had been done from scratch | |
28 | ||
29 | Salsa's actual model is much richer, allowing many kinds of inputs and many | |
30 | different outputs. | |
31 | For example, integrating Salsa with an IDE could mean that the inputs could be | |
32 | the manifest (`Cargo.toml`), entire source files (`foo.rs`), snippets and so | |
33 | on; the outputs of such an integration could range from a binary executable, to | |
34 | lints, types (for example, if a user selects a certain variable and wishes to | |
35 | see its type), completions, etc. | |
36 | ||
37 | ## How does it work? | |
38 | ||
6a06907d XL |
39 | The first thing that Salsa has to do is identify the "base inputs" that |
40 | are not something computed but given as input. | |
60c5eb7d XL |
41 | |
42 | Then Salsa has to also identify intermediate, "derived" values, which are | |
43 | something that the library produces, but, for each derived value there's a | |
44 | "pure" function that computes the derived value. | |
45 | ||
46 | For example, there might be a function `ast(x: Path) -> AST`. The produced | |
c295e0f8 | 47 | `AST` isn't a final value, it's an intermediate value that the library would |
60c5eb7d XL |
48 | use for the computation. |
49 | ||
50 | This means that when you try to compute with the library, Salsa is going to | |
51 | compute various derived values, and eventually read the input and produce the | |
52 | result for the asked computation. | |
53 | ||
54 | In the course of computing, Salsa tracks which inputs were accessed and which | |
55 | values are derived. This information is used to determine what's going to | |
56 | happen when the inputs change: are the derived values still valid? | |
57 | ||
58 | This doesn't necessarily mean that each computation downstream from the input | |
59 | is going to be checked, which could be costly. Salsa only needs to check each | |
60 | downstream computation until it finds one that isn't changed. At that point, it | |
61 | won't check other derived computations since they wouldn't need to change. | |
62 | ||
064997fb | 63 | It's helpful to think about this as a graph with nodes. Each derived value |
60c5eb7d XL |
64 | has a dependency on other values, which could themselves be either base or |
65 | derived. Base values don't have a dependency. | |
66 | ||
67 | ```ignore | |
68 | I <- A <- C ... | |
69 | | | |
70 | J <- B <--+ | |
71 | ``` | |
72 | ||
73 | When an input `I` changes, the derived value `A` could change. The derived | |
064997fb | 74 | value `B`, which does not depend on `I`, `A`, or any value derived from `A` or |
60c5eb7d XL |
75 | `I`, is not subject to change. Therefore, Salsa can reuse the computation done |
76 | for `B` in the past, without having to compute it again. | |
77 | ||
78 | The computation could also terminate early. Keeping the same graph as before, | |
064997fb | 79 | say that input `I` has changed in some way (and input `J` hasn't), but when |
60c5eb7d XL |
80 | computing `A` again, it's found that `A` hasn't changed from the previous |
81 | computation. This leads to an "early termination", because there's no need to | |
82 | check if `C` needs to change, since both `C` direct inputs, `A` and `B`, | |
83 | haven't changed. | |
84 | ||
85 | ## Key Salsa concepts | |
86 | ||
87 | ### Query | |
88 | ||
89 | A query is some value that Salsa can access in the course of computation. Each | |
90 | query can have a number of keys (from 0 to many), and all queries have a | |
91 | result, akin to functions. 0-key queries are called "input" queries. | |
92 | ||
93 | ### Database | |
94 | ||
95 | The database is basically the context for the entire computation, it's meant to | |
96 | store Salsa's internal state, all intermediate values for each query, and | |
97 | anything else that the computation might need. The database must know all the | |
98 | queries that the library is going to do before it can be built, but they don't | |
99 | need to be specified in the same place. | |
100 | ||
101 | After the database is formed, it can be accessed with queries that are very | |
102 | similar to functions. Since each query's result is stored in the database, | |
103 | when a query is invoked N times, it will return N **cloned** results, without | |
104 | having to recompute the query (unless the input has changed in such a way that | |
105 | it warrants recomputation). | |
106 | ||
107 | For each input query (0-key), a "set" method is generated, allowing the user to | |
108 | change the output of such query, and trigger previous memoized values to be | |
109 | potentially invalidated. | |
110 | ||
111 | ### Query Groups | |
112 | ||
113 | A query group is a set of queries which have been defined together as a unit. | |
6a06907d XL |
114 | The database is formed by combining query groups. Query groups are akin to |
115 | "Salsa modules". | |
60c5eb7d XL |
116 | |
117 | A set of queries in a query group are just a set of methods in a trait. | |
118 | ||
119 | To create a query group a trait annotated with a specific attribute | |
120 | (`#[salsa::query_group(...)]`) has to be created. | |
121 | ||
122 | An argument must also be provided to said attribute as it will be used by Salsa | |
123 | to create a struct to be used later when the database is created. | |
124 | ||
125 | Example input query group: | |
126 | ||
127 | ```rust,ignore | |
128 | /// This attribute will process this tree, produce this tree as output, and produce | |
c295e0f8 | 129 | /// a bunch of intermediate stuff that Salsa also uses. One of these things is a |
60c5eb7d XL |
130 | /// "StorageStruct", whose name we have specified in the attribute. |
131 | /// | |
132 | /// This query group is a bunch of **input** queries, that do not rely on any | |
133 | /// derived input. | |
134 | #[salsa::query_group(InputsStorage)] | |
135 | pub trait Inputs { | |
136 | /// This attribute (`#[salsa::input]`) indicates that this query is a base | |
137 | /// input, therefore `set_manifest` is going to be auto-generated | |
138 | #[salsa::input] | |
139 | fn manifest(&self) -> Manifest; | |
140 | ||
141 | #[salsa::input] | |
142 | fn source_text(&self, name: String) -> String; | |
143 | } | |
144 | ``` | |
145 | ||
146 | To create a **derived** query group, one must specify which other query groups | |
147 | this one depends on by specifying them as supertraits, as seen in the following | |
148 | example: | |
149 | ||
150 | ```rust,ignore | |
151 | /// This query group is going to contain queries that depend on derived values a | |
152 | /// query group can access another query group's queries by specifying the | |
153 | /// dependency as a super trait query groups can be stacked as much as needed using | |
154 | /// that pattern. | |
155 | #[salsa::query_group(ParserStorage)] | |
156 | pub trait Parser: Inputs { | |
157 | /// This query `ast` is not an input query, it's a derived query this means | |
158 | /// that a definition is necessary. | |
159 | fn ast(&self, name: String) -> String; | |
160 | } | |
161 | ``` | |
162 | ||
163 | When creating a derived query the implementation of said query must be defined | |
164 | outside the trait. The definition must take a database parameter as an `impl | |
165 | Trait` (or `dyn Trait`), where `Trait` is the query group that the definition | |
166 | belongs to, in addition to the other keys. | |
167 | ||
168 | ```rust,ignore | |
169 | ///This is going to be the definition of the `ast` query in the `Parser` trait. | |
170 | ///So, when the query `ast` is invoked, and it needs to be recomputed, Salsa is going to call this function | |
171 | ///and it's is going to give it the database as `impl Parser`. | |
172 | ///The function doesn't need to be aware of all the queries of all the query groups | |
173 | fn ast(db: &impl Parser, name: String) -> String { | |
174 | //! Note, `impl Parser` is used here but `dyn Parser` works just as well | |
175 | /* code */ | |
176 | ///By passing an `impl Parser`, this is allowed | |
177 | let source_text = db.input_file(name); | |
178 | /* do the actual parsing */ | |
179 | return ast; | |
180 | } | |
181 | ``` | |
182 | ||
183 | Eventually, after all the query groups have been defined, the database can be | |
184 | created by declaring a struct. | |
185 | ||
186 | To specify which query groups are going to be part of the database an attribute | |
187 | (`#[salsa::database(...)]`) must be added. The argument of said attribute is a | |
188 | list of identifiers, specifying the query groups **storages**. | |
189 | ||
190 | ```rust,ignore | |
191 | ///This attribute specifies which query groups are going to be in the database | |
192 | #[salsa::database(InputsStorage, ParserStorage)] | |
193 | #[derive(Default)] //optional! | |
194 | struct MyDatabase { | |
195 | ///You also need this one field | |
196 | runtime : salsa::Runtime<MyDatabase>, | |
197 | } | |
198 | ///And this trait has to be implemented | |
064997fb | 199 | impl salsa::Database for MyDatabase { |
60c5eb7d XL |
200 | fn salsa_runtime(&self) -> &salsa::Runtime<MyDatabase> { |
201 | &self.runtime | |
202 | } | |
203 | } | |
204 | ``` | |
205 | ||
206 | Example usage: | |
207 | ||
208 | ```rust,ignore | |
209 | fn main() { | |
210 | let db = MyDatabase::default(); | |
211 | db.set_manifest(...); | |
212 | db.set_source_text(...); | |
213 | loop { | |
214 | db.ast(...); //will reuse results | |
215 | db.set_source_text(...); | |
216 | } | |
217 | } | |
218 | ``` |