Commit | Line | Data |
---|---|---|
9e0c209e SL |
1 | //! This module manages how the incremental compilation cache is represented in |
2 | //! the file system. | |
3 | //! | |
4 | //! Incremental compilation caches are managed according to a copy-on-write | |
5 | //! strategy: Once a complete, consistent cache version is finalized, it is | |
6 | //! never modified. Instead, when a subsequent compilation session is started, | |
7 | //! the compiler will allocate a new version of the cache that starts out as | |
8 | //! a copy of the previous version. Then only this new copy is modified and it | |
9 | //! will not be visible to other processes until it is finalized. This ensures | |
10 | //! that multiple compiler processes can be executed concurrently for the same | |
11 | //! crate without interfering with each other or blocking each other. | |
12 | //! | |
13 | //! More concretely this is implemented via the following protocol: | |
14 | //! | |
15 | //! 1. For a newly started compilation session, the compiler allocates a | |
16 | //! new `session` directory within the incremental compilation directory. | |
17 | //! This session directory will have a unique name that ends with the suffix | |
18 | //! "-working" and that contains a creation timestamp. | |
19 | //! 2. Next, the compiler looks for the newest finalized session directory, | |
20 | //! that is, a session directory from a previous compilation session that | |
21 | //! has been marked as valid and consistent. A session directory is | |
22 | //! considered finalized if the "-working" suffix in the directory name has | |
23 | //! been replaced by the SVH of the crate. | |
24 | //! 3. Once the compiler has found a valid, finalized session directory, it will | |
25 | //! hard-link/copy its contents into the new "-working" directory. If all | |
26 | //! goes well, it will have its own, private copy of the source directory and | |
27 | //! subsequently not have to worry about synchronizing with other compiler | |
28 | //! processes. | |
29 | //! 4. Now the compiler can do its normal compilation process, which involves | |
30 | //! reading and updating its private session directory. | |
31 | //! 5. When compilation finishes without errors, the private session directory | |
32 | //! will be in a state where it can be used as input for other compilation | |
33 | //! sessions. That is, it will contain a dependency graph and cache artifacts | |
34 | //! that are consistent with the state of the source code it was compiled | |
35 | //! from, with no need to change them ever again. At this point, the compiler | |
36 | //! finalizes and "publishes" its private session directory by renaming it | |
37 | //! from "s-{timestamp}-{random}-working" to "s-{timestamp}-{SVH}". | |
38 | //! 6. At this point the "old" session directory that we copied our data from | |
39 | //! at the beginning of the session has become obsolete because we have just | |
40 | //! published a more current version. Thus the compiler will delete it. | |
41 | //! | |
42 | //! ## Garbage Collection | |
43 | //! | |
44 | //! Naively following the above protocol might lead to old session directories | |
45 | //! piling up if a compiler instance crashes for some reason before it's able to | |
46 | //! remove its private session directory. In order to avoid wasting disk space, | |
47 | //! the compiler also does some garbage collection each time it is started in | |
48 | //! incremental compilation mode. Specifically, it will scan the incremental | |
49 | //! compilation directory for private session directories that are not in use | |
50 | //! any more and will delete those. It will also delete any finalized session | |
51 | //! directories for a given crate except for the most recent one. | |
52 | //! | |
53 | //! ## Synchronization | |
54 | //! | |
55 | //! There is some synchronization needed in order for the compiler to be able to | |
56 | //! determine whether a given private session directory is not in use any more. | |
57 | //! This is done by creating a lock file for each session directory and | |
58 | //! locking it while the directory is still being used. Since file locks have | |
59 | //! operating system support, we can rely on the lock being released if the | |
60 | //! compiler process dies for some unexpected reason. Thus, when garbage | |
61 | //! collecting private session directories, the collecting process can determine | |
62 | //! whether the directory is still in use by trying to acquire a lock on the | |
63 | //! file. If locking the file fails, the original process must still be alive. | |
64 | //! If locking the file succeeds, we know that the owning process is not alive | |
65 | //! any more and we can safely delete the directory. | |
66 | //! There is still a small time window between the original process creating the | |
67 | //! lock file and actually locking it. In order to minimize the chance that | |
68 | //! another process tries to acquire the lock in just that instant, only | |
69 | //! session directories that are older than a few seconds are considered for | |
70 | //! garbage collection. | |
71 | //! | |
72 | //! Another case that has to be considered is what happens if one process | |
73 | //! deletes a finalized session directory that another process is currently | |
74 | //! trying to copy from. This case is also handled via the lock file. Before | |
75 | //! a process starts copying a finalized session directory, it will acquire a | |
76 | //! shared lock on the directory's lock file. Any garbage collecting process, | |
77 | //! on the other hand, will acquire an exclusive lock on the lock file. | |
78 | //! Thus, if a directory is being collected, any reader process will fail | |
79 | //! acquiring the shared lock and will leave the directory alone. Conversely, | |
80 | //! if a collecting process can't acquire the exclusive lock because the | |
81 | //! directory is currently being read from, it will leave collecting that | |
82 | //! directory to another process at a later point in time. | |
83 | //! The exact same scheme is also used when reading the metadata hashes file | |
84 | //! from an extern crate. When a crate is compiled, the hash values of its | |
85 | //! metadata are stored in a file in its session directory. When the | |
86 | //! compilation session of another crate imports the first crate's metadata, | |
87 | //! it also has to read in the accompanying metadata hashes. It thus will access | |
88 | //! the finalized session directory of all crates it links to and while doing | |
89 | //! so, it will also place a read lock on the respective session directory | |
90 | //! so that it won't be deleted while the metadata hashes are loaded. | |
91 | //! | |
92 | //! ## Preconditions | |
93 | //! | |
94 | //! This system relies on two features being available in the file system in | |
95 | //! order to work really well: file locking and hard linking. | |
96 | //! If hard linking is not available (like on FAT) the data in the cache | |
97 | //! actually has to be copied at the beginning of each session. | |
98 | //! If file locking does not work reliably (like on NFS), some of the | |
99 | //! synchronization will go haywire. | |
100 | //! In both cases we recommend locating the incremental compilation directory | |
101 | //! on a file system that supports these things. | |
102 | //! It might be a good idea though to try and detect whether we are on an | |
103 | //! unsupported file system and emit a warning in that case. This is not yet | |
104 | //! implemented. | |
105 | ||
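The naming step of the protocol above (step 5) can be sketched in isolation. This is a minimal illustration, not the compiler's code: `finalized_name` is a hypothetical helper, and the SVH is assumed to arrive already encoded as a string.

```rust
/// Hypothetical helper: turns a private "-working" directory name into its
/// finalized name by swapping the suffix for an already-encoded SVH string.
fn finalized_name(working_name: &str, encoded_svh: &str) -> Option<String> {
    // A well-formed name looks like "s-{timestamp}-{random}-working",
    // so it must contain exactly three dashes.
    let dashes: Vec<usize> = working_name.match_indices('-').map(|(i, _)| i).collect();
    if dashes.len() != 3 || !working_name.ends_with("-working") {
        return None;
    }
    // Keep "s-{timestamp}-{random}-" (including the last dash), then append the SVH.
    let mut name = String::from(&working_name[..=dashes[2]]);
    name.push_str(encoded_svh);
    Some(name)
}
```

Returning `None` for a malformed name is a choice made for this sketch; the module itself treats a malformed session directory name as a compiler bug.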
9ffffee4 | 106 | use crate::errors; |
dfeec247 | 107 | use rustc_data_structures::fx::{FxHashMap, FxHashSet}; |
b7449926 | 108 | use rustc_data_structures::svh::Svh; |
dfeec247 | 109 | use rustc_data_structures::{base_n, flock}; |
5e7ed085 | 110 | use rustc_errors::ErrorGuaranteed; |
dfeec247 | 111 | use rustc_fs_util::{link_or_copy, LinkOrCopy}; |
136023e0 | 112 | use rustc_session::{Session, StableCrateId}; |
487cf647 | 113 | use rustc_span::Symbol; |
9e0c209e | 114 | |
9e0c209e | 115 | use std::fs as std_fs; |
5e7ed085 | 116 | use std::io::{self, ErrorKind}; |
9e0c209e | 117 | use std::path::{Path, PathBuf}; |
dfeec247 | 118 | use std::time::{Duration, SystemTime, UNIX_EPOCH}; |
abe05a73 | 119 | |
dfeec247 | 120 | use rand::{thread_rng, RngCore}; |
9e0c209e | 121 | |
416331ca XL |
122 | #[cfg(test)] |
123 | mod tests; | |
124 | ||
0731742a XL |
125 | const LOCK_FILE_EXT: &str = ".lock"; |
126 | const DEP_GRAPH_FILENAME: &str = "dep-graph.bin"; | |
cdc7bbd5 | 127 | const STAGING_DEP_GRAPH_FILENAME: &str = "dep-graph.part.bin"; |
0731742a XL |
128 | const WORK_PRODUCTS_FILENAME: &str = "work-products.bin"; |
129 | const QUERY_CACHE_FILENAME: &str = "query-cache.bin"; | |
9e0c209e | 130 | |
476ff2be SL |
131 | // We encode integers using the following base, so they are shorter than decimal |
132 | // or hexadecimal numbers (we want short file and directory names). Since these | |
133 | // numbers will be used in file names, we choose an encoding that is not | |
134 | // case-sensitive (as opposed to base64, for example). | |
ff7c6d11 | 135 | const INT_ENCODE_BASE: usize = base_n::CASE_INSENSITIVE; |
476ff2be | 136 | |
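To make the encoding choice concrete, here is a small sketch of a case-insensitive integer encoding. It assumes the base is 36 (digits `0-9` followed by `a-z`); the value of `base_n::CASE_INSENSITIVE` is not shown in this file, and `encode_base36` is a hypothetical stand-in rather than the real `base_n::encode`.

```rust
/// Hypothetical stand-in for `base_n::encode`, fixed to base 36 (an assumption).
fn encode_base36(mut n: u128) -> String {
    const DIGITS: &[u8; 36] = b"0123456789abcdefghijklmnopqrstuvwxyz";
    let mut out = Vec::new();
    // Emit the least significant digit first, then reverse at the end.
    loop {
        out.push(DIGITS[(n % 36) as usize]);
        n /= 36;
        if n == 0 {
            break;
        }
    }
    out.reverse();
    String::from_utf8(out).expect("all digits are ASCII")
}
```

With this digit set the output can be parsed back with `u64::from_str_radix(s, 36)`, which matches how the module decodes timestamps from directory names.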
a2a8927a | 137 | /// Returns the path to a session's dependency graph. |
9e0c209e SL |
138 | pub fn dep_graph_path(sess: &Session) -> PathBuf { |
139 | in_incr_comp_dir_sess(sess, DEP_GRAPH_FILENAME) | |
140 | } | |
a2a8927a XL |
141 | /// Returns the path to a session's staging dependency graph. |
142 | /// | |
143 | /// On the difference between dep-graph and staging dep-graph, | |
144 | /// see `build_dep_graph`. | |
cdc7bbd5 XL |
145 | pub fn staging_dep_graph_path(sess: &Session) -> PathBuf { |
146 | in_incr_comp_dir_sess(sess, STAGING_DEP_GRAPH_FILENAME) | |
147 | } | |
9e0c209e SL |
148 | pub fn work_products_path(sess: &Session) -> PathBuf { |
149 | in_incr_comp_dir_sess(sess, WORK_PRODUCTS_FILENAME) | |
150 | } | |
a2a8927a | 151 | /// Returns the path to a session's query cache. |
abe05a73 XL |
152 | pub fn query_cache_path(sess: &Session) -> PathBuf { |
153 | in_incr_comp_dir_sess(sess, QUERY_CACHE_FILENAME) | |
154 | } | |
155 | ||
a2a8927a | 156 | /// Returns the path to the lock file of a given session directory.
9e0c209e SL |
157 | pub fn lock_file_path(session_dir: &Path) -> PathBuf { |
158 | let crate_dir = session_dir.parent().unwrap(); | |
159 | ||
160 | let directory_name = session_dir.file_name().unwrap().to_string_lossy(); | |
161 | assert_no_characters_lost(&directory_name); | |
162 | ||
74b04a01 | 163 | let dash_indices: Vec<_> = directory_name.match_indices('-').map(|(idx, _)| idx).collect(); |
9e0c209e | 164 | if dash_indices.len() != 3 { |
dfeec247 XL |
165 | bug!( |
166 | "Encountered incremental compilation session directory with \ | |
9e0c209e | 167 | malformed name: {}", |
dfeec247 XL |
168 | session_dir.display() |
169 | ) | |
9e0c209e SL |
170 | } |
171 | ||
dfeec247 | 172 | crate_dir.join(&directory_name[0..dash_indices[2]]).with_extension(&LOCK_FILE_EXT[1..]) |
9e0c209e SL |
173 | } |
174 | ||
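The lock file name is derived from the first three dash-separated parts of the session directory name, so a working directory and its finalized successor share one lock file. A minimal sketch of that derivation, using a hypothetical `lock_path_for` that returns `None` instead of reporting a compiler bug:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper mirroring the derivation in `lock_file_path`.
fn lock_path_for(session_dir: &Path) -> Option<PathBuf> {
    let crate_dir = session_dir.parent()?;
    let name = session_dir.file_name()?.to_str()?;
    // Expect "s-{timestamp}-{random}-{suffix}": exactly three dashes.
    let dashes: Vec<usize> = name.match_indices('-').map(|(i, _)| i).collect();
    if dashes.len() != 3 {
        return None;
    }
    // Drop the "-{suffix}" part and attach the ".lock" extension.
    Some(crate_dir.join(&name[..dashes[2]]).with_extension("lock"))
}
```

For example, both `s-abc-123-working` and `s-abc-123-xyz` map to `s-abc-123.lock` in the same crate directory.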
a2a8927a XL |
175 | /// Returns the path for a given filename within the incremental compilation directory |
176 | /// in the current session. | |
9e0c209e SL |
177 | pub fn in_incr_comp_dir_sess(sess: &Session, file_name: &str) -> PathBuf { |
178 | in_incr_comp_dir(&sess.incr_comp_session_dir(), file_name) | |
179 | } | |
180 | ||
a2a8927a XL |
181 | /// Returns the path for a given filename within the incremental compilation directory, |
182 | /// not necessarily from the current session. | |
183 | /// | |
184 | /// To ensure the file is part of the current session, use [`in_incr_comp_dir_sess`]. | |
9e0c209e SL |
185 | pub fn in_incr_comp_dir(incr_comp_session_dir: &Path, file_name: &str) -> PathBuf { |
186 | incr_comp_session_dir.join(file_name) | |
187 | } | |
188 | ||
a2a8927a XL |
189 | /// Allocates the private session directory. |
190 | /// | |
191 | /// If the result of this function is `Ok`, we have a valid incremental | |
192 | /// compilation session directory. A valid session | |
9e0c209e SL |
193 | /// directory is one that contains a locked lock file. It may or may not contain |
194 | /// a dep-graph and work products from a previous session. | |
a2a8927a XL |
195 | /// |
196 | /// This always attempts to load a dep-graph from the directory. | |
197 | /// If loading fails for some reason, we fall back to a disabled `DepGraph`. | |
198 | /// See [`rustc_interface::queries::dep_graph`]. | |
199 | /// | |
200 | /// If this function returns an error, it may leave behind an invalid session directory. | |
9e0c209e | 201 | /// The garbage collection will take care of it. |
a2a8927a XL |
202 | /// |
203 | /// [`rustc_interface::queries::dep_graph`]: ../../rustc_interface/struct.Queries.html#structfield.dep_graph | |
dfeec247 XL |
204 | pub fn prepare_session_directory( |
205 | sess: &Session, | |
487cf647 | 206 | crate_name: Symbol, |
136023e0 | 207 | stable_crate_id: StableCrateId, |
5e7ed085 | 208 | ) -> Result<(), ErrorGuaranteed> { |
ea8adc8c | 209 | if sess.opts.incremental.is_none() { |
17df50a5 | 210 | return Ok(()); |
ea8adc8c XL |
211 | } |
212 | ||
dfeec247 XL |
213 | let _timer = sess.timer("incr_comp_prepare_session_directory"); |
214 | ||
9e0c209e SL |
215 | debug!("prepare_session_directory"); |
216 | ||
217 | // {incr-comp-dir}/{crate-name-and-disambiguator} | |
136023e0 | 218 | let crate_dir = crate_path(sess, crate_name, stable_crate_id); |
9e0c209e | 219 | debug!("crate-dir: {}", crate_dir.display()); |
17df50a5 | 220 | create_dir(sess, &crate_dir, "crate")?; |
9e0c209e | 221 | |
476ff2be SL |
222 | // Hack: canonicalize the path *after creating the directory* |
223 | // because, on windows, long paths can cause problems; | |
224 | // canonicalization inserts this weird prefix that makes windows | |
225 | // tolerate long paths. | |
226 | let crate_dir = match crate_dir.canonicalize() { | |
227 | Ok(v) => v, | |
228 | Err(err) => { | |
9ffffee4 | 229 | return Err(sess.emit_err(errors::CanonicalizePath { path: crate_dir, err })); |
476ff2be SL |
230 | } |
231 | }; | |
232 | ||
0bf4aa26 | 233 | let mut source_directories_already_tried = FxHashSet::default(); |
9e0c209e SL |
234 | |
235 | loop { | |
236 | // Generate a session directory of the form: | |
237 | // | |
238 | // {incr-comp-dir}/{crate-name-and-disambiguator}/s-{timestamp}-{random}-working | |
239 | let session_dir = generate_session_dir_path(&crate_dir); | |
240 | debug!("session-dir: {}", session_dir.display()); | |
241 | ||
242 | // Lock the new session directory. If this fails, return an | |
243 | // error without retrying | |
17df50a5 | 244 | let (directory_lock, lock_file_path) = lock_directory(sess, &session_dir)?; |
9e0c209e SL |
245 | |
246 | // Now that we have the lock, we can actually create the session | |
247 | // directory | |
17df50a5 | 248 | create_dir(sess, &session_dir, "session")?; |
9e0c209e SL |
249 | |
250 | // Find a suitable source directory to copy from. Ignore those that we | |
251 | // have already tried before. | |
dfeec247 | 252 | let source_directory = find_source_directory(&crate_dir, &source_directories_already_tried); |
9e0c209e | 253 | |
3c0e092e | 254 | let Some(source_directory) = source_directory else { |
9e0c209e | 255 | // There's nowhere to copy from, we're done |
dfeec247 XL |
256 | debug!( |
257 | "no source directory found. Continuing with empty session \ | |
258 | directory." | |
259 | ); | |
9e0c209e | 260 | |
ea8adc8c | 261 | sess.init_incr_comp_session(session_dir, directory_lock, false); |
17df50a5 | 262 | return Ok(()); |
9e0c209e SL |
263 | }; |
264 | ||
dfeec247 | 265 | debug!("attempting to copy data from source: {}", source_directory.display()); |
9e0c209e SL |
266 | |
267 | // Try copying over all files from the source directory | |
dfeec247 XL |
268 | if let Ok(allows_links) = copy_files(sess, &session_dir, &source_directory) { |
269 | debug!("successfully copied data from: {}", source_directory.display()); | |
9e0c209e | 270 | |
c30ab7b3 | 271 | if !allows_links { |
9ffffee4 | 272 | sess.emit_warning(errors::HardLinkFailed { path: &session_dir }); |
c30ab7b3 SL |
273 | } |
274 | ||
ea8adc8c | 275 | sess.init_incr_comp_session(session_dir, directory_lock, true); |
17df50a5 | 276 | return Ok(()); |
9e0c209e | 277 | } else { |
dfeec247 | 278 | debug!("copying failed - trying next directory"); |
9e0c209e SL |
279 | |
280 | // Something went wrong while trying to copy/link files from the | |
281 | // source directory. Try again with a different one. | |
282 | source_directories_already_tried.insert(source_directory); | |
283 | ||
284 | // Try to remove the session directory we just allocated. We don't | |
285 | // know if there's any garbage in it from the failed copy action. | |
286 | if let Err(err) = safe_remove_dir_all(&session_dir) { | |
9ffffee4 | 287 | sess.emit_warning(errors::DeletePartial { path: &session_dir, err }); |
9e0c209e SL |
288 | } |
289 | ||
ea8adc8c | 290 | delete_session_dir_lock_file(sess, &lock_file_path); |
9c376795 | 291 | drop(directory_lock); |
9e0c209e SL |
292 | } |
293 | } | |
294 | } | |
295 | ||
9e0c209e SL |
296 | /// This function finalizes and thus 'publishes' the session directory by |
297 | /// renaming it to `s-{timestamp}-{svh}` and releasing the file lock. | |
298 | /// If there have been compilation errors, however, this function will just | |
299 | /// delete the presumably invalid session directory. | |
300 | pub fn finalize_session_directory(sess: &Session, svh: Svh) { | |
301 | if sess.opts.incremental.is_none() { | |
302 | return; | |
303 | } | |
304 | ||
dfeec247 XL |
305 | let _timer = sess.timer("incr_comp_finalize_session_directory"); |
306 | ||
9e0c209e SL |
307 | let incr_comp_session_dir: PathBuf = sess.incr_comp_session_dir().clone(); |
308 | ||
487cf647 | 309 | if let Some(_) = sess.has_errors_or_delayed_span_bugs() { |
9e0c209e SL |
310 | // If there have been any errors during compilation, we don't want to |
311 | // publish this session directory. Rather, we'll just delete it. | |
312 | ||
dfeec247 XL |
313 | debug!( |
314 | "finalize_session_directory() - invalidating session directory: {}", | |
315 | incr_comp_session_dir.display() | |
316 | ); | |
9e0c209e SL |
317 | |
318 | if let Err(err) = safe_remove_dir_all(&*incr_comp_session_dir) { | |
9ffffee4 | 319 | sess.emit_warning(errors::DeleteFull { path: &incr_comp_session_dir, err }); |
9e0c209e SL |
320 | } |
321 | ||
322 | let lock_file_path = lock_file_path(&*incr_comp_session_dir); | |
323 | delete_session_dir_lock_file(sess, &lock_file_path); | |
324 | sess.mark_incr_comp_session_as_invalid(); | |
325 | } | |
326 | ||
dfeec247 | 327 | debug!("finalize_session_directory() - session directory: {}", incr_comp_session_dir.display()); |
9e0c209e | 328 | |
dfeec247 | 329 | let old_sub_dir_name = incr_comp_session_dir.file_name().unwrap().to_string_lossy(); |
9e0c209e SL |
330 | assert_no_characters_lost(&old_sub_dir_name); |
331 | ||
332 | // Keep the 's-{timestamp}-{random-number}' prefix, but replace the | |
333 | // '-working' part with the SVH of the crate | |
74b04a01 | 334 | let dash_indices: Vec<_> = old_sub_dir_name.match_indices('-').map(|(idx, _)| idx).collect(); |
9e0c209e | 335 | if dash_indices.len() != 3 { |
dfeec247 XL |
336 | bug!( |
337 | "Encountered incremental compilation session directory with \ | |
9e0c209e | 338 | malformed name: {}", |
dfeec247 XL |
339 | incr_comp_session_dir.display() |
340 | ) | |
9e0c209e SL |
341 | } |
342 | ||
343 | // State: "s-{timestamp}-{random-number}-" | |
dfeec247 | 344 | let mut new_sub_dir_name = String::from(&old_sub_dir_name[..=dash_indices[2]]); |
9e0c209e SL |
345 | |
346 | // Append the svh | |
ff7c6d11 | 347 | base_n::push_str(svh.as_u64() as u128, INT_ENCODE_BASE, &mut new_sub_dir_name); |
9e0c209e SL |
348 | |
349 | // Create the full path | |
350 | let new_path = incr_comp_session_dir.parent().unwrap().join(new_sub_dir_name); | |
351 | debug!("finalize_session_directory() - new path: {}", new_path.display()); | |
352 | ||
5e7ed085 | 353 | match rename_path_with_retry(&*incr_comp_session_dir, &new_path, 3) { |
9e0c209e SL |
354 | Ok(_) => { |
355 | debug!("finalize_session_directory() - directory renamed successfully"); | |
356 | ||
357 | // This unlocks the directory | |
358 | sess.finalize_incr_comp_session(new_path); | |
359 | } | |
360 | Err(e) => { | |
361 | // Warn about the error. However, no need to abort compilation now. | |
9ffffee4 | 362 | sess.emit_warning(errors::Finalize { path: &incr_comp_session_dir, err: e }); |
9e0c209e SL |
363 | |
364 | debug!("finalize_session_directory() - error, marking as invalid"); | |
365 | // Drop the file lock, so we can garbage collect | |
366 | sess.mark_incr_comp_session_as_invalid(); | |
367 | } | |
368 | } | |
369 | ||
370 | let _ = garbage_collect_session_directories(sess); | |
371 | } | |
372 | ||
373 | pub fn delete_all_session_dir_contents(sess: &Session) -> io::Result<()> { | |
374 | let sess_dir_iterator = sess.incr_comp_session_dir().read_dir()?; | |
375 | for entry in sess_dir_iterator { | |
376 | let entry = entry?; | |
377 | safe_remove_file(&entry.path())? | |
378 | } | |
379 | Ok(()) | |
380 | } | |
381 | ||
dfeec247 | 382 | fn copy_files(sess: &Session, target_dir: &Path, source_dir: &Path) -> Result<bool, ()> { |
9e0c209e SL |
383 | // We acquire a shared lock on the lock file of the directory, so that |
384 | // nobody deletes it out from under us while we are reading from it. | |
385 | let lock_file_path = lock_file_path(source_dir); | |
3c0e092e XL |
386 | |
387 | // not exclusive | |
388 | let Ok(_lock) = flock::Lock::new( | |
dfeec247 XL |
389 | &lock_file_path, |
390 | false, // don't wait | |
391 | false, // don't create | |
392 | false, // not exclusive | |
3c0e092e | 393 | ) else { |
9e0c209e | 394 | // Could not acquire the lock, don't try to copy from here |
dfeec247 | 395 | return Err(()); |
9e0c209e SL |
396 | }; |
397 | ||
5e7ed085 FG |
398 | let Ok(source_dir_iterator) = source_dir.read_dir() else { |
399 | return Err(()); | |
9e0c209e SL |
400 | }; |
401 | ||
402 | let mut files_linked = 0; | |
403 | let mut files_copied = 0; | |
404 | ||
405 | for entry in source_dir_iterator { | |
406 | match entry { | |
407 | Ok(entry) => { | |
408 | let file_name = entry.file_name(); | |
409 | ||
410 | let target_file_path = target_dir.join(file_name); | |
411 | let source_path = entry.path(); | |
412 | ||
413 | debug!("copying into session dir: {}", source_path.display()); | |
b7449926 | 414 | match link_or_copy(source_path, target_file_path) { |
dfeec247 XL |
415 | Ok(LinkOrCopy::Link) => files_linked += 1, |
416 | Ok(LinkOrCopy::Copy) => files_copied += 1, | |
417 | Err(_) => return Err(()), | |
9e0c209e SL |
418 | } |
419 | } | |
dfeec247 | 420 | Err(_) => return Err(()), |
9e0c209e SL |
421 | } |
422 | } | |
423 | ||
064997fb | 424 | if sess.opts.unstable_opts.incremental_info { |
6a06907d | 425 | eprintln!( |
dfeec247 XL |
426 | "[incremental] session directory: \ |
427 | {} files hard-linked", | |
428 | files_linked | |
429 | ); | |
6a06907d | 430 | eprintln!( |
dfeec247 XL |
431 | "[incremental] session directory: \ |
432 | {} files copied", | |
433 | files_copied | |
434 | ); | |
9e0c209e SL |
435 | } |
436 | ||
c30ab7b3 | 437 | Ok(files_linked > 0 || files_copied == 0) |
9e0c209e SL |
438 | } |
439 | ||
9fa01778 | 440 | /// Generates a unique directory path of the form:
9e0c209e SL |
441 | /// {crate_dir}/s-{timestamp}-{random-number}-working |
442 | fn generate_session_dir_path(crate_dir: &Path) -> PathBuf { | |
443 | let timestamp = timestamp_to_string(SystemTime::now()); | |
444 | debug!("generate_session_dir_path: timestamp = {}", timestamp); | |
445 | let random_number = thread_rng().next_u32(); | |
446 | debug!("generate_session_dir_path: random_number = {}", random_number); | |
447 | ||
dfeec247 XL |
448 | let directory_name = format!( |
449 | "s-{}-{}-working", | |
450 | timestamp, | |
451 | base_n::encode(random_number as u128, INT_ENCODE_BASE) | |
452 | ); | |
9e0c209e SL |
453 | debug!("generate_session_dir_path: directory_name = {}", directory_name); |
454 | let directory_path = crate_dir.join(directory_name); | |
455 | debug!("generate_session_dir_path: directory_path = {}", directory_path.display()); | |
456 | directory_path | |
457 | } | |
458 | ||
5e7ed085 | 459 | fn create_dir(sess: &Session, path: &Path, dir_tag: &str) -> Result<(), ErrorGuaranteed> { |
cc61c64b | 460 | match std_fs::create_dir_all(path) { |
9e0c209e SL |
461 | Ok(()) => { |
462 | debug!("{} directory created successfully", dir_tag); | |
463 | Ok(()) | |
464 | } | |
9ffffee4 | 465 | Err(err) => Err(sess.emit_err(errors::CreateIncrCompDir { tag: dir_tag, path, err })), |
9e0c209e SL |
466 | } |
467 | } | |
468 | ||
a1dfa0c6 | 469 | /// Allocates the lock file and locks it.
17df50a5 XL |
470 | fn lock_directory( |
471 | sess: &Session, | |
472 | session_dir: &Path, | |
5e7ed085 | 473 | ) -> Result<(flock::Lock, PathBuf), ErrorGuaranteed> { |
9e0c209e SL |
474 | let lock_file_path = lock_file_path(session_dir); |
475 | debug!("lock_directory() - lock_file: {}", lock_file_path.display()); | |
476 | ||
dfeec247 XL |
477 | match flock::Lock::new( |
478 | &lock_file_path, | |
479 | false, // don't wait | |
480 | true, // create the lock file | |
481 | true, // the lock should be exclusive | |
482 | ) { | |
483 | // the lock should be exclusive | |
9e0c209e | 484 | Ok(lock) => Ok((lock, lock_file_path)), |
17df50a5 | 485 | Err(lock_err) => { |
9ffffee4 FG |
486 | let is_unsupported_lock = flock::Lock::error_unsupported(&lock_err).then_some(()); |
487 | Err(sess.emit_err(errors::CreateLock { | |
488 | lock_err, | |
489 | session_dir, | |
490 | is_unsupported_lock, | |
491 | is_cargo: std::env::var_os("CARGO").map(|_| ()), | |
492 | })) | |
9e0c209e SL |
493 | } |
494 | } | |
495 | } | |
496 | ||
dfeec247 | 497 | fn delete_session_dir_lock_file(sess: &Session, lock_file_path: &Path) { |
9e0c209e | 498 | if let Err(err) = safe_remove_file(&lock_file_path) { |
9ffffee4 | 499 | sess.emit_warning(errors::DeleteLock { path: lock_file_path, err }); |
9e0c209e SL |
500 | } |
501 | } | |
502 | ||
9fa01778 | 503 | /// Finds the most recent published session directory that is not in the |
9e0c209e | 504 | /// ignore-list. |
dfeec247 XL |
505 | fn find_source_directory( |
506 | crate_dir: &Path, | |
507 | source_directories_already_tried: &FxHashSet<PathBuf>, | |
508 | ) -> Option<PathBuf> { | |
509 | let iter = crate_dir | |
510 | .read_dir() | |
511 | .unwrap() // FIXME | |
512 | .filter_map(|e| e.ok().map(|e| e.path())); | |
9e0c209e SL |
513 | |
514 | find_source_directory_in_iter(iter, source_directories_already_tried) | |
515 | } | |
516 | ||
dfeec247 XL |
517 | fn find_source_directory_in_iter<I>( |
518 | iter: I, | |
519 | source_directories_already_tried: &FxHashSet<PathBuf>, | |
520 | ) -> Option<PathBuf> | |
521 | where | |
522 | I: Iterator<Item = PathBuf>, | |
9e0c209e SL |
523 | { |
524 | let mut best_candidate = (UNIX_EPOCH, None); | |
525 | ||
526 | for session_dir in iter { | |
dfeec247 | 527 | debug!("find_source_directory_in_iter - inspecting `{}`", session_dir.display()); |
9e0c209e SL |
528 | |
529 | let directory_name = session_dir.file_name().unwrap().to_string_lossy(); | |
530 | assert_no_characters_lost(&directory_name); | |
531 | ||
dfeec247 XL |
532 | if source_directories_already_tried.contains(&session_dir) |
533 | || !is_session_directory(&directory_name) | |
534 | || !is_finalized(&directory_name) | |
535 | { | |
416331ca | 536 | debug!("find_source_directory_in_iter - ignoring"); |
dfeec247 | 537 | continue; |
9e0c209e SL |
538 | } |
539 | ||
dfeec247 XL |
540 | let timestamp = extract_timestamp_from_session_dir(&directory_name).unwrap_or_else(|_| { |
541 | bug!("unexpected incr-comp session dir: {}", session_dir.display()) | |
542 | }); | |
9e0c209e SL |
543 | |
544 | if timestamp > best_candidate.0 { | |
545 | best_candidate = (timestamp, Some(session_dir.clone())); | |
546 | } | |
547 | } | |
548 | ||
549 | best_candidate.1 | |
550 | } | |
551 | ||
552 | fn is_finalized(directory_name: &str) -> bool { | |
553 | !directory_name.ends_with("-working") | |
554 | } | |
555 | ||
556 | fn is_session_directory(directory_name: &str) -> bool { | |
dfeec247 | 557 | directory_name.starts_with("s-") && !directory_name.ends_with(LOCK_FILE_EXT) |
9e0c209e SL |
558 | } |
559 | ||
560 | fn is_session_directory_lock_file(file_name: &str) -> bool { | |
561 | file_name.starts_with("s-") && file_name.ends_with(LOCK_FILE_EXT) | |
562 | } | |
563 | ||
dfeec247 | 564 | fn extract_timestamp_from_session_dir(directory_name: &str) -> Result<SystemTime, ()> { |
9e0c209e | 565 | if !is_session_directory(directory_name) { |
dfeec247 | 566 | return Err(()); |
9e0c209e SL |
567 | } |
568 | ||
74b04a01 | 569 | let dash_indices: Vec<_> = directory_name.match_indices('-').map(|(idx, _)| idx).collect(); |
9e0c209e | 570 | if dash_indices.len() != 3 { |
dfeec247 | 571 | return Err(()); |
9e0c209e SL |
572 | } |
573 | ||
dfeec247 | 574 | string_to_timestamp(&directory_name[dash_indices[0] + 1..dash_indices[1]]) |
9e0c209e SL |
575 | } |
576 | ||
9e0c209e SL |
577 | fn timestamp_to_string(timestamp: SystemTime) -> String { |
578 | let duration = timestamp.duration_since(UNIX_EPOCH).unwrap(); | |
dfeec247 | 579 | let micros = duration.as_secs() * 1_000_000 + (duration.subsec_nanos() as u64) / 1000; |
ff7c6d11 | 580 | base_n::encode(micros as u128, INT_ENCODE_BASE) |
9e0c209e SL |
581 | } |
582 | ||
583 | fn string_to_timestamp(s: &str) -> Result<SystemTime, ()> { | |
2c00a5a8 | 584 | let micros_since_unix_epoch = u64::from_str_radix(s, INT_ENCODE_BASE as u32); |
9e0c209e SL |
585 | |
586 | if micros_since_unix_epoch.is_err() { | |
dfeec247 | 587 | return Err(()); |
9e0c209e SL |
588 | } |
589 | ||
590 | let micros_since_unix_epoch = micros_since_unix_epoch.unwrap(); | |
591 | ||
dfeec247 XL |
592 | let duration = Duration::new( |
593 | micros_since_unix_epoch / 1_000_000, | |
594 | 1000 * (micros_since_unix_epoch % 1_000_000) as u32, | |
595 | ); | |
9e0c209e SL |
596 | Ok(UNIX_EPOCH + duration) |
597 | } | |
598 | ||
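The two conversions above store a timestamp as whole microseconds since the Unix epoch, so anything finer than a microsecond is dropped on the way in. A minimal sketch of the same round trip without the string encoding; `to_micros` and `from_micros` are hypothetical names used only for illustration:

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Whole microseconds since the Unix epoch, or `None` for pre-epoch times.
fn to_micros(t: SystemTime) -> Option<u64> {
    let d = t.duration_since(UNIX_EPOCH).ok()?;
    Some(d.as_secs() * 1_000_000 + u64::from(d.subsec_micros()))
}

/// Inverse of `to_micros`, exact up to microsecond resolution.
fn from_micros(micros: u64) -> SystemTime {
    UNIX_EPOCH + Duration::from_micros(micros)
}
```

Because the fractional part is truncated, a decoded timestamp can be up to one microsecond earlier than the original, which is harmless for ordering session directories.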
487cf647 | 599 | fn crate_path(sess: &Session, crate_name: Symbol, stable_crate_id: StableCrateId) -> PathBuf { |
9e0c209e SL |
600 | let incr_dir = sess.opts.incremental.as_ref().unwrap().clone(); |
601 | ||
136023e0 | 602 | let stable_crate_id = base_n::encode(stable_crate_id.to_u64() as u128, INT_ENCODE_BASE); |
9e0c209e | 603 | |
136023e0 | 604 | let crate_name = format!("{}-{}", crate_name, stable_crate_id); |
9e0c209e SL |
605 | incr_dir.join(crate_name) |
606 | } | |
607 | ||
608 | fn assert_no_characters_lost(s: &str) { | |
609 | if s.contains('\u{FFFD}') { | |
610 | bug!("Could not losslessly convert '{}'.", s) | |
611 | } | |
612 | } | |
613 | ||
614 | fn is_old_enough_to_be_collected(timestamp: SystemTime) -> bool { | |
615 | timestamp < SystemTime::now() - Duration::from_secs(10) | |
616 | } | |
617 | ||
a2a8927a | 618 | /// Runs garbage collection for the current session. |
9e0c209e SL |
619 | pub fn garbage_collect_session_directories(sess: &Session) -> io::Result<()> { |
620 | debug!("garbage_collect_session_directories() - begin"); | |
621 | ||
622 | let session_directory = sess.incr_comp_session_dir(); | |
dfeec247 XL |
623 | debug!( |
624 | "garbage_collect_session_directories() - session directory: {}", | |
625 | session_directory.display() | |
626 | ); | |
9e0c209e SL |
627 | |
628 | let crate_directory = session_directory.parent().unwrap(); | |
dfeec247 XL |
629 | debug!( |
630 | "garbage_collect_session_directories() - crate directory: {}", | |
631 | crate_directory.display() | |
632 | ); | |
9e0c209e SL |
633 | |
634 | // First do a pass over the crate directory, collecting lock files and | |
635 | // session directories | |
0bf4aa26 XL |
636 | let mut session_directories = FxHashSet::default(); |
637 | let mut lock_files = FxHashSet::default(); | |
9e0c209e | 638 | |
a1dfa0c6 | 639 | for dir_entry in crate_directory.read_dir()? { |
5e7ed085 FG |
640 | let Ok(dir_entry) = dir_entry else { |
641 | // Ignore any errors | |
642 | continue; | |
9e0c209e SL |
643 | }; |
644 | ||
645 | let entry_name = dir_entry.file_name(); | |
646 | let entry_name = entry_name.to_string_lossy(); | |
647 | ||
648 | if is_session_directory_lock_file(&entry_name) { | |
649 | assert_no_characters_lost(&entry_name); | |
650 | lock_files.insert(entry_name.into_owned()); | |
651 | } else if is_session_directory(&entry_name) { | |
652 | assert_no_characters_lost(&entry_name); | |
653 | session_directories.insert(entry_name.into_owned()); | |
654 | } else { | |
655 | // This is something we don't know, leave it alone | |
656 | } | |
657 | } | |
658 | ||
659 | // Now map from lock files to session directories | |
dfeec247 XL |
660 | let lock_file_to_session_dir: FxHashMap<String, Option<String>> = lock_files |
661 | .into_iter() | |
662 | .map(|lock_file_name| { | |
663 | assert!(lock_file_name.ends_with(LOCK_FILE_EXT)); | |
664 | let dir_prefix_end = lock_file_name.len() - LOCK_FILE_EXT.len(); | |
665 | let session_dir = { | |
666 | let dir_prefix = &lock_file_name[0..dir_prefix_end]; | |
667 | session_directories.iter().find(|dir_name| dir_name.starts_with(dir_prefix)) | |
668 | }; | |
669 | (lock_file_name, session_dir.map(String::clone)) | |
670 | }) | |
671 | .collect(); | |
9e0c209e SL |
672 | |
673 | // Delete all lock files that don't have an associated directory. They must | |
674 | // be some kind of leftover. | |
675 | for (lock_file_name, directory_name) in &lock_file_to_session_dir { | |
676 | if directory_name.is_none() { | |
5e7ed085 FG |
677 | let Ok(timestamp) = extract_timestamp_from_session_dir(lock_file_name) else { |
678 | debug!( | |
679 | "found lock-file with malformed timestamp: {}", | |
680 | crate_directory.join(&lock_file_name).display() | |
681 | ); | |
682 | // Ignore it | |
683 | continue; | |
9e0c209e SL |
684 | }; |
685 | ||
686 | let lock_file_path = crate_directory.join(&**lock_file_name); | |
687 | ||
688 | if is_old_enough_to_be_collected(timestamp) { | |
dfeec247 XL |
689 | debug!( |
690 | "garbage_collect_session_directories() - deleting \ | |
691 | garbage lock file: {}", | |
692 | lock_file_path.display() | |
693 | ); | |
9e0c209e SL |
694 | delete_session_dir_lock_file(sess, &lock_file_path); |
695 | } else { | |
dfeec247 XL |
696 | debug!( |
697 | "garbage_collect_session_directories() - lock file with \ | |
9e0c209e | 698 | no session dir not old enough to be collected: {}", |
dfeec247 XL |
699 | lock_file_path.display() |
700 | ); | |
9e0c209e SL |
701 | } |
702 | } | |
703 | } | |
704 | ||
705 | // Filter out `None` directories | |
dfeec247 XL |
706 | let lock_file_to_session_dir: FxHashMap<String, String> = lock_file_to_session_dir |
707 | .into_iter() | |
708 | .filter_map(|(lock_file_name, directory_name)| directory_name.map(|n| (lock_file_name, n))) | |
709 | .collect(); | |
9e0c209e | 710 | |
2c00a5a8 XL |
711 | // Delete all session directories that don't have a lock file. |
712 | for directory_name in session_directories { | |
713 | if !lock_file_to_session_dir.values().any(|dir| *dir == directory_name) { | |
714 | let path = crate_directory.join(directory_name); | |
715 | if let Err(err) = safe_remove_dir_all(&path) { | |
9ffffee4 | 716 | sess.emit_warning(errors::InvalidGcFailed { path: &path, err }); |
2c00a5a8 XL |
717 | } |
718 | } | |
719 | } | |
720 | ||
721 | // Now garbage collect the valid session directories. | |
9e0c209e | 722 | let mut deletion_candidates = vec![]; |
9e0c209e SL |
723 | |
724 | for (lock_file_name, directory_name) in &lock_file_to_session_dir { | |
dfeec247 | 725 | debug!("garbage_collect_session_directories() - inspecting: {}", directory_name); |
9e0c209e | 726 | |
5e7ed085 FG |
727 | let Ok(timestamp) = extract_timestamp_from_session_dir(directory_name) else { |
728 | debug!( | |
729 | "found session-dir with malformed timestamp: {}", | |
730 | crate_directory.join(directory_name).display() | |
731 | ); | |
732 | // Ignore it | |
733 | continue; | |
9e0c209e SL |
734 | }; |
735 | ||
736 | if is_finalized(directory_name) { | |
737 | let lock_file_path = crate_directory.join(lock_file_name); | |
dfeec247 XL |
738 | match flock::Lock::new( |
739 | &lock_file_path, | |
740 | false, // don't wait | |
741 | false, // don't create the lock-file | |
742 | true, | |
743 | ) { | |
744 | // The final `true` above requests an exclusive lock | |
9e0c209e | 745 | Ok(lock) => { |
dfeec247 XL |
746 | debug!( |
747 | "garbage_collect_session_directories() - \ | |
748 | successfully acquired lock" | |
749 | ); | |
750 | debug!( | |
751 | "garbage_collect_session_directories() - adding \ | |
752 | deletion candidate: {}", | |
753 | directory_name | |
754 | ); | |
9e0c209e SL |
755 | |
756 | // Note that we are holding on to the lock | |
dfeec247 XL |
757 | deletion_candidates.push(( |
758 | timestamp, | |
759 | crate_directory.join(directory_name), | |
760 | Some(lock), | |
761 | )); | |
9e0c209e SL |
762 | } |
763 | Err(_) => { | |
dfeec247 XL |
764 | debug!( |
765 | "garbage_collect_session_directories() - \ | |
766 | not collecting, still in use" | |
767 | ); | |
9e0c209e SL |
768 | } |
769 | } | |
770 | } else if is_old_enough_to_be_collected(timestamp) { | |
771 | // When cleaning out "-working" session directories, i.e. | |
772 | // session directories that might still be in use by another | |
773 | // compiler instance, we only look at directories that are | |
774 | // at least ten seconds old. This is supposed to reduce the | |
775 | // chance of deleting a directory in the time window where | |
776 | // the process has allocated the directory but has not yet | |
777 | // acquired the file-lock on it. | |
778 | ||
779 | // Try to acquire the directory lock. If we can't, it | |
780 | // means that the owning process is still alive and we | |
781 | // leave this directory alone. | |
782 | let lock_file_path = crate_directory.join(lock_file_name); | |
dfeec247 XL |
783 | match flock::Lock::new( |
784 | &lock_file_path, | |
785 | false, // don't wait | |
786 | false, // don't create the lock-file | |
787 | true, | |
788 | ) { | |
789 | // The final `true` above requests an exclusive lock | |
9e0c209e | 790 | Ok(lock) => { |
dfeec247 XL |
791 | debug!( |
792 | "garbage_collect_session_directories() - \ | |
793 | successfully acquired lock" | |
794 | ); | |
9e0c209e | 795 | |
29967ef6 XL |
796 | delete_old(sess, &crate_directory.join(directory_name)); |
797 | ||
798 | // Let's make it explicit that the file lock is released at this point, | |
799 | // or rather, that we held on to it until here | |
9c376795 | 800 | drop(lock); |
9e0c209e SL |
801 | } |
802 | Err(_) => { | |
dfeec247 XL |
803 | debug!( |
804 | "garbage_collect_session_directories() - \ | |
805 | not collecting, still in use" | |
806 | ); | |
9e0c209e SL |
807 | } |
808 | } | |
809 | } else { | |
dfeec247 XL |
810 | debug!( |
811 | "garbage_collect_session_directories() - not finalized, not \ | |
812 | old enough" | |
813 | ); | |
9e0c209e SL |
814 | } |
815 | } | |
816 | ||
817 | // Delete all but the most recent of the candidates | |
818 | for (path, lock) in all_except_most_recent(deletion_candidates) { | |
dfeec247 | 819 | debug!("garbage_collect_session_directories() - deleting `{}`", path.display()); |
9e0c209e SL |
820 | |
821 | if let Err(err) = safe_remove_dir_all(&path) { | |
9ffffee4 | 822 | sess.emit_warning(errors::FinalizedGcFailed { path: &path, err }); |
9e0c209e SL |
823 | } else { |
824 | delete_session_dir_lock_file(sess, &lock_file_path(&path)); | |
825 | } | |
826 | ||
9e0c209e SL |
827 | // Let's make it explicit that the file lock is released at this point, |
828 | // or rather, that we held on to it until here | |
9c376795 | 829 | drop(lock); |
9e0c209e SL |
830 | } |
831 | ||
29967ef6 XL |
832 | Ok(()) |
833 | } | |
9e0c209e | 834 | |
29967ef6 XL |
835 | fn delete_old(sess: &Session, path: &Path) { |
836 | debug!("garbage_collect_session_directories() - deleting `{}`", path.display()); | |
9e0c209e | 837 | |
29967ef6 | 838 | if let Err(err) = safe_remove_dir_all(&path) { |
9ffffee4 | 839 | sess.emit_warning(errors::SessionGcFailed { path: &path, err }); |
29967ef6 XL |
840 | } else { |
841 | delete_session_dir_lock_file(sess, &lock_file_path(&path)); | |
9e0c209e | 842 | } |
9e0c209e SL |
843 | } |
844 | ||
dfeec247 XL |
845 | fn all_except_most_recent( |
846 | deletion_candidates: Vec<(SystemTime, PathBuf, Option<flock::Lock>)>, | |
847 | ) -> FxHashMap<PathBuf, Option<flock::Lock>> { | |
848 | let most_recent = deletion_candidates.iter().map(|&(timestamp, ..)| timestamp).max(); | |
9e0c209e SL |
849 | |
850 | if let Some(most_recent) = most_recent { | |
dfeec247 XL |
851 | deletion_candidates |
852 | .into_iter() | |
853 | .filter(|&(timestamp, ..)| timestamp != most_recent) | |
854 | .map(|(_, path, lock)| (path, lock)) | |
855 | .collect() | |
9e0c209e | 856 | } else { |
0bf4aa26 | 857 | FxHashMap::default() |
9e0c209e SL |
858 | } |
859 | } | |
860 | ||
861 | /// Since paths of artifacts within session directories can get quite long, we | |
862 | /// need to support deleting files with very long paths. The regular | |
863 | /// WinApi functions only support paths up to 260 characters, however. In order | |
864 | /// to circumvent this limitation, we canonicalize the path of the directory | |
865 | /// before passing it to std::fs::remove_dir_all(). This will convert the path | |
866 | /// into the '\\?\' format, which supports much longer paths. | |
867 | fn safe_remove_dir_all(p: &Path) -> io::Result<()> { | |
5869c6ff XL |
868 | let canonicalized = match std_fs::canonicalize(p) { |
869 | Ok(canonicalized) => canonicalized, | |
870 | Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(()), | |
871 | Err(err) => return Err(err), | |
872 | }; | |
873 | ||
874 | std_fs::remove_dir_all(canonicalized) | |
9e0c209e SL |
875 | } |
876 | ||
877 | fn safe_remove_file(p: &Path) -> io::Result<()> { | |
5869c6ff XL |
878 | let canonicalized = match std_fs::canonicalize(p) { |
879 | Ok(canonicalized) => canonicalized, | |
880 | Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(()), | |
881 | Err(err) => return Err(err), | |
882 | }; | |
883 | ||
884 | match std_fs::remove_file(canonicalized) { | |
885 | Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()), | |
886 | result => result, | |
9e0c209e SL |
887 | } |
888 | } | |
5e7ed085 FG |
889 | |
890 | // On Windows the compiler would sometimes fail to rename the session directory because | |
891 | // the OS thought something was still being accessed in it. So we retry a few times to give | |
892 | // the OS time to catch up. | |
893 | // See https://github.com/rust-lang/rust/issues/86929. | |
894 | fn rename_path_with_retry(from: &Path, to: &Path, mut retries_left: usize) -> std::io::Result<()> { | |
895 | loop { | |
896 | match std_fs::rename(from, to) { | |
897 | Ok(()) => return Ok(()), | |
898 | Err(e) => { | |
899 | if retries_left > 0 && e.kind() == ErrorKind::PermissionDenied { | |
900 | // Try again after a short waiting period. | |
901 | std::thread::sleep(Duration::from_millis(50)); | |
902 | retries_left -= 1; | |
903 | } else { | |
904 | return Err(e); | |
905 | } | |
906 | } | |
907 | } | |
908 | } | |
909 | } |
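
The selection rule in `all_except_most_recent` can be tried in isolation. The following is a minimal sketch, not the compiler's code: the name `all_except_most_recent_sketch` is hypothetical, the standard `HashMap` stands in for `FxHashMap`, and a plain string label stands in for the `Option<flock::Lock>` that the real function carries along. It shows that every candidate whose timestamp differs from the maximum is returned for deletion, so candidates that tie for the newest timestamp are all kept.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::time::SystemTime;

/// Hypothetical stand-in for `all_except_most_recent`.
/// Assumptions: std `HashMap` replaces `FxHashMap`, and a `&'static str`
/// label replaces the `Option<flock::Lock>` held for each candidate.
pub fn all_except_most_recent_sketch(
    candidates: Vec<(SystemTime, PathBuf, &'static str)>,
) -> HashMap<PathBuf, &'static str> {
    // The newest timestamp among all candidates, or `None` if there are none.
    let most_recent = candidates.iter().map(|&(timestamp, ..)| timestamp).max();

    match most_recent {
        // Return everything whose timestamp is not the newest one.
        Some(most_recent) => candidates
            .into_iter()
            .filter(|&(timestamp, ..)| timestamp != most_recent)
            .map(|(_, path, label)| (path, label))
            .collect(),
        // No candidates means nothing to delete.
        None => HashMap::new(),
    }
}
```

Given three candidates with distinct timestamps, the returned map holds the two older paths. Given an empty vector, it is empty.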