// compiler/rustc_mir_transform/src/dest_prop.rs (upstream rustc 1.71.1)
//! Propagates assignment destinations backwards in the CFG to eliminate redundant assignments.
//!
//! # Motivation
//!
//! MIR building can insert a lot of redundant copies, and Rust code in general often tends to move
//! values around a lot. The result is a lot of assignments of the form `dest = {move} src;` in MIR.
//! MIR building for constants in particular tends to create additional locals that are only used
//! inside a single block to shuffle a value around unnecessarily.
//!
//! LLVM by itself is not good enough at eliminating these redundant copies (eg. see
//! <https://github.com/rust-lang/rust/issues/32966>), so this leaves some performance on the table
//! that we can regain by implementing an optimization for removing these assign statements in rustc
//! itself. When this optimization runs fast enough, it can also speed up the constant evaluation
//! and code generation phases of rustc due to the reduced number of statements and locals.
//!
//! # The Optimization
//!
//! Conceptually, this optimization is "destination propagation". It is similar to the Named Return
//! Value Optimization, or NRVO, known from the C++ world, except that it isn't limited to return
//! values or the return place `_0`. On a very high level, independent of the actual implementation
//! details, it does the following:
//!
//! 1) Identify `dest = src;` statements with values for `dest` and `src` whose storage can soundly
//!    be merged.
//! 2) Replace all mentions of `src` with `dest` ("unifying" them and propagating the destination
//!    backwards).
//! 3) Delete the `dest = src;` statement (by making it a `nop`).
//!
//! Step 1) is by far the hardest, so it is explained in more detail below.
//!
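//! As a sketch, the three steps transform MIR like the following (hypothetical locals, for
//! illustration only):
//!
//! ```ignore (syntax-highlighting-only)
//! // before:                  // after merging `_1` into `_2`:
//! _1 = const 5_i32;           _2 = const 5_i32;
//! _2 = move _1;               nop;
//! _0 = move _2;               _0 = move _2;
//! ```
//!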
//! ## Soundness
//!
//! We have a pair of places `p` and `q`, whose memory we would like to merge. In order for this to
//! be sound, we need to check a number of conditions:
//!
//! * `p` and `q` must both be *constant* - it does not make much sense to talk about merging them
//!   if they do not consistently refer to the same place in memory. This is satisfied if they do
//!   not contain any indirection through a pointer or any indexing projections.
//!
//! * We need to make sure that the goal of "merging the memory" is actually structurally possible
//!   in MIR. For example, even if all the other conditions are satisfied, there is no way to
//!   "merge" `_5.foo` and `_6.bar`. For now, we ensure this by requiring that both `p` and `q` are
//!   locals with no further projections. Future iterations of this pass should improve on this.
//!
//! * Finally, we want `p` and `q` to use the same memory - however, we still need to make sure that
//!   each of them has enough "ownership" of that memory to continue "doing its job." More
//!   precisely, what we will check is that whenever the program performs a write to `p`, then it
//!   does not currently care about what the value in `q` is (and vice versa). We formalize the
//!   notion of "does not care what the value in `q` is" by checking the *liveness* of `q`.
//!
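//! For example, in the following MIR (hypothetical, for illustration only) it would be unsound to
//! merge `_1` and `_2`, because `_2` is still live at the second write to `_1`:
//!
//! ```ignore (syntax-highlighting-only)
//! _2 = move _1;
//! _1 = const 7_i32;             // writes `_1` while `_2` is live
//! _0 = Add(move _1, move _2);
//! ```
//!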
//! Because of the difficulty of computing liveness of places that have their address taken, we do
//! not even attempt to do it. Any place based on a local that has its address taken is excluded
//! from the optimization.
//!
//! The first two conditions are simple structural requirements on the `Assign` statements that can
//! be trivially checked. The third requirement however is more difficult and costly to check.
//!
//! ## Future Improvements
//!
//! There are a number of ways in which this pass could be improved in the future:
//!
//! * Merging storage liveness ranges instead of removing storage statements completely. This may
//!   improve stack usage.
//!
//! * Allow merging locals into places with projections, eg `_5` into `_6.foo`.
//!
//! * Liveness analysis with more precision than whole locals at a time. The smaller benefit of this
//!   is that it would allow us to dest prop at "sub-local" levels in some cases. The bigger benefit
//!   of this is that such liveness analysis can report more accurate results about whole locals at
//!   a time. For example, consider:
//!
//!   ```ignore (syntax-highlighting-only)
//!   _1 = u;
//!   // unrelated code
//!   _1.f1 = v;
//!   _2 = _1.f1;
//!   ```
//!
//!   Because the current analysis only thinks in terms of locals, it does not have enough
//!   information to report that `_1` is dead in the "unrelated code" section.
//!
//! * Liveness analysis enabled by alias analysis. This would allow us to not just bail on locals
//!   that ever have their address taken. Of course that requires actually having alias analysis
//!   (and a model to build it on), so this might be a bit of a ways off.
//!
//! * Various perf improvements. There are a bunch of comments in here marked `PERF` with ideas for
//!   how to do things more efficiently. However, the complexity of the pass as a whole should be
//!   kept in mind.
//!
//! ## Previous Work
//!
//! A [previous attempt][attempt 1] at implementing an optimization like this turned out to be a
//! significant regression in compiler performance. Fixing the regressions introduced a lot of
//! undesirable complexity to the implementation.
//!
//! A [subsequent approach][attempt 2] tried to avoid the costly computation by limiting itself to
//! acyclic CFGs, but still turned out to be far too costly to run due to suboptimal performance
//! within individual basic blocks, requiring a walk across the entire block for every assignment
//! found within the block. For the `tuple-stress` benchmark, which has 458745 statements in a
//! single block, this proved to be far too costly.
//!
//! [Another approach after that][attempt 3] was much closer to correct, but had some soundness
//! issues - it was failing to consider stores outside live ranges, and failed to uphold some of the
//! requirements that MIR has for non-overlapping places within statements. However, it also had
//! performance issues caused by `O(l² * s)` runtime, where `l` is the number of locals and `s` is
//! the number of statements and terminators.
//!
//! Since the first attempt at this, the compiler has improved dramatically, and new analysis
//! frameworks have been added that should make this approach viable without requiring a limited
//! approach that only works for some classes of CFGs:
//! - rustc now has a powerful dataflow analysis framework that can handle forwards and backwards
//!   analyses efficiently.
//! - Layout optimizations for generators have been added to improve code generation for
//!   async/await, which are very similar in spirit to what this optimization does.
//!
//! Also, rustc now has a simple NRVO pass (see `nrvo.rs`), which handles a subset of the cases that
//! this destination propagation pass handles, proving that similar optimizations can be performed
//! on MIR.
//!
//! ## Pre/Post Optimization
//!
//! It is recommended to run `SimplifyCfg` and then `SimplifyLocals` some time after this pass, as
//! it replaces the eliminated assign statements with `nop`s and leaves unused locals behind.
//!
//! [liveness]: https://en.wikipedia.org/wiki/Live_variable_analysis
//! [attempt 1]: https://github.com/rust-lang/rust/pull/47954
//! [attempt 2]: https://github.com/rust-lang/rust/pull/71003
//! [attempt 3]: https://github.com/rust-lang/rust/pull/72632

use std::collections::hash_map::{Entry, OccupiedEntry};

use crate::simplify::remove_dead_blocks;
use crate::MirPass;
use rustc_data_structures::fx::FxHashMap;
use rustc_index::bit_set::BitSet;
use rustc_middle::mir::visit::{MutVisitor, PlaceContext, Visitor};
use rustc_middle::mir::{dump_mir, PassWhere};
use rustc_middle::mir::{
    traversal, Body, InlineAsmOperand, Local, LocalKind, Location, Operand, Place, Rvalue,
    Statement, StatementKind, TerminatorKind,
};
use rustc_middle::ty::TyCtxt;
use rustc_mir_dataflow::impls::MaybeLiveLocals;
use rustc_mir_dataflow::{Analysis, ResultsCursor};

pub struct DestinationPropagation;

impl<'tcx> MirPass<'tcx> for DestinationPropagation {
    fn is_enabled(&self, sess: &rustc_session::Session) -> bool {
        // For now, only run at MIR opt level 3. Two things need to be changed before this can be
        // turned on by default:
        // 1. Because of the overeager removal of storage statements, this can cause stack space
        //    regressions. This opt is not the place to fix this though, it's a more general
        //    problem in MIR.
        // 2. Despite being an overall perf improvement, this still causes a 30% regression in
        //    keccak. We can temporarily fix this by bounding function size, but in the long term
        //    we should fix this by being smarter about invalidating analysis results.
        sess.mir_opt_level() >= 3
    }

    fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
        let def_id = body.source.def_id();
        let mut allocations = Allocations::default();
        trace!(func = ?tcx.def_path_str(def_id));

        let borrowed = rustc_mir_dataflow::impls::borrowed_locals(body);

        // In order to avoid having to collect data for every single pair of locals in the body, we
        // do not allow doing more than one merge for places that are derived from the same local at
        // once. To avoid missed opportunities, we instead iterate to a fixed point - we'll refer to
        // each of these iterations as a "round."
        //
        // Reaching a fixed point could in theory take up to `min(l, s)` rounds - however, we do not
        // expect to see MIR like that. To verify this, a test was run against [rust-lang/regex] -
        // the average MIR body saw 1.32 full iterations of this loop. The most that was hit were 30
        // for a single function. Only 80/2801 (2.9%) of functions saw at least 5.
        //
        // [rust-lang/regex]:
        //     https://github.com/rust-lang/regex/tree/b5372864e2df6a2f5e543a556a62197f50ca3650
        let mut round_count = 0;
        loop {
            // PERF: Can we do something smarter than recalculating the candidates and liveness
            // results?
            let mut candidates = find_candidates(
                body,
                &borrowed,
                &mut allocations.candidates,
                &mut allocations.candidates_reverse,
            );
            trace!(?candidates);
            let mut live = MaybeLiveLocals
                .into_engine(tcx, body)
                .iterate_to_fixpoint()
                .into_results_cursor(body);
            dest_prop_mir_dump(tcx, body, &mut live, round_count);

            FilterInformation::filter_liveness(
                &mut candidates,
                &mut live,
                &mut allocations.write_info,
                body,
            );

            // Because we do not update liveness information, it is unsound to use a local for more
            // than one merge operation within a single round of optimizations. We store here which
            // ones we have already used.
            let mut merged_locals: BitSet<Local> = BitSet::new_empty(body.local_decls.len());

            // This is the set of merges we will apply this round. It is a subset of the candidates.
            let mut merges = FxHashMap::default();

            for (src, candidates) in candidates.c.iter() {
                if merged_locals.contains(*src) {
                    continue;
                }
                let Some(dest) =
                    candidates.iter().find(|dest| !merged_locals.contains(**dest)) else {
                    continue;
                };
                if !tcx.consider_optimizing(|| {
                    format!("{} round {}", tcx.def_path_str(def_id), round_count)
                }) {
                    break;
                }
                merges.insert(*src, *dest);
                merged_locals.insert(*src);
                merged_locals.insert(*dest);
            }
            trace!(merging = ?merges);

            if merges.is_empty() {
                break;
            }
            round_count += 1;

            apply_merges(body, tcx, &merges, &merged_locals);
        }

        if round_count != 0 {
            // Merging can introduce overlap between moved arguments and/or the call destination in
            // unreachable code, which the validator considers to be ill-formed.
            remove_dead_blocks(tcx, body);
        }

        trace!(round_count);
    }
}

/// Container for the various allocations that we need.
///
/// We store these here and hand out `&mut` access to them, instead of dropping and recreating them
/// frequently. Everything with a `&'alloc` lifetime points into here.
#[derive(Default)]
struct Allocations {
    candidates: FxHashMap<Local, Vec<Local>>,
    candidates_reverse: FxHashMap<Local, Vec<Local>>,
    write_info: WriteInfo,
    // PERF: Do this for `MaybeLiveLocals` allocations too.
}

#[derive(Debug)]
struct Candidates<'alloc> {
    /// The set of candidates we are considering in this optimization.
    ///
    /// We will always merge the key into at most one of its values.
    ///
    /// Whether a place ends up in the key or the value does not correspond to whether it appears as
    /// the lhs or rhs of any assignment. As a matter of fact, the places in here might never appear
    /// in an assignment at all. This happens because if we see an assignment like this:
    ///
    /// ```ignore (syntax-highlighting-only)
    /// _1.0 = _2.0
    /// ```
    ///
    /// We will still report that we would like to merge `_1` and `_2` in an attempt to allow us to
    /// remove that assignment.
    c: &'alloc mut FxHashMap<Local, Vec<Local>>,
    /// A reverse index of the `c` set; if the `c` set contains `a => Place { local: b, proj }`,
    /// then this contains `b => a`.
    // PERF: Possibly these should be `SmallVec`s?
    reverse: &'alloc mut FxHashMap<Local, Vec<Local>>,
}

//////////////////////////////////////////////////////////
// Merging
//
// Applies the actual optimization

fn apply_merges<'tcx>(
    body: &mut Body<'tcx>,
    tcx: TyCtxt<'tcx>,
    merges: &FxHashMap<Local, Local>,
    merged_locals: &BitSet<Local>,
) {
    let mut merger = Merger { tcx, merges, merged_locals };
    merger.visit_body_preserves_cfg(body);
}

struct Merger<'a, 'tcx> {
    tcx: TyCtxt<'tcx>,
    merges: &'a FxHashMap<Local, Local>,
    merged_locals: &'a BitSet<Local>,
}

impl<'a, 'tcx> MutVisitor<'tcx> for Merger<'a, 'tcx> {
    fn tcx(&self) -> TyCtxt<'tcx> {
        self.tcx
    }

    fn visit_local(&mut self, local: &mut Local, _: PlaceContext, _location: Location) {
        if let Some(dest) = self.merges.get(local) {
            *local = *dest;
        }
    }

    fn visit_statement(&mut self, statement: &mut Statement<'tcx>, location: Location) {
        match &statement.kind {
            // FIXME: Don't delete storage statements, but "merge" the storage ranges instead.
            StatementKind::StorageDead(local) | StatementKind::StorageLive(local)
                if self.merged_locals.contains(*local) =>
            {
                statement.make_nop();
                return;
            }
            _ => (),
        };
        self.super_statement(statement, location);
        match &statement.kind {
            StatementKind::Assign(box (dest, rvalue)) => {
                match rvalue {
                    Rvalue::CopyForDeref(place)
                    | Rvalue::Use(Operand::Copy(place) | Operand::Move(place)) => {
                        // These might've been turned into self-assignments by the replacement
                        // (this includes the original statement we wanted to eliminate).
                        if dest == place {
                            debug!("{:?} turned into self-assignment, deleting", location);
                            statement.make_nop();
                        }
                    }
                    _ => {}
                }
            }

            _ => {}
        }
    }
}

//////////////////////////////////////////////////////////
// Liveness filtering
//
// This section enforces bullet point 2

struct FilterInformation<'a, 'body, 'alloc, 'tcx> {
    body: &'body Body<'tcx>,
    live: &'a mut ResultsCursor<'body, 'tcx, MaybeLiveLocals>,
    candidates: &'a mut Candidates<'alloc>,
    write_info: &'alloc mut WriteInfo,
    at: Location,
}

// We first implement some utility functions for removing candidates according to different needs.
// Throughout the liveness filtering, the `candidates` are only ever accessed through these
// methods, and not directly.
impl<'alloc> Candidates<'alloc> {
    /// Just `Vec::retain`, but the condition is inverted and we add debugging output
    fn vec_filter_candidates(
        src: Local,
        v: &mut Vec<Local>,
        mut f: impl FnMut(Local) -> CandidateFilter,
        at: Location,
    ) {
        v.retain(|dest| {
            let remove = f(*dest);
            if remove == CandidateFilter::Remove {
                trace!("eliminating {:?} => {:?} due to conflict at {:?}", src, dest, at);
            }
            remove == CandidateFilter::Keep
        });
    }

    /// `vec_filter_candidates` but for an `Entry`
    fn entry_filter_candidates(
        mut entry: OccupiedEntry<'_, Local, Vec<Local>>,
        p: Local,
        f: impl FnMut(Local) -> CandidateFilter,
        at: Location,
    ) {
        let candidates = entry.get_mut();
        Self::vec_filter_candidates(p, candidates, f, at);
        if candidates.len() == 0 {
            entry.remove();
        }
    }

    /// For all candidates `(p, q)` or `(q, p)`, removes the candidate if `f(q)` says to do so
    fn filter_candidates_by(
        &mut self,
        p: Local,
        mut f: impl FnMut(Local) -> CandidateFilter,
        at: Location,
    ) {
        // Cover the cases where `p` appears as a `src`
        if let Entry::Occupied(entry) = self.c.entry(p) {
            Self::entry_filter_candidates(entry, p, &mut f, at);
        }
        // And the cases where `p` appears as a `dest`
        let Some(srcs) = self.reverse.get_mut(&p) else {
            return;
        };
        // We use `retain` here to remove the elements from the reverse set if we've removed the
        // matching candidate in the forward set.
        srcs.retain(|src| {
            if f(*src) == CandidateFilter::Keep {
                return true;
            }
            let Entry::Occupied(entry) = self.c.entry(*src) else {
                return false;
            };
            Self::entry_filter_candidates(
                entry,
                *src,
                |dest| {
                    if dest == p { CandidateFilter::Remove } else { CandidateFilter::Keep }
                },
                at,
            );
            false
        });
    }
}

#[derive(Copy, Clone, PartialEq, Eq)]
enum CandidateFilter {
    Keep,
    Remove,
}

impl<'a, 'body, 'alloc, 'tcx> FilterInformation<'a, 'body, 'alloc, 'tcx> {
    /// Filters the set of candidates to remove those that conflict.
    ///
    /// The steps we take are exactly those that are outlined at the top of the file. For each
    /// statement/terminator, we collect the set of locals that are written to in that
    /// statement/terminator, and then we remove all pairs of candidates that contain one such local
    /// and another one that is live.
    ///
    /// We need to be careful about the ordering of operations within each statement/terminator
    /// here. Many statements might write and read from more than one place, and we need to consider
    /// them all. The strategy for doing this is as follows: We first gather all the places that are
    /// written to within the statement/terminator via `WriteInfo`. Then, we use the liveness
    /// analysis from *before* the statement/terminator (in the control flow sense) to eliminate
    /// candidates - this is because we want to conservatively treat a pair of locals that is both
    /// read and written in the statement/terminator to be conflicting, and the liveness analysis
    /// before the statement/terminator will correctly report locals that are read in the
    /// statement/terminator to be live. We are additionally conservative by treating all
    /// written-to locals as also being read from.
    fn filter_liveness<'b>(
        candidates: &mut Candidates<'alloc>,
        live: &mut ResultsCursor<'b, 'tcx, MaybeLiveLocals>,
        write_info_alloc: &'alloc mut WriteInfo,
        body: &'b Body<'tcx>,
    ) {
        let mut this = FilterInformation {
            body,
            live,
            candidates,
            // We don't actually store anything at this scope, we just keep things here to be able
            // to reuse the allocation.
            write_info: write_info_alloc,
            // Doesn't matter what we put here, will be overwritten before being used
            at: Location::START,
        };
        this.internal_filter_liveness();
    }

    fn internal_filter_liveness(&mut self) {
        for (block, data) in traversal::preorder(self.body) {
            self.at = Location { block, statement_index: data.statements.len() };
            self.live.seek_after_primary_effect(self.at);
            self.write_info.for_terminator(&data.terminator().kind);
            self.apply_conflicts();

            for (i, statement) in data.statements.iter().enumerate().rev() {
                self.at = Location { block, statement_index: i };
                self.live.seek_after_primary_effect(self.at);
                self.write_info.for_statement(&statement.kind, self.body);
                self.apply_conflicts();
            }
        }
    }

    fn apply_conflicts(&mut self) {
        let writes = &self.write_info.writes;
        for p in writes {
            let other_skip = self.write_info.skip_pair.and_then(|(a, b)| {
                if a == *p {
                    Some(b)
                } else if b == *p {
                    Some(a)
                } else {
                    None
                }
            });
            self.candidates.filter_candidates_by(
                *p,
                |q| {
                    if Some(q) == other_skip {
                        return CandidateFilter::Keep;
                    }
                    // It is possible that a local may be live for less than the duration of a
                    // statement. This happens in the case of function calls or inline asm. Because
                    // of this, we also mark locals as conflicting when both of them are written to
                    // in the same statement.
                    if self.live.contains(q) || writes.contains(&q) {
                        CandidateFilter::Remove
                    } else {
                        CandidateFilter::Keep
                    }
                },
                self.at,
            );
        }
    }
}

/// Describes where a statement/terminator writes to
#[derive(Default, Debug)]
struct WriteInfo {
    writes: Vec<Local>,
    /// If this pair of locals is a candidate pair, completely skip processing it during this
    /// statement. All other candidates are unaffected.
    skip_pair: Option<(Local, Local)>,
}

impl WriteInfo {
    fn for_statement<'tcx>(&mut self, statement: &StatementKind<'tcx>, body: &Body<'tcx>) {
        self.reset();
        match statement {
            StatementKind::Assign(box (lhs, rhs)) => {
                self.add_place(*lhs);
                match rhs {
                    Rvalue::Use(op) => {
                        self.add_operand(op);
                        self.consider_skipping_for_assign_use(*lhs, op, body);
                    }
                    Rvalue::Repeat(op, _) => {
                        self.add_operand(op);
                    }
                    Rvalue::Cast(_, op, _)
                    | Rvalue::UnaryOp(_, op)
                    | Rvalue::ShallowInitBox(op, _) => {
                        self.add_operand(op);
                    }
                    Rvalue::BinaryOp(_, ops) | Rvalue::CheckedBinaryOp(_, ops) => {
                        for op in [&ops.0, &ops.1] {
                            self.add_operand(op);
                        }
                    }
                    Rvalue::Aggregate(_, ops) => {
                        for op in ops {
                            self.add_operand(op);
                        }
                    }
                    Rvalue::ThreadLocalRef(_)
                    | Rvalue::NullaryOp(_, _)
                    | Rvalue::Ref(_, _, _)
                    | Rvalue::AddressOf(_, _)
                    | Rvalue::Len(_)
                    | Rvalue::Discriminant(_)
                    | Rvalue::CopyForDeref(_) => (),
                }
            }
            // Retags are technically also reads, but reporting them as a write suffices
            StatementKind::SetDiscriminant { place, .. }
            | StatementKind::Deinit(place)
            | StatementKind::Retag(_, place) => {
                self.add_place(**place);
            }
            StatementKind::Intrinsic(_)
            | StatementKind::ConstEvalCounter
            | StatementKind::Nop
            | StatementKind::Coverage(_)
            | StatementKind::StorageLive(_)
            | StatementKind::StorageDead(_)
            | StatementKind::PlaceMention(_) => (),
            StatementKind::FakeRead(_) | StatementKind::AscribeUserType(_, _) => {
                bug!("{:?} not found in this MIR phase", statement)
            }
        }
    }

    fn consider_skipping_for_assign_use<'tcx>(
        &mut self,
        lhs: Place<'tcx>,
        rhs: &Operand<'tcx>,
        body: &Body<'tcx>,
    ) {
        let Some(rhs) = rhs.place() else {
            return;
        };
        if let Some(pair) = places_to_candidate_pair(lhs, rhs, body) {
            self.skip_pair = Some(pair);
        }
    }

    fn for_terminator<'tcx>(&mut self, terminator: &TerminatorKind<'tcx>) {
        self.reset();
        match terminator {
            TerminatorKind::SwitchInt { discr: op, .. }
            | TerminatorKind::Assert { cond: op, .. } => {
                self.add_operand(op);
            }
            TerminatorKind::Call { destination, func, args, .. } => {
                self.add_place(*destination);
                self.add_operand(func);
                for arg in args {
                    self.add_operand(arg);
                }
            }
            TerminatorKind::InlineAsm { operands, .. } => {
                for asm_operand in operands {
                    match asm_operand {
                        InlineAsmOperand::In { value, .. } => {
                            self.add_operand(value);
                        }
                        InlineAsmOperand::Out { place, .. } => {
                            if let Some(place) = place {
                                self.add_place(*place);
                            }
                        }
                        // Note that the `late` field in `InOut` is about whether the registers used
                        // for these things overlap, and is of absolutely no interest to us.
                        InlineAsmOperand::InOut { in_value, out_place, .. } => {
                            if let Some(place) = out_place {
                                self.add_place(*place);
                            }
                            self.add_operand(in_value);
                        }
                        InlineAsmOperand::Const { .. }
                        | InlineAsmOperand::SymFn { .. }
                        | InlineAsmOperand::SymStatic { .. } => (),
                    }
                }
            }
            TerminatorKind::Goto { .. }
            | TerminatorKind::Resume
            | TerminatorKind::Terminate
            | TerminatorKind::Return
            | TerminatorKind::Unreachable { .. } => (),
            TerminatorKind::Drop { .. } => {
                // `Drop`s create a `&mut` and so are not considered
            }
            TerminatorKind::Yield { .. }
            | TerminatorKind::GeneratorDrop
            | TerminatorKind::FalseEdge { .. }
            | TerminatorKind::FalseUnwind { .. } => {
                bug!("{:?} not found in this MIR phase", terminator)
            }
        }
    }

    fn add_place(&mut self, place: Place<'_>) {
        self.writes.push(place.local);
    }

    fn add_operand<'tcx>(&mut self, op: &Operand<'tcx>) {
        match op {
            // FIXME(JakobDegen): In a previous version, the `Move` case was incorrectly treated as
            // being a read only. This was unsound, however we cannot add a regression test because
            // it is not possible to set this off with current MIR. Once we have that ability, a
            // regression test should be added.
            Operand::Move(p) => self.add_place(*p),
            Operand::Copy(_) | Operand::Constant(_) => (),
        }
    }

    fn reset(&mut self) {
        self.writes.clear();
        self.skip_pair = None;
    }
}

/////////////////////////////////////////////////////
// Candidate accumulation

/// If the pair of places is being considered for merging, returns the candidate which would be
/// merged in order to accomplish this.
///
/// The contract here is in one direction - there is a guarantee that merging the locals that are
/// outputted by this function would result in an assignment between the inputs becoming a
/// self-assignment. However, there is no guarantee that the returned pair is actually suitable for
/// merging - candidate collection must still check this independently.
///
/// This output is unique for each unordered pair of input places.
fn places_to_candidate_pair<'tcx>(
    a: Place<'tcx>,
    b: Place<'tcx>,
    body: &Body<'tcx>,
) -> Option<(Local, Local)> {
    let (mut a, mut b) = if a.projection.len() == 0 && b.projection.len() == 0 {
        (a.local, b.local)
    } else {
        return None;
    };

    // By sorting, we make sure we're input order independent
    if a > b {
        std::mem::swap(&mut a, &mut b);
    }

    // We could now return `(a, b)`, but then we miss some candidates in the case where `a` can't be
    // used as a `src`.
    if is_local_required(a, body) {
        std::mem::swap(&mut a, &mut b);
    }
    // We could check `is_local_required` again here, but there's no need - after all, we make no
    // promise that the candidate pair is actually valid
    Some((a, b))
}
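
// A sketch of the normalization performed above (hypothetical locals, for illustration only;
// assumes `_3` and `_5` are ordinary temporaries, while `_0` is the required return place):
//
// ```ignore (syntax-highlighting-only)
// places_to_candidate_pair(_3, _5, body) == Some((_3, _5)) // sorted by local index
// places_to_candidate_pair(_5, _3, body) == Some((_3, _5)) // input order independent
// places_to_candidate_pair(_0, _3, body) == Some((_3, _0)) // `_0` is required, so it is
//                                                          // swapped into the `dest` position
// places_to_candidate_pair(_3.f, _5, body) == None         // projections are rejected
// ```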

/// Collects the candidates for merging
///
/// This is responsible for enforcing the first and third bullet point.
fn find_candidates<'alloc, 'tcx>(
    body: &Body<'tcx>,
    borrowed: &BitSet<Local>,
    candidates: &'alloc mut FxHashMap<Local, Vec<Local>>,
    candidates_reverse: &'alloc mut FxHashMap<Local, Vec<Local>>,
) -> Candidates<'alloc> {
    candidates.clear();
    candidates_reverse.clear();
    let mut visitor = FindAssignments { body, candidates, borrowed };
    visitor.visit_body(body);
    // Deduplicate candidates
    for (_, cands) in candidates.iter_mut() {
        cands.sort();
        cands.dedup();
    }
    // Generate the reverse map
    for (src, cands) in candidates.iter() {
        for dest in cands.iter().copied() {
            candidates_reverse.entry(dest).or_default().push(*src);
        }
    }
    Candidates { c: candidates, reverse: candidates_reverse }
}

struct FindAssignments<'a, 'alloc, 'tcx> {
    body: &'a Body<'tcx>,
    candidates: &'alloc mut FxHashMap<Local, Vec<Local>>,
    borrowed: &'a BitSet<Local>,
}

impl<'tcx> Visitor<'tcx> for FindAssignments<'_, '_, 'tcx> {
    fn visit_statement(&mut self, statement: &Statement<'tcx>, _: Location) {
        if let StatementKind::Assign(box (
            lhs,
            Rvalue::CopyForDeref(rhs) | Rvalue::Use(Operand::Copy(rhs) | Operand::Move(rhs)),
        )) = &statement.kind
        {
            let Some((src, dest)) = places_to_candidate_pair(*lhs, *rhs, self.body) else {
                return;
            };

            // As described at the top of the file, we do not go near things that have their address
            // taken.
            if self.borrowed.contains(src) || self.borrowed.contains(dest) {
                return;
            }

            // Also, we need to make sure that MIR actually allows the `src` to be removed
            if is_local_required(src, self.body) {
                return;
            }

            // We may insert duplicates here, but that's fine
            self.candidates.entry(src).or_default().push(dest);
        }
    }
}

/// Some locals are part of the function's interface and cannot be removed.
///
/// Note that these locals *can* still be merged with non-required locals by removing that other
/// local.
fn is_local_required(local: Local, body: &Body<'_>) -> bool {
    match body.local_kind(local) {
        LocalKind::Arg | LocalKind::ReturnPointer => true,
        LocalKind::Temp => false,
    }
}

/////////////////////////////////////////////////////////
// MIR Dump

fn dest_prop_mir_dump<'body, 'tcx>(
    tcx: TyCtxt<'tcx>,
    body: &'body Body<'tcx>,
    live: &mut ResultsCursor<'body, 'tcx, MaybeLiveLocals>,
    round: usize,
) {
    let mut reachable = None;
    dump_mir(tcx, false, "DestinationPropagation-dataflow", &round, body, |pass_where, w| {
        let reachable = reachable.get_or_insert_with(|| traversal::reachable_as_bitset(body));

        match pass_where {
            PassWhere::BeforeLocation(loc) if reachable.contains(loc.block) => {
                live.seek_after_primary_effect(loc);
                writeln!(w, "        // live: {:?}", live.get())?;
            }
            PassWhere::AfterTerminator(bb) if reachable.contains(bb) => {
                let loc = body.terminator_loc(bb);
                live.seek_before_primary_effect(loc);
                writeln!(w, "        // live: {:?}", live.get())?;
            }

            PassWhere::BeforeBlock(bb) if reachable.contains(bb) => {
                live.seek_to_block_start(bb);
                writeln!(w, "    // live: {:?}", live.get())?;
            }

            PassWhere::BeforeCFG | PassWhere::AfterCFG | PassWhere::AfterLocation(_) => {}

            PassWhere::BeforeLocation(_) | PassWhere::AfterTerminator(_) => {
                writeln!(w, "        // live: <unreachable>")?;
            }

            PassWhere::BeforeBlock(_) => {
                writeln!(w, "    // live: <unreachable>")?;
            }
        }

        Ok(())
    });
}