//! Propagates assignment destinations backwards in the CFG to eliminate redundant assignments.
//!
//! # Motivation
//!
//! MIR building can insert a lot of redundant copies, and Rust code in general often tends to move
//! values around a lot. The result is a lot of assignments of the form `dest = {move} src;` in MIR.
//! MIR building for constants in particular tends to create additional locals that are only used
//! inside a single block to shuffle a value around unnecessarily.
//!
//! LLVM by itself is not good enough at eliminating these redundant copies (eg. see
//! <https://github.com/rust-lang/rust/issues/32966>), so this leaves some performance on the table
//! that we can regain by implementing an optimization for removing these assign statements in rustc
//! itself. When this optimization runs fast enough, it can also speed up the constant evaluation
//! and code generation phases of rustc due to the reduced number of statements and locals.
//!
//! # The Optimization
//!
//! Conceptually, this optimization is "destination propagation". It is similar to the Named Return
//! Value Optimization, or NRVO, known from the C++ world, except that it isn't limited to return
//! values or the return place `_0`. On a very high level, independent of the actual implementation
//! details, it does the following:
//!
//! 1) Identify `dest = src;` statements that can be soundly eliminated.
//! 2) Replace all mentions of `src` with `dest` ("unifying" them and propagating the destination
//!    backwards).
//! 3) Delete the `dest = src;` statement (by making it a `nop`).
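//!
//! As an illustrative sketch (simplified, made-up MIR; not actual compiler output), with the
//! candidate `_0 = move _2;` (so `dest` is `_0` and `src` is `_2`), the three steps turn
//!
//! ```text
//! _2 = move _1;
//! _0 = move _2;
//! ```
//!
//! into
//!
//! ```text
//! _0 = move _1;
//! nop;
//! ```
//!
//! because replacing `_2` with `_0` turns the candidate into a self-assignment, which is deleted.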
//!
//! Step 1) is by far the hardest, so it is explained in more detail below.
//!
//! ## Soundness
//!
//! Given an `Assign` statement `dest = src;`, where `dest` is a `Place` and `src` is an `Rvalue`,
//! there are a few requirements that must hold for the optimization to be sound:
//!
//! * `dest` must not contain any *indirection* through a pointer. It must access part of the base
//!   local. Otherwise it might point to arbitrary memory that is hard to track.
//!
//!   It must also not contain any indexing projections, since those take an arbitrary `Local` as
//!   the index, and that local might only be initialized shortly before `dest` is used.
//!
//!   Subtle case: If `dest` is, or projects through, a union, then we have to make sure that there
//!   remains an assignment to it, since that sets the "active field" of the union. But if `src` is
//!   a ZST, it might not be initialized, so there might not be any use of it before the assignment,
//!   and performing the optimization would simply delete the assignment, leaving `dest`
//!   uninitialized.
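//!
//!   For instance (an illustrative sketch, not actual MIR output), with `union U { zst: () }`,
//!   `_1: U`, and `_2: ()`:
//!
//!   ```text
//!   (_1.zst: ()) = _2;  // sets the active field of `_1`
//!   ```
//!
//!   Since `_2` is a never-used ZST, unifying `_2` into `_1.zst` would turn this into a
//!   self-assignment and delete it, even though it was the only assignment to `_1`.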
//!
//! * `src` must be a bare `Local` without any indirections or field projections (FIXME: Is this a
//!   fundamental restriction or just current impl state?). It can be copied or moved by the
//!   assignment.
//!
//! * The `dest` and `src` locals must never be [*live*][liveness] at the same time. If they are, it
//!   means that they both hold a (potentially different) value that is needed by a future use of
//!   the locals. Unifying them would overwrite one of the values.
//!
//!   Note that computing liveness of locals that have had their address taken is more difficult:
//!   Short of doing full escape analysis on the address/pointer/reference, the pass would need to
//!   assume that any operation that can potentially involve opaque user code (such as function
//!   calls, destructors, and inline assembly) may access any local that had its address taken
//!   before that point.
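//!
//!   For example (an illustrative sketch with made-up locals), in
//!
//!   ```text
//!   _1 = const 1_i32;
//!   _2 = const 2_i32;            // `_1` is still live here...
//!   _3 = Add(move _1, move _2);  // ...because it is used afterwards
//!   ```
//!
//!   `_1` and `_2` are live at the same time, so unifying them into one local would make the
//!   second assignment overwrite the first and `_3` would compute `2 + 2` instead of `1 + 2`.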
//!
//! Here, the first two conditions are simple structural requirements on the `Assign` statements
//! that can be trivially checked. The liveness requirement however is more difficult and costly to
//! check.
//!
//! ## Previous Work
//!
//! A [previous attempt] at implementing an optimization like this turned out to be a significant
//! regression in compiler performance. Fixing the regressions introduced a lot of undesirable
//! complexity to the implementation.
//!
//! A [subsequent approach] tried to avoid the costly computation by limiting itself to acyclic
//! CFGs, but still turned out to be far too costly to run due to suboptimal performance within
//! individual basic blocks, requiring a walk across the entire block for every assignment found
//! within the block. For the `tuple-stress` benchmark, which has 458745 statements in a single
//! block, this proved to be far too costly.
//!
//! Since the first attempt at this, the compiler has improved dramatically, and new analysis
//! frameworks have been added that should make this approach viable without requiring a limited
//! approach that only works for some classes of CFGs:
//! - rustc now has a powerful dataflow analysis framework that can handle forwards and backwards
//!   analyses efficiently.
//! - Layout optimizations for generators have been added to improve code generation for
//!   async/await, which are very similar in spirit to what this optimization does. Both walk the
//!   MIR and record conflicting uses of locals in a `BitMatrix`.
//!
//! Also, rustc now has a simple NRVO pass (see `nrvo.rs`), which handles a subset of the cases that
//! this destination propagation pass handles, proving that similar optimizations can be performed
//! on MIR.
//!
//! ## Pre/Post Optimization
//!
//! It is recommended to run `SimplifyCfg` and then `SimplifyLocals` some time after this pass, as
//! it replaces the eliminated assign statements with `nop`s and leaves unused locals behind.
//!
//! [liveness]: https://en.wikipedia.org/wiki/Live_variable_analysis
//! [previous attempt]: https://github.com/rust-lang/rust/pull/47954
//! [subsequent approach]: https://github.com/rust-lang/rust/pull/71003
use crate::dataflow::impls::{MaybeInitializedLocals, MaybeLiveLocals};
use crate::dataflow::Analysis;
use crate::{
    transform::MirPass,
    util::{dump_mir, PassWhere},
};
use itertools::Itertools;
use rustc_data_structures::unify::{InPlaceUnificationTable, UnifyKey};
use rustc_index::{
    bit_set::{BitMatrix, BitSet},
    vec::IndexVec,
};
use rustc_middle::mir::tcx::PlaceTy;
use rustc_middle::mir::visit::{MutVisitor, PlaceContext, Visitor};
use rustc_middle::mir::{
    traversal, Body, InlineAsmOperand, Local, LocalKind, Location, Operand, Place, PlaceElem,
    Rvalue, Statement, StatementKind, Terminator, TerminatorKind,
};
use rustc_middle::ty::{self, Ty, TyCtxt};
// Empirical measurements have resulted in some observations:
// - Running on a body with a single block and 500 locals takes barely any time
// - Running on a body with ~400 blocks and ~300 relevant locals takes "too long"
// ...so we just limit both to somewhat reasonable-ish looking values.
const MAX_LOCALS: usize = 500;
const MAX_BLOCKS: usize = 250;
pub struct DestinationPropagation;

impl<'tcx> MirPass<'tcx> for DestinationPropagation {
    fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
        // FIXME(#79191, #82678)
        if !tcx.sess.opts.debugging_opts.unsound_mir_opts {
            return;
        }

        // Only run at mir-opt-level=3 or higher for now (we don't fix up debuginfo and remove
        // storage statements at the moment).
        if tcx.sess.mir_opt_level() < 3 {
            return;
        }

        let def_id = body.source.def_id();

        let candidates = find_candidates(tcx, body);
        if candidates.is_empty() {
            debug!("{:?}: no dest prop candidates, done", def_id);
            return;
        }

        // Collect all locals we care about. We only compute conflicts for these to save time.
        let mut relevant_locals = BitSet::new_empty(body.local_decls.len());
        for CandidateAssignment { dest, src, loc: _ } in &candidates {
            relevant_locals.insert(dest.local);
            relevant_locals.insert(*src);
        }

        // This pass unfortunately has `O(l² * s)` performance, where `l` is the number of locals
        // and `s` is the number of statements and terminators in the function.
        // To prevent blowing up compile times too much, we bail out when there are too many locals.
        let relevant = relevant_locals.count();
        debug!(
            "{:?}: {} locals ({} relevant), {} blocks",
            def_id,
            body.local_decls.len(),
            relevant,
            body.basic_blocks().len()
        );
        if relevant > MAX_LOCALS {
            warn!(
                "too many candidate locals in {:?} ({}, max is {}), not optimizing",
                def_id, relevant, MAX_LOCALS
            );
            return;
        }
        if body.basic_blocks().len() > MAX_BLOCKS {
            warn!(
                "too many blocks in {:?} ({}, max is {}), not optimizing",
                def_id,
                body.basic_blocks().len(),
                MAX_BLOCKS
            );
            return;
        }

        let mut conflicts = Conflicts::build(tcx, body, &relevant_locals);

        let mut replacements = Replacements::new(body.local_decls.len());
        for candidate @ CandidateAssignment { dest, src, loc } in candidates {
            // Merge locals that don't conflict.
            if !conflicts.can_unify(dest.local, src) {
                debug!("at assignment {:?}, conflict {:?} vs. {:?}", loc, dest.local, src);
                continue;
            }

            if replacements.for_src(candidate.src).is_some() {
                debug!("src {:?} already has replacement", candidate.src);
                continue;
            }

            if !tcx.consider_optimizing(|| {
                format!("DestinationPropagation {:?} {:?}", def_id, candidate)
            }) {
                break;
            }

            replacements.push(candidate);
            conflicts.unify(candidate.src, candidate.dest.local);
        }

        replacements.flatten(tcx);

        debug!("replacements {:?}", replacements.map);

        Replacer { tcx, replacements, place_elem_cache: Vec::new() }.visit_body(body);

        // FIXME fix debug info
    }
}
#[derive(Debug, Eq, PartialEq, Copy, Clone)]
struct UnifyLocal(Local);

impl From<Local> for UnifyLocal {
    fn from(l: Local) -> Self {
        Self(l)
    }
}

impl UnifyKey for UnifyLocal {
    type Value = ();
    fn index(&self) -> u32 {
        self.0.as_u32()
    }
    fn from_index(u: u32) -> Self {
        Self(Local::from_u32(u))
    }
    fn tag() -> &'static str {
        "UnifyLocal"
    }
}
struct Replacements<'tcx> {
    /// Maps locals to their replacement.
    map: IndexVec<Local, Option<Place<'tcx>>>,

    /// Locals whose live ranges (storage statements) should be killed.
    kill: BitSet<Local>,
}

impl Replacements<'tcx> {
    fn new(locals: usize) -> Self {
        Self { map: IndexVec::from_elem_n(None, locals), kill: BitSet::new_empty(locals) }
    }

    fn push(&mut self, candidate: CandidateAssignment<'tcx>) {
        trace!("Replacements::push({:?})", candidate);
        let entry = &mut self.map[candidate.src];
        assert!(entry.is_none());

        *entry = Some(candidate.dest);
        self.kill.insert(candidate.src);
        self.kill.insert(candidate.dest.local);
    }

    /// Applies the stored replacements to all replacements, until no replacements would result in
    /// locals that need further replacements when applied.
    fn flatten(&mut self, tcx: TyCtxt<'tcx>) {
        // Note: This assumes that there are no cycles in the replacements, which is enforced via
        // `self.unified_locals`. Otherwise this can cause an infinite loop.

        for local in self.map.indices() {
            if let Some(replacement) = self.map[local] {
                // Substitute the base local of `replacement` until fixpoint.
                let mut base = replacement.local;
                let mut reversed_projection_slices = Vec::with_capacity(1);
                while let Some(replacement_for_replacement) = self.map[base] {
                    base = replacement_for_replacement.local;
                    reversed_projection_slices.push(replacement_for_replacement.projection);
                }

                let projection: Vec<_> = reversed_projection_slices
                    .iter()
                    .rev()
                    .flat_map(|projs| projs.iter())
                    .chain(replacement.projection.iter())
                    .collect();
                let projection = tcx.intern_place_elems(&projection);

                // Replace with the final `Place`.
                self.map[local] = Some(Place { local: base, projection });
            }
        }
    }

    fn for_src(&self, src: Local) -> Option<Place<'tcx>> {
        self.map[src]
    }
}
struct Replacer<'tcx> {
    tcx: TyCtxt<'tcx>,
    replacements: Replacements<'tcx>,
    place_elem_cache: Vec<PlaceElem<'tcx>>,
}

impl<'tcx> MutVisitor<'tcx> for Replacer<'tcx> {
    fn tcx<'a>(&'a self) -> TyCtxt<'tcx> {
        self.tcx
    }

    fn visit_local(&mut self, local: &mut Local, context: PlaceContext, location: Location) {
        if context.is_use() && self.replacements.for_src(*local).is_some() {
            bug!(
                "use of local {:?} should have been replaced by visit_place; context={:?}, loc={:?}",
                local,
                context,
                location,
            );
        }
    }

    fn process_projection_elem(
        &mut self,
        elem: PlaceElem<'tcx>,
        _: Location,
    ) -> Option<PlaceElem<'tcx>> {
        match elem {
            PlaceElem::Index(local) => {
                if let Some(replacement) = self.replacements.for_src(local) {
                    bug!(
                        "cannot replace {:?} with {:?} in index projection {:?}",
                        local,
                        replacement,
                        elem,
                    );
                }

                None
            }
            _ => None,
        }
    }

    fn visit_place(&mut self, place: &mut Place<'tcx>, context: PlaceContext, location: Location) {
        if let Some(replacement) = self.replacements.for_src(place.local) {
            // Rebase `place`s projections onto `replacement`'s.
            self.place_elem_cache.clear();
            self.place_elem_cache.extend(replacement.projection.iter().chain(place.projection));
            let projection = self.tcx.intern_place_elems(&self.place_elem_cache);
            let new_place = Place { local: replacement.local, projection };

            debug!("Replacer: {:?} -> {:?}", place, new_place);
            *place = new_place;
        }

        self.super_place(place, context, location);
    }

    fn visit_statement(&mut self, statement: &mut Statement<'tcx>, location: Location) {
        self.super_statement(statement, location);

        match &statement.kind {
            // FIXME: Don't delete storage statements, merge the live ranges instead
            StatementKind::StorageDead(local) | StatementKind::StorageLive(local)
                if self.replacements.kill.contains(*local) =>
            {
                statement.make_nop()
            }

            StatementKind::Assign(box (dest, rvalue)) => {
                match rvalue {
                    Rvalue::Use(Operand::Copy(place) | Operand::Move(place)) => {
                        // These might've been turned into self-assignments by the replacement
                        // (this includes the original statement we wanted to eliminate).
                        if dest == place {
                            debug!("{:?} turned into self-assignment, deleting", location);
                            statement.make_nop();
                        }
                    }
                    _ => {}
                }
            }

            _ => {}
        }
    }
}
struct Conflicts<'a> {
    relevant_locals: &'a BitSet<Local>,

    /// The conflict matrix. It is always symmetric and the adjacency matrix of the corresponding
    /// conflict graph.
    matrix: BitMatrix<Local, Local>,

    /// Preallocated `BitSet` used by `unify`.
    unify_cache: BitSet<Local>,

    /// Tracks locals that have been merged together to prevent cycles and propagate conflicts.
    unified_locals: InPlaceUnificationTable<UnifyLocal>,
}
impl Conflicts<'a> {
    fn build<'tcx>(
        tcx: TyCtxt<'tcx>,
        body: &'_ Body<'tcx>,
        relevant_locals: &'a BitSet<Local>,
    ) -> Self {
        // We don't have to look out for locals that have their address taken, since
        // `find_candidates` already takes care of that.

        let conflicts = BitMatrix::from_row_n(
            &BitSet::new_empty(body.local_decls.len()),
            body.local_decls.len(),
        );

        let mut init = MaybeInitializedLocals
            .into_engine(tcx, body)
            .iterate_to_fixpoint()
            .into_results_cursor(body);
        let mut live =
            MaybeLiveLocals.into_engine(tcx, body).iterate_to_fixpoint().into_results_cursor(body);

        let mut reachable = None;
        dump_mir(tcx, None, "DestinationPropagation-dataflow", &"", body, |pass_where, w| {
            let reachable = reachable.get_or_insert_with(|| traversal::reachable_as_bitset(body));

            match pass_where {
                PassWhere::BeforeLocation(loc) if reachable.contains(loc.block) => {
                    init.seek_before_primary_effect(loc);
                    live.seek_after_primary_effect(loc);

                    writeln!(w, "        // init: {:?}", init.get())?;
                    writeln!(w, "        // live: {:?}", live.get())?;
                }
                PassWhere::AfterTerminator(bb) if reachable.contains(bb) => {
                    let loc = body.terminator_loc(bb);
                    init.seek_after_primary_effect(loc);
                    live.seek_before_primary_effect(loc);

                    writeln!(w, "        // init: {:?}", init.get())?;
                    writeln!(w, "        // live: {:?}", live.get())?;
                }

                PassWhere::BeforeBlock(bb) if reachable.contains(bb) => {
                    init.seek_to_block_start(bb);
                    live.seek_to_block_start(bb);

                    writeln!(w, "    // init: {:?}", init.get())?;
                    writeln!(w, "    // live: {:?}", live.get())?;
                }

                PassWhere::BeforeCFG | PassWhere::AfterCFG | PassWhere::AfterLocation(_) => {}

                PassWhere::BeforeLocation(_) | PassWhere::AfterTerminator(_) => {
                    writeln!(w, "        // init: <unreachable>")?;
                    writeln!(w, "        // live: <unreachable>")?;
                }

                PassWhere::BeforeBlock(_) => {
                    writeln!(w, "    // init: <unreachable>")?;
                    writeln!(w, "    // live: <unreachable>")?;
                }
            }

            Ok(())
        });

        let mut this = Self {
            relevant_locals,
            matrix: conflicts,
            unify_cache: BitSet::new_empty(body.local_decls.len()),
            unified_locals: {
                let mut table = InPlaceUnificationTable::new();
                // Pre-fill table with all locals (this creates N nodes / "connected" components,
                // "graph"-ically speaking).
                for local in 0..body.local_decls.len() {
                    assert_eq!(table.new_key(()), UnifyLocal(Local::from_usize(local)));
                }
                table
            },
        };

        let mut live_and_init_locals = Vec::new();

        // Visit only reachable basic blocks. The exact order is not important.
        for (block, data) in traversal::preorder(body) {
            // We need to observe the dataflow state *before* all possible locations (statement or
            // terminator) in each basic block, and then observe the state *after* the terminator
            // effect is applied. As long as neither `init` nor `borrowed` has a "before" effect,
            // we will observe all possible dataflow states.

            // Since liveness is a backwards analysis, we need to walk the results backwards. To do
            // that, we first collect the `MaybeInitializedLocals` results in a forwards
            // traversal.

            live_and_init_locals.resize_with(data.statements.len() + 1, || {
                BitSet::new_empty(body.local_decls.len())
            });

            // First, go forwards for `MaybeInitializedLocals` and apply intra-statement/terminator
            // conflicts.
            for (i, statement) in data.statements.iter().enumerate() {
                this.record_statement_conflicts(statement);

                let loc = Location { block, statement_index: i };
                init.seek_before_primary_effect(loc);

                live_and_init_locals[i].clone_from(init.get());
            }

            this.record_terminator_conflicts(data.terminator());
            let term_loc = Location { block, statement_index: data.statements.len() };
            init.seek_before_primary_effect(term_loc);
            live_and_init_locals[term_loc.statement_index].clone_from(init.get());

            // Now, go backwards and union with the liveness results.
            for statement_index in (0..=data.statements.len()).rev() {
                let loc = Location { block, statement_index };
                live.seek_after_primary_effect(loc);

                live_and_init_locals[statement_index].intersect(live.get());

                trace!("record conflicts at {:?}", loc);

                this.record_dataflow_conflicts(&mut live_and_init_locals[statement_index]);
            }

            init.seek_to_block_end(block);
            live.seek_to_block_end(block);
            let mut conflicts = init.get().clone();
            conflicts.intersect(live.get());
            trace!("record conflicts at end of {:?}", block);

            this.record_dataflow_conflicts(&mut conflicts);
        }

        this
    }
    fn record_dataflow_conflicts(&mut self, new_conflicts: &mut BitSet<Local>) {
        // Remove all locals that are not candidates.
        new_conflicts.intersect(self.relevant_locals);

        for local in new_conflicts.iter() {
            self.matrix.union_row_with(&new_conflicts, local);
        }
    }

    fn record_local_conflict(&mut self, a: Local, b: Local, why: &str) {
        trace!("conflict {:?} <-> {:?} due to {}", a, b, why);
        self.matrix.insert(a, b);
        self.matrix.insert(b, a);
    }
    /// Records locals that must not overlap during the evaluation of `stmt`. These locals conflict
    /// and must not be merged.
    fn record_statement_conflicts(&mut self, stmt: &Statement<'_>) {
        match &stmt.kind {
            // While the left and right sides of an assignment must not overlap, we do not mark
            // conflicts here as that would make this optimization useless. When we optimize, we
            // eliminate the resulting self-assignments automatically.
            StatementKind::Assign(_) => {}

            StatementKind::LlvmInlineAsm(asm) => {
                // Inputs and outputs must not overlap.
                for (_, input) in &*asm.inputs {
                    if let Some(in_place) = input.place() {
                        if !in_place.is_indirect() {
                            for out_place in &*asm.outputs {
                                if !out_place.is_indirect() && !in_place.is_indirect() {
                                    self.record_local_conflict(
                                        in_place.local,
                                        out_place.local,
                                        "aliasing llvm_asm! operands",
                                    );
                                }
                            }
                        }
                    }
                }
            }

            StatementKind::SetDiscriminant { .. }
            | StatementKind::StorageLive(..)
            | StatementKind::StorageDead(..)
            | StatementKind::Retag(..)
            | StatementKind::FakeRead(..)
            | StatementKind::AscribeUserType(..)
            | StatementKind::Coverage(..)
            | StatementKind::CopyNonOverlapping(..)
            | StatementKind::Nop => {}
        }
    }
    fn record_terminator_conflicts(&mut self, term: &Terminator<'_>) {
        match &term.kind {
            TerminatorKind::DropAndReplace {
                place: dropped_place,
                value,
                target: _,
                unwind: _,
            } => {
                if let Some(place) = value.place() {
                    if !place.is_indirect() && !dropped_place.is_indirect() {
                        self.record_local_conflict(
                            place.local,
                            dropped_place.local,
                            "DropAndReplace operand overlap",
                        );
                    }
                }
            }
            TerminatorKind::Yield { value, resume: _, resume_arg, drop: _ } => {
                if let Some(place) = value.place() {
                    if !place.is_indirect() && !resume_arg.is_indirect() {
                        self.record_local_conflict(
                            place.local,
                            resume_arg.local,
                            "Yield operand overlap",
                        );
                    }
                }
            }
            TerminatorKind::Call {
                func,
                args,
                destination: Some((dest_place, _)),
                cleanup: _,
                from_hir_call: _,
                fn_span: _,
            } => {
                // No arguments may overlap with the destination.
                for arg in args.iter().chain(Some(func)) {
                    if let Some(place) = arg.place() {
                        if !place.is_indirect() && !dest_place.is_indirect() {
                            self.record_local_conflict(
                                dest_place.local,
                                place.local,
                                "call dest/arg overlap",
                            );
                        }
                    }
                }
            }
            TerminatorKind::InlineAsm {
                template: _,
                operands,
                options: _,
                line_spans: _,
                destination: _,
            } => {
                // The intended semantics here aren't documented, we just assume that nothing that
                // could be written to by the assembly may overlap with any other operands.
                for op in operands {
                    match op {
                        InlineAsmOperand::Out { reg: _, late: _, place: Some(dest_place) }
                        | InlineAsmOperand::InOut {
                            reg: _,
                            late: _,
                            in_value: _,
                            out_place: Some(dest_place),
                        } => {
                            // For output place `place`, add all places accessed by the inline asm.
                            for op in operands {
                                match op {
                                    InlineAsmOperand::In { reg: _, value } => {
                                        if let Some(p) = value.place() {
                                            if !p.is_indirect() && !dest_place.is_indirect() {
                                                self.record_local_conflict(
                                                    p.local,
                                                    dest_place.local,
                                                    "asm! operand overlap",
                                                );
                                            }
                                        }
                                    }
                                    InlineAsmOperand::Out {
                                        reg: _,
                                        late: _,
                                        place: Some(place),
                                    } => {
                                        if !place.is_indirect() && !dest_place.is_indirect() {
                                            self.record_local_conflict(
                                                place.local,
                                                dest_place.local,
                                                "asm! operand overlap",
                                            );
                                        }
                                    }
                                    InlineAsmOperand::InOut {
                                        reg: _,
                                        late: _,
                                        in_value,
                                        out_place,
                                    } => {
                                        if let Some(place) = in_value.place() {
                                            if !place.is_indirect() && !dest_place.is_indirect() {
                                                self.record_local_conflict(
                                                    place.local,
                                                    dest_place.local,
                                                    "asm! operand overlap",
                                                );
                                            }
                                        }

                                        if let Some(place) = out_place {
                                            if !place.is_indirect() && !dest_place.is_indirect() {
                                                self.record_local_conflict(
                                                    place.local,
                                                    dest_place.local,
                                                    "asm! operand overlap",
                                                );
                                            }
                                        }
                                    }
                                    InlineAsmOperand::Out { reg: _, late: _, place: None }
                                    | InlineAsmOperand::Const { value: _ }
                                    | InlineAsmOperand::SymFn { value: _ }
                                    | InlineAsmOperand::SymStatic { def_id: _ } => {}
                                }
                            }
                        }
                        InlineAsmOperand::Const { value } => {
                            assert!(value.place().is_none());
                        }
                        InlineAsmOperand::InOut {
                            reg: _,
                            late: _,
                            in_value: _,
                            out_place: None,
                        }
                        | InlineAsmOperand::In { reg: _, value: _ }
                        | InlineAsmOperand::Out { reg: _, late: _, place: None }
                        | InlineAsmOperand::SymFn { value: _ }
                        | InlineAsmOperand::SymStatic { def_id: _ } => {}
                    }
                }
            }
            TerminatorKind::Goto { .. }
            | TerminatorKind::Call { destination: None, .. }
            | TerminatorKind::SwitchInt { .. }
            | TerminatorKind::Resume
            | TerminatorKind::Abort
            | TerminatorKind::Return
            | TerminatorKind::Unreachable
            | TerminatorKind::Drop { .. }
            | TerminatorKind::Assert { .. }
            | TerminatorKind::GeneratorDrop
            | TerminatorKind::FalseEdge { .. }
            | TerminatorKind::FalseUnwind { .. } => {}
        }
    }
    /// Checks whether `a` and `b` may be merged. Returns `false` if there's a conflict.
    fn can_unify(&mut self, a: Local, b: Local) -> bool {
        // After some locals have been unified, their conflicts are only tracked in the root key,
        // so we have to look there.
        let a = self.unified_locals.find(a).0;
        let b = self.unified_locals.find(b).0;

        if a == b {
            // Already merged (part of the same connected component).
            return false;
        }

        if self.matrix.contains(a, b) {
            // Conflict (derived via dataflow, intra-statement conflicts, or inherited from another
            // local during unification).
            return false;
        }

        true
    }

    /// Merges the conflicts of `a` and `b`, so that each one inherits all conflicts of the other.
    ///
    /// `can_unify` must have returned `true` for the same locals, or this may panic or lead to
    /// miscompilations.
    ///
    /// This is called when the pass makes the decision to unify `a` and `b` (or parts of `a` and
    /// `b`) and is needed to ensure that future unification decisions take potentially newly
    /// introduced conflicts into account.
    ///
    /// For an example, assume we have locals `_0`, `_1`, `_2`, and `_3`. There are these conflicts:
    ///
    /// * `_0` <-> `_1`
    /// * `_1` <-> `_2`
    /// * `_3` <-> `_0`
    ///
    /// We then decide to merge `_2` with `_3` since they don't conflict. Then we decide to merge
    /// `_2` with `_0`, which also doesn't have a conflict in the above list. However `_2` is now
    /// `_3`, which does conflict with `_0`.
    fn unify(&mut self, a: Local, b: Local) {
        trace!("unify({:?}, {:?})", a, b);

        // Get the root local of the connected components. The root local stores the conflicts of
        // all locals in the connected component (and *is stored* as the conflicting local of other
        // locals).
        let a = self.unified_locals.find(a).0;
        let b = self.unified_locals.find(b).0;
        assert_ne!(a, b);

        trace!("roots: a={:?}, b={:?}", a, b);
        trace!("{:?} conflicts: {:?}", a, self.matrix.iter(a).format(", "));
        trace!("{:?} conflicts: {:?}", b, self.matrix.iter(b).format(", "));

        self.unified_locals.union(a, b);

        let root = self.unified_locals.find(a).0;
        assert!(root == a || root == b);

        // Make all locals that conflict with `a` also conflict with `b`, and vice versa.
        self.unify_cache.clear();
        for conflicts_with_a in self.matrix.iter(a) {
            self.unify_cache.insert(conflicts_with_a);
        }
        for conflicts_with_b in self.matrix.iter(b) {
            self.unify_cache.insert(conflicts_with_b);
        }
        for conflicts_with_a_or_b in self.unify_cache.iter() {
            // Set both `a` and `b` for this local's row.
            self.matrix.insert(conflicts_with_a_or_b, a);
            self.matrix.insert(conflicts_with_a_or_b, b);
        }

        // Write the locals `a` conflicts with to `b`'s row.
        self.matrix.union_rows(a, b);
        // Write the locals `b` conflicts with to `a`'s row.
        self.matrix.union_rows(b, a);
    }
}
/// A `dest = {move} src;` statement at `loc`.
///
/// We want to consider merging `dest` and `src` due to this assignment.
#[derive(Debug, Copy, Clone)]
struct CandidateAssignment<'tcx> {
    /// Does not contain indirection or indexing (so the only local it contains is the place base).
    dest: Place<'tcx>,
    src: Local,
    loc: Location,
}

/// Scans the MIR for assignments between locals that we might want to consider merging.
///
/// This will filter out assignments that do not match the right form (as described in the top-level
/// comment) and also throw out assignments that involve a local that has its address taken or is
/// otherwise ineligible (eg. locals used as array indices are ignored because we cannot propagate
/// arbitrary places into array indices).
fn find_candidates<'a, 'tcx>(
    tcx: TyCtxt<'tcx>,
    body: &'a Body<'tcx>,
) -> Vec<CandidateAssignment<'tcx>> {
    let mut visitor = FindAssignments {
        tcx,
        body,
        candidates: Vec::new(),
        ever_borrowed_locals: ever_borrowed_locals(body),
        locals_used_as_array_index: locals_used_as_array_index(body),
    };
    visitor.visit_body(body);
    visitor.candidates
}
struct FindAssignments<'a, 'tcx> {
    tcx: TyCtxt<'tcx>,
    body: &'a Body<'tcx>,
    candidates: Vec<CandidateAssignment<'tcx>>,
    ever_borrowed_locals: BitSet<Local>,
    locals_used_as_array_index: BitSet<Local>,
}

impl<'a, 'tcx> Visitor<'tcx> for FindAssignments<'a, 'tcx> {
    fn visit_statement(&mut self, statement: &Statement<'tcx>, location: Location) {
        if let StatementKind::Assign(box (
            dest,
            Rvalue::Use(Operand::Copy(src) | Operand::Move(src)),
        )) = &statement.kind
        {
            // `dest` must not have pointer indirection.
            if dest.is_indirect() {
                return;
            }

            // `src` must be a plain local.
            if !src.projection.is_empty() {
                return;
            }

            // Since we want to replace `src` with `dest`, `src` must not be required.
            if is_local_required(src.local, self.body) {
                return;
            }

            // Can't optimize if both locals ever have their address taken (can introduce
            // aliasing).
            // FIXME: This can be smarter and take `StorageDead` into account (which
            // invalidates borrows).
            if self.ever_borrowed_locals.contains(dest.local)
                || self.ever_borrowed_locals.contains(src.local)
            {
                return;
            }

            assert_ne!(dest.local, src.local, "self-assignments are UB");

            // We can't replace locals occurring in `PlaceElem::Index` for now.
            if self.locals_used_as_array_index.contains(src.local) {
                return;
            }

            // Handle the "subtle case" described above by rejecting any `dest` that is or
            // projects through a union.
            let is_union = |ty: Ty<'_>| {
                if let ty::Adt(def, _) = ty.kind() {
                    if def.is_union() {
                        return true;
                    }
                }

                false
            };
            let mut place_ty = PlaceTy::from_ty(self.body.local_decls[dest.local].ty);
            if is_union(place_ty.ty) {
                return;
            }
            for elem in dest.projection {
                if let PlaceElem::Index(_) = elem {
                    // `dest` contains an indexing projection.
                    return;
                }

                place_ty = place_ty.projection_ty(self.tcx, elem);
                if is_union(place_ty.ty) {
                    return;
                }
            }

            self.candidates.push(CandidateAssignment {
                dest: *dest,
                src: src.local,
                loc: location,
            });
        }
    }
}
/// Some locals are part of the function's interface and can not be removed.
///
/// Note that these locals *can* still be merged with non-required locals by removing that other
/// local.
fn is_local_required(local: Local, body: &Body<'_>) -> bool {
    match body.local_kind(local) {
        LocalKind::Arg | LocalKind::ReturnPointer => true,
        LocalKind::Var | LocalKind::Temp => false,
    }
}
/// Walks MIR to find all locals that have their address taken anywhere.
fn ever_borrowed_locals(body: &Body<'_>) -> BitSet<Local> {
    let mut visitor = BorrowCollector { locals: BitSet::new_empty(body.local_decls.len()) };
    visitor.visit_body(body);
    visitor.locals
}

struct BorrowCollector {
    locals: BitSet<Local>,
}

impl<'tcx> Visitor<'tcx> for BorrowCollector {
    fn visit_rvalue(&mut self, rvalue: &Rvalue<'tcx>, location: Location) {
        self.super_rvalue(rvalue, location);

        match rvalue {
            Rvalue::AddressOf(_, borrowed_place) | Rvalue::Ref(_, _, borrowed_place) => {
                if !borrowed_place.is_indirect() {
                    self.locals.insert(borrowed_place.local);
                }
            }

            Rvalue::Cast(..)
            | Rvalue::Use(..)
            | Rvalue::Repeat(..)
            | Rvalue::Len(..)
            | Rvalue::BinaryOp(..)
            | Rvalue::CheckedBinaryOp(..)
            | Rvalue::NullaryOp(..)
            | Rvalue::UnaryOp(..)
            | Rvalue::Discriminant(..)
            | Rvalue::Aggregate(..)
            | Rvalue::ThreadLocalRef(..) => {}
        }
    }

    fn visit_terminator(&mut self, terminator: &Terminator<'tcx>, location: Location) {
        self.super_terminator(terminator, location);

        match terminator.kind {
            TerminatorKind::Drop { place: dropped_place, .. }
            | TerminatorKind::DropAndReplace { place: dropped_place, .. } => {
                self.locals.insert(dropped_place.local);
            }

            TerminatorKind::Abort
            | TerminatorKind::Assert { .. }
            | TerminatorKind::Call { .. }
            | TerminatorKind::FalseEdge { .. }
            | TerminatorKind::FalseUnwind { .. }
            | TerminatorKind::GeneratorDrop
            | TerminatorKind::Goto { .. }
            | TerminatorKind::Resume
            | TerminatorKind::Return
            | TerminatorKind::SwitchInt { .. }
            | TerminatorKind::Unreachable
            | TerminatorKind::Yield { .. }
            | TerminatorKind::InlineAsm { .. } => {}
        }
    }
}
/// `PlaceElem::Index` only stores a `Local`, so we can't replace that with a full `Place`.
///
/// Collect locals used as indices so we don't generate candidates that are impossible to apply
/// later.
fn locals_used_as_array_index(body: &Body<'_>) -> BitSet<Local> {
    let mut visitor = IndexCollector { locals: BitSet::new_empty(body.local_decls.len()) };
    visitor.visit_body(body);
    visitor.locals
}

struct IndexCollector {
    locals: BitSet<Local>,
}

impl<'tcx> Visitor<'tcx> for IndexCollector {
    fn visit_projection_elem(
        &mut self,
        local: Local,
        proj_base: &[PlaceElem<'tcx>],
        elem: PlaceElem<'tcx>,
        context: PlaceContext,
        location: Location,
    ) {
        if let PlaceElem::Index(i) = elem {
            self.locals.insert(i);
        }
        self.super_projection_elem(local, proj_base, elem, context, location);
    }
}