compiler/rustc_mir/src/transform/dest_prop.rs (rustc, upstream version 1.52.0~beta.3)
1 //! Propagates assignment destinations backwards in the CFG to eliminate redundant assignments.
2 //!
3 //! # Motivation
4 //!
5 //! MIR building can insert a lot of redundant copies, and Rust code in general often tends to move
6 //! values around a lot. The result is a lot of assignments of the form `dest = {move} src;` in MIR.
7 //! MIR building for constants in particular tends to create additional locals that are only used
8 //! inside a single block to shuffle a value around unnecessarily.
9 //!
10 //! LLVM by itself is not good enough at eliminating these redundant copies (e.g. see
11 //! <https://github.com/rust-lang/rust/issues/32966>), so this leaves some performance on the table
12 //! that we can regain by implementing an optimization for removing these assign statements in rustc
13 //! itself. When this optimization runs fast enough, it can also speed up the constant evaluation
14 //! and code generation phases of rustc due to the reduced number of statements and locals.
15 //!
16 //! # The Optimization
17 //!
18 //! Conceptually, this optimization is "destination propagation". It is similar to the Named Return
19 //! Value Optimization, or NRVO, known from the C++ world, except that it isn't limited to return
20 //! values or the return place `_0`. On a very high level, independent of the actual implementation
21 //! details, it does the following:
22 //!
23 //! 1) Identify `dest = src;` statements that can be soundly eliminated.
24 //! 2) Replace all mentions of `src` with `dest` ("unifying" them and propagating the destination
25 //! backwards).
26 //! 3) Delete the `dest = src;` statement (by making it a `nop`).
27 //!
28 //! Step 1) is by far the hardest, so it is explained in more detail below.
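As a toy illustration of steps 2) and 3), here is a sketch in plain Rust over string "locals". This is illustrative only: real MIR uses `Place` and `Rvalue`, not strings, and step 1) (candidate selection) is assumed to have already happened.

```rust
// Toy model of steps 2) and 3); real MIR uses `Place`/`Rvalue`, not strings.
#[derive(Debug, PartialEq)]
enum Stmt {
    Assign { dest: &'static str, src: &'static str }, // `dest = src;`
    Use(&'static str),                                // any other use of a local
    Nop,
}

// Assumes step 1) already picked `(dest, src, idx)` as a sound candidate.
fn apply_candidate(stmts: &mut [Stmt], dest: &'static str, src: &'static str, idx: usize) {
    // Step 2): replace all mentions of `src` with `dest`.
    for s in stmts.iter_mut() {
        if let Stmt::Use(l) = s {
            if *l == src {
                *s = Stmt::Use(dest);
            }
        }
    }
    // Step 3): the candidate is now `dest = dest;`, so delete it (make it a nop).
    stmts[idx] = Stmt::Nop;
}

fn main() {
    let mut body = vec![
        Stmt::Use("_2"),                        // `_2` is computed here...
        Stmt::Assign { dest: "_1", src: "_2" }, // ...then copied into `_1`...
        Stmt::Use("_1"),                        // ...and only `_1` is used afterwards.
    ];
    apply_candidate(&mut body, "_1", "_2", 1);
    assert_eq!(body, vec![Stmt::Use("_1"), Stmt::Nop, Stmt::Use("_1")]);
    println!("{:?}", body);
}
```

Note how the use *before* the assignment is rewritten too; that is the "backwards" propagation of the destination.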
29 //!
30 //! ## Soundness
31 //!
32 //! Given an `Assign` statement `dest = src;`, where `dest` is a `Place` and `src` is an `Rvalue`,
33 //! there are a few requirements that must hold for the optimization to be sound:
34 //!
35 //! * `dest` must not contain any *indirection* through a pointer. It must access part of the base
36 //! local. Otherwise it might point to arbitrary memory that is hard to track.
37 //!
38 //! It must also not contain any indexing projections, since those take an arbitrary `Local` as
39 //! the index, and that local might only be initialized shortly before `dest` is used.
40 //!
41 //! Subtle case: If `dest` is a union, or projects through one, then we have to make sure that there
42 //! remains an assignment to it, since that sets the "active field" of the union. But if `src` is
43 //! a ZST, it might not be initialized, so there might not be any use of it before the assignment,
44 //! and performing the optimization would simply delete the assignment, leaving `dest`
45 //! uninitialized.
46 //!
47 //! * `src` must be a bare `Local` without any indirections or field projections (FIXME: Is this a
48 //! fundamental restriction or just current impl state?). It can be copied or moved by the
49 //! assignment.
50 //!
51 //! * The `dest` and `src` locals must never be [*live*][liveness] at the same time. If they are, it
52 //! means that they both hold a (potentially different) value that is needed by a future use of
53 //! the locals. Unifying them would overwrite one of the values.
54 //!
55 //! Note that computing liveness of locals that have had their address taken is more difficult:
56 //! Short of doing full escape analysis on the address/pointer/reference, the pass would need to
57 //! assume that any operation that can potentially involve opaque user code (such as function
58 //! calls, destructors, and inline assembly) may access any local that had its address taken
59 //! before that point.
60 //!
61 //! Here, the first two conditions are simple structural requirements on the `Assign` statements
62 //! that can be trivially checked. The liveness requirement however is more difficult and costly to
63 //! check.
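To make the liveness condition concrete, here is a deliberately simplified, straight-line-only sketch. This is an assumed toy encoding, not the dataflow framework the pass actually uses (`MaybeLiveLocals` over the whole CFG): a local is live at the program points strictly after its definition up to its last use, and two locals conflict iff those point ranges overlap.

```rust
// Toy, single-block liveness; each local has at most one def and one last use.
#[derive(Clone, Copy)]
enum Event {
    Def(usize), // local is assigned
    Use(usize), // local is read
}

// Program points where `local` is live: strictly after its def, up to (and
// including) the point just before its last use. `None` if never live.
fn live_points(events: &[Event], local: usize) -> Option<(usize, usize)> {
    let def = events.iter().position(|e| matches!(e, Event::Def(l) if *l == local))?;
    let last_use = events.iter().rposition(|e| matches!(e, Event::Use(l) if *l == local))?;
    (def < last_use).then(|| (def + 1, last_use))
}

// Two locals conflict iff there is a program point where both are live.
fn conflict(events: &[Event], a: usize, b: usize) -> bool {
    match (live_points(events, a), live_points(events, b)) {
        (Some((s1, e1)), Some((s2, e2))) => s1.max(s2) <= e1.min(e2),
        _ => false,
    }
}

fn main() {
    use Event::*;
    // `_2 = ...; _1 = _2; use(_1);` -- the use of `_2` happens before the def
    // of `_1`, so their live ranges never overlap: they can be merged.
    let candidate = [Def(2), Use(2), Def(1), Use(1)];
    assert!(!conflict(&candidate, 1, 2));

    // `_1 = ...; _2 = ...; use(_1); use(_2);` -- both live at once: conflict.
    let overlapping = [Def(1), Def(2), Use(1), Use(2)];
    assert!(conflict(&overlapping, 1, 2));
    println!("ok");
}
```

The key subtlety the toy encodes: at the candidate `dest = src;` itself, `src` dies as `dest` is born, so the assignment alone never makes the two locals conflict.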
64 //!
65 //! ## Previous Work
66 //!
67 //! A [previous attempt] at implementing an optimization like this turned out to be a significant
68 //! regression in compiler performance. Fixing the regressions introduced a lot of undesirable
69 //! complexity to the implementation.
70 //!
71 //! A [subsequent approach] tried to avoid the costly computation by limiting itself to acyclic
72 //! CFGs, but still turned out to be far too costly to run due to suboptimal performance within
73 //! individual basic blocks, requiring a walk across the entire block for every assignment found
74 //! within the block. For the `tuple-stress` benchmark, which has 458745 statements in a single
75 //! block, this proved to be far too costly.
76 //!
77 //! Since the first attempt at this, the compiler has improved dramatically, and new analysis
78 //! frameworks have been added that should make this approach viable without requiring a limited
79 //! approach that only works for some classes of CFGs:
80 //! - rustc now has a powerful dataflow analysis framework that can handle forwards and backwards
81 //! analyses efficiently.
82 //! - Layout optimizations for generators have been added to improve code generation for
83 //! async/await, which are very similar in spirit to what this optimization does. Both walk the
84 //! MIR and record conflicting uses of locals in a `BitMatrix`.
85 //!
86 //! Also, rustc now has a simple NRVO pass (see `nrvo.rs`), which handles a subset of the cases that
87 //! this destination propagation pass handles, proving that similar optimizations can be performed
88 //! on MIR.
89 //!
90 //! ## Pre/Post Optimization
91 //!
92 //! It is recommended to run `SimplifyCfg` and then `SimplifyLocals` some time after this pass, as
93 //! it replaces the eliminated assign statements with `nop`s and leaves unused locals behind.
94 //!
95 //! [liveness]: https://en.wikipedia.org/wiki/Live_variable_analysis
96 //! [previous attempt]: https://github.com/rust-lang/rust/pull/47954
97 //! [subsequent approach]: https://github.com/rust-lang/rust/pull/71003
98
99 use crate::dataflow::impls::{MaybeInitializedLocals, MaybeLiveLocals};
100 use crate::dataflow::Analysis;
101 use crate::{
102 transform::MirPass,
103 util::{dump_mir, PassWhere},
104 };
105 use itertools::Itertools;
106 use rustc_data_structures::unify::{InPlaceUnificationTable, UnifyKey};
107 use rustc_index::{
108 bit_set::{BitMatrix, BitSet},
109 vec::IndexVec,
110 };
111 use rustc_middle::mir::tcx::PlaceTy;
112 use rustc_middle::mir::visit::{MutVisitor, PlaceContext, Visitor};
113 use rustc_middle::mir::{
114 traversal, Body, InlineAsmOperand, Local, LocalKind, Location, Operand, Place, PlaceElem,
115 Rvalue, Statement, StatementKind, Terminator, TerminatorKind,
116 };
117 use rustc_middle::ty::{self, Ty, TyCtxt};
118
119 // Empirical measurements have resulted in some observations:
120 // - Running on a body with a single block and 500 locals takes barely any time
121 // - Running on a body with ~400 blocks and ~300 relevant locals takes "too long"
122 // ...so we just limit both to somewhat reasonable-ish looking values.
123 const MAX_LOCALS: usize = 500;
124 const MAX_BLOCKS: usize = 250;
125
126 pub struct DestinationPropagation;
127
128 impl<'tcx> MirPass<'tcx> for DestinationPropagation {
129 fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
130 // FIXME(#79191, #82678)
131 if !tcx.sess.opts.debugging_opts.unsound_mir_opts {
132 return;
133 }
134
135 // Only run at mir-opt-level=3 or higher for now (we don't fix up debuginfo and remove
136 // storage statements at the moment).
137 if tcx.sess.mir_opt_level() < 3 {
138 return;
139 }
140
141 let def_id = body.source.def_id();
142
143 let candidates = find_candidates(tcx, body);
144 if candidates.is_empty() {
145 debug!("{:?}: no dest prop candidates, done", def_id);
146 return;
147 }
148
149 // Collect all locals we care about. We only compute conflicts for these to save time.
150 let mut relevant_locals = BitSet::new_empty(body.local_decls.len());
151 for CandidateAssignment { dest, src, loc: _ } in &candidates {
152 relevant_locals.insert(dest.local);
153 relevant_locals.insert(*src);
154 }
155
156 // This pass unfortunately has `O(l² * s)` performance, where `l` is the number of locals
157 // and `s` is the number of statements and terminators in the function.
158 // To prevent blowing up compile times too much, we bail out when there are too many locals.
159 let relevant = relevant_locals.count();
160 debug!(
161 "{:?}: {} locals ({} relevant), {} blocks",
162 def_id,
163 body.local_decls.len(),
164 relevant,
165 body.basic_blocks().len()
166 );
167 if relevant > MAX_LOCALS {
168 warn!(
169 "too many candidate locals in {:?} ({}, max is {}), not optimizing",
170 def_id, relevant, MAX_LOCALS
171 );
172 return;
173 }
174 if body.basic_blocks().len() > MAX_BLOCKS {
175 warn!(
176 "too many blocks in {:?} ({}, max is {}), not optimizing",
177 def_id,
178 body.basic_blocks().len(),
179 MAX_BLOCKS
180 );
181 return;
182 }
183
184 let mut conflicts = Conflicts::build(tcx, body, &relevant_locals);
185
186 let mut replacements = Replacements::new(body.local_decls.len());
187 for candidate @ CandidateAssignment { dest, src, loc } in candidates {
188 // Merge locals that don't conflict.
189 if !conflicts.can_unify(dest.local, src) {
190 debug!("at assignment {:?}, conflict {:?} vs. {:?}", loc, dest.local, src);
191 continue;
192 }
193
194 if replacements.for_src(candidate.src).is_some() {
195 debug!("src {:?} already has replacement", candidate.src);
196 continue;
197 }
198
199 if !tcx.consider_optimizing(|| {
200 format!("DestinationPropagation {:?} {:?}", def_id, candidate)
201 }) {
202 break;
203 }
204
205 replacements.push(candidate);
206 conflicts.unify(candidate.src, candidate.dest.local);
207 }
208
209 replacements.flatten(tcx);
210
211 debug!("replacements {:?}", replacements.map);
212
213 Replacer { tcx, replacements, place_elem_cache: Vec::new() }.visit_body(body);
214
215 // FIXME fix debug info
216 }
217 }
218
219 #[derive(Debug, Eq, PartialEq, Copy, Clone)]
220 struct UnifyLocal(Local);
221
222 impl From<Local> for UnifyLocal {
223 fn from(l: Local) -> Self {
224 Self(l)
225 }
226 }
227
228 impl UnifyKey for UnifyLocal {
229 type Value = ();
230 fn index(&self) -> u32 {
231 self.0.as_u32()
232 }
233 fn from_index(u: u32) -> Self {
234 Self(Local::from_u32(u))
235 }
236 fn tag() -> &'static str {
237 "UnifyLocal"
238 }
239 }
240
241 struct Replacements<'tcx> {
242 /// Maps locals to their replacement.
243 map: IndexVec<Local, Option<Place<'tcx>>>,
244
245 /// Locals whose live ranges we kill (their storage statements are removed).
246 kill: BitSet<Local>,
247 }
248
249 impl Replacements<'tcx> {
250 fn new(locals: usize) -> Self {
251 Self { map: IndexVec::from_elem_n(None, locals), kill: BitSet::new_empty(locals) }
252 }
253
254 fn push(&mut self, candidate: CandidateAssignment<'tcx>) {
255 trace!("Replacements::push({:?})", candidate);
256 let entry = &mut self.map[candidate.src];
257 assert!(entry.is_none());
258
259 *entry = Some(candidate.dest);
260 self.kill.insert(candidate.src);
261 self.kill.insert(candidate.dest.local);
262 }
263
264 /// Applies the stored replacements to the replacement map itself, until no stored replacement
265 /// has a base local that itself still has a replacement (i.e. replacement chains are flattened).
266 fn flatten(&mut self, tcx: TyCtxt<'tcx>) {
267 // Note: This assumes that there are no cycles in the replacements, which is enforced via
268 // `self.unified_locals`. Otherwise this can cause an infinite loop.
269
270 for local in self.map.indices() {
271 if let Some(replacement) = self.map[local] {
272 // Substitute the base local of `replacement` until fixpoint.
273 let mut base = replacement.local;
274 let mut reversed_projection_slices = Vec::with_capacity(1);
275 while let Some(replacement_for_replacement) = self.map[base] {
276 base = replacement_for_replacement.local;
277 reversed_projection_slices.push(replacement_for_replacement.projection);
278 }
279
280 let projection: Vec<_> = reversed_projection_slices
281 .iter()
282 .rev()
283 .flat_map(|projs| projs.iter())
284 .chain(replacement.projection.iter())
285 .collect();
286 let projection = tcx.intern_place_elems(&projection);
287
288 // Replace with the final `Place`.
289 self.map[local] = Some(Place { local: base, projection });
290 }
291 }
292 }
293
294 fn for_src(&self, src: Local) -> Option<Place<'tcx>> {
295 self.map[src]
296 }
297 }
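`flatten` can be pictured with a toy replacement map. This is illustrative only: projections are plain strings here, `tcx.intern_place_elems` is skipped, and (as in the real code) the map is assumed to be cycle-free.

```rust
// Toy version of `Replacements::flatten`: each local may be replaced by a
// (base_local, projection) pair; chase base locals until one has no
// replacement, accumulating the intermediate projections in order.
fn flatten(map: &mut Vec<Option<(usize, Vec<&'static str>)>>) {
    for local in 0..map.len() {
        if let Some((mut base, proj)) = map[local].clone() {
            let mut reversed_projection_slices = Vec::new();
            while let Some((next_base, next_proj)) = map[base].clone() {
                base = next_base;
                reversed_projection_slices.push(next_proj);
            }
            // Deepest base's projections come first, then the original ones.
            let mut projection: Vec<&'static str> =
                reversed_projection_slices.into_iter().rev().flatten().collect();
            projection.extend(proj);
            map[local] = Some((base, projection));
        }
    }
}

fn main() {
    // _1 -> _2.f and _2 -> _3.g, so _1 must end up as _3.g.f (not _2.f).
    let mut map = vec![None, Some((2, vec![".f"])), Some((3, vec![".g"])), None];
    flatten(&mut map);
    assert_eq!(map[1], Some((3, vec![".g", ".f"])));
    assert_eq!(map[2], Some((3, vec![".g"])));
    println!("{:?}", map);
}
```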
298
299 struct Replacer<'tcx> {
300 tcx: TyCtxt<'tcx>,
301 replacements: Replacements<'tcx>,
302 place_elem_cache: Vec<PlaceElem<'tcx>>,
303 }
304
305 impl<'tcx> MutVisitor<'tcx> for Replacer<'tcx> {
306 fn tcx<'a>(&'a self) -> TyCtxt<'tcx> {
307 self.tcx
308 }
309
310 fn visit_local(&mut self, local: &mut Local, context: PlaceContext, location: Location) {
311 if context.is_use() && self.replacements.for_src(*local).is_some() {
312 bug!(
313 "use of local {:?} should have been replaced by visit_place; context={:?}, loc={:?}",
314 local,
315 context,
316 location,
317 );
318 }
319 }
320
321 fn process_projection_elem(
322 &mut self,
323 elem: PlaceElem<'tcx>,
324 _: Location,
325 ) -> Option<PlaceElem<'tcx>> {
326 match elem {
327 PlaceElem::Index(local) => {
328 if let Some(replacement) = self.replacements.for_src(local) {
329 bug!(
330 "cannot replace {:?} with {:?} in index projection {:?}",
331 local,
332 replacement,
333 elem,
334 );
335 } else {
336 None
337 }
338 }
339 _ => None,
340 }
341 }
342
343 fn visit_place(&mut self, place: &mut Place<'tcx>, context: PlaceContext, location: Location) {
344 if let Some(replacement) = self.replacements.for_src(place.local) {
345 // Rebase `place`s projections onto `replacement`'s.
346 self.place_elem_cache.clear();
347 self.place_elem_cache.extend(replacement.projection.iter().chain(place.projection));
348 let projection = self.tcx.intern_place_elems(&self.place_elem_cache);
349 let new_place = Place { local: replacement.local, projection };
350
351 debug!("Replacer: {:?} -> {:?}", place, new_place);
352 *place = new_place;
353 }
354
355 self.super_place(place, context, location);
356 }
357
358 fn visit_statement(&mut self, statement: &mut Statement<'tcx>, location: Location) {
359 self.super_statement(statement, location);
360
361 match &statement.kind {
362 // FIXME: Don't delete storage statements, merge the live ranges instead
363 StatementKind::StorageDead(local) | StatementKind::StorageLive(local)
364 if self.replacements.kill.contains(*local) =>
365 {
366 statement.make_nop()
367 }
368
369 StatementKind::Assign(box (dest, rvalue)) => {
370 match rvalue {
371 Rvalue::Use(Operand::Copy(place) | Operand::Move(place)) => {
372 // These might've been turned into self-assignments by the replacement
373 // (this includes the original statement we wanted to eliminate).
374 if dest == place {
375 debug!("{:?} turned into self-assignment, deleting", location);
376 statement.make_nop();
377 }
378 }
379 _ => {}
380 }
381 }
382
383 _ => {}
384 }
385 }
386 }
387
388 struct Conflicts<'a> {
389 relevant_locals: &'a BitSet<Local>,
390
391 /// The conflict matrix. It is always symmetric and the adjacency matrix of the corresponding
392 /// conflict graph.
393 matrix: BitMatrix<Local, Local>,
394
395 /// Preallocated `BitSet` used by `unify`.
396 unify_cache: BitSet<Local>,
397
398 /// Tracks locals that have been merged together to prevent cycles and propagate conflicts.
399 unified_locals: InPlaceUnificationTable<UnifyLocal>,
400 }
401
402 impl Conflicts<'a> {
403 fn build<'tcx>(
404 tcx: TyCtxt<'tcx>,
405 body: &'_ Body<'tcx>,
406 relevant_locals: &'a BitSet<Local>,
407 ) -> Self {
408 // We don't have to look out for locals that have their address taken, since
409 // `find_candidates` already takes care of that.
410
411 let conflicts = BitMatrix::from_row_n(
412 &BitSet::new_empty(body.local_decls.len()),
413 body.local_decls.len(),
414 );
415
416 let mut init = MaybeInitializedLocals
417 .into_engine(tcx, body)
418 .iterate_to_fixpoint()
419 .into_results_cursor(body);
420 let mut live =
421 MaybeLiveLocals.into_engine(tcx, body).iterate_to_fixpoint().into_results_cursor(body);
422
423 let mut reachable = None;
424 dump_mir(tcx, None, "DestinationPropagation-dataflow", &"", body, |pass_where, w| {
425 let reachable = reachable.get_or_insert_with(|| traversal::reachable_as_bitset(body));
426
427 match pass_where {
428 PassWhere::BeforeLocation(loc) if reachable.contains(loc.block) => {
429 init.seek_before_primary_effect(loc);
430 live.seek_after_primary_effect(loc);
431
432 writeln!(w, " // init: {:?}", init.get())?;
433 writeln!(w, " // live: {:?}", live.get())?;
434 }
435 PassWhere::AfterTerminator(bb) if reachable.contains(bb) => {
436 let loc = body.terminator_loc(bb);
437 init.seek_after_primary_effect(loc);
438 live.seek_before_primary_effect(loc);
439
440 writeln!(w, " // init: {:?}", init.get())?;
441 writeln!(w, " // live: {:?}", live.get())?;
442 }
443
444 PassWhere::BeforeBlock(bb) if reachable.contains(bb) => {
445 init.seek_to_block_start(bb);
446 live.seek_to_block_start(bb);
447
448 writeln!(w, " // init: {:?}", init.get())?;
449 writeln!(w, " // live: {:?}", live.get())?;
450 }
451
452 PassWhere::BeforeCFG | PassWhere::AfterCFG | PassWhere::AfterLocation(_) => {}
453
454 PassWhere::BeforeLocation(_) | PassWhere::AfterTerminator(_) => {
455 writeln!(w, " // init: <unreachable>")?;
456 writeln!(w, " // live: <unreachable>")?;
457 }
458
459 PassWhere::BeforeBlock(_) => {
460 writeln!(w, " // init: <unreachable>")?;
461 writeln!(w, " // live: <unreachable>")?;
462 }
463 }
464
465 Ok(())
466 });
467
468 let mut this = Self {
469 relevant_locals,
470 matrix: conflicts,
471 unify_cache: BitSet::new_empty(body.local_decls.len()),
472 unified_locals: {
473 let mut table = InPlaceUnificationTable::new();
474 // Pre-fill table with all locals (this creates N nodes / "connected" components,
475 // "graph"-ically speaking).
476 for local in 0..body.local_decls.len() {
477 assert_eq!(table.new_key(()), UnifyLocal(Local::from_usize(local)));
478 }
479 table
480 },
481 };
482
483 let mut live_and_init_locals = Vec::new();
484
485 // Visit only reachable basic blocks. The exact order is not important.
486 for (block, data) in traversal::preorder(body) {
487 // We need to observe the dataflow state *before* all possible locations (statement or
488 // terminator) in each basic block, and then observe the state *after* the terminator
489             // effect is applied. As long as neither `init` nor `live` has a "before" effect,
490 // we will observe all possible dataflow states.
491
492 // Since liveness is a backwards analysis, we need to walk the results backwards. To do
493             // that, we first collect the `MaybeInitializedLocals` results in a forwards
494 // traversal.
495
496 live_and_init_locals.resize_with(data.statements.len() + 1, || {
497 BitSet::new_empty(body.local_decls.len())
498 });
499
500 // First, go forwards for `MaybeInitializedLocals` and apply intra-statement/terminator
501 // conflicts.
502 for (i, statement) in data.statements.iter().enumerate() {
503 this.record_statement_conflicts(statement);
504
505 let loc = Location { block, statement_index: i };
506 init.seek_before_primary_effect(loc);
507
508 live_and_init_locals[i].clone_from(init.get());
509 }
510
511 this.record_terminator_conflicts(data.terminator());
512 let term_loc = Location { block, statement_index: data.statements.len() };
513 init.seek_before_primary_effect(term_loc);
514 live_and_init_locals[term_loc.statement_index].clone_from(init.get());
515
516             // Now, go backwards and intersect with the liveness results.
517 for statement_index in (0..=data.statements.len()).rev() {
518 let loc = Location { block, statement_index };
519 live.seek_after_primary_effect(loc);
520
521 live_and_init_locals[statement_index].intersect(live.get());
522
523 trace!("record conflicts at {:?}", loc);
524
525 this.record_dataflow_conflicts(&mut live_and_init_locals[statement_index]);
526 }
527
528 init.seek_to_block_end(block);
529 live.seek_to_block_end(block);
530 let mut conflicts = init.get().clone();
531 conflicts.intersect(live.get());
532 trace!("record conflicts at end of {:?}", block);
533
534 this.record_dataflow_conflicts(&mut conflicts);
535 }
536
537 this
538 }
539
540 fn record_dataflow_conflicts(&mut self, new_conflicts: &mut BitSet<Local>) {
541 // Remove all locals that are not candidates.
542 new_conflicts.intersect(self.relevant_locals);
543
544 for local in new_conflicts.iter() {
545 self.matrix.union_row_with(&new_conflicts, local);
546 }
547 }
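The row-union trick above can be sketched in miniature (assumed toy types: a `BTreeSet` stands in for one `BitMatrix` row). Unioning the whole live set into the row of each of its members records all pairwise conflicts with one row union per member instead of one insertion per pair.

```rust
use std::collections::BTreeSet;

// Toy `record_dataflow_conflicts`: mark every pair of simultaneously
// live-and-initialized locals as conflicting.
fn record_dataflow_conflicts(matrix: &mut [BTreeSet<usize>], live: &BTreeSet<usize>) {
    for &local in live {
        // Corresponds to `union_row_with`. Each row also gains its own local
        // (the diagonal), mirroring the real code; this is harmless because
        // `can_unify` rejects `a == b` before consulting the matrix.
        matrix[local].extend(live);
    }
}

fn main() {
    let mut matrix = vec![BTreeSet::new(); 4];
    let live: BTreeSet<usize> = [0, 2, 3].iter().copied().collect();
    record_dataflow_conflicts(&mut matrix, &live);

    // All pairs among {0, 2, 3} now conflict symmetrically; 1 is untouched.
    assert!(matrix[0].contains(&2) && matrix[2].contains(&0));
    assert!(matrix[2].contains(&3) && matrix[3].contains(&2));
    assert!(matrix[1].is_empty());
    println!("ok");
}
```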
548
549 fn record_local_conflict(&mut self, a: Local, b: Local, why: &str) {
550 trace!("conflict {:?} <-> {:?} due to {}", a, b, why);
551 self.matrix.insert(a, b);
552 self.matrix.insert(b, a);
553 }
554
555 /// Records locals that must not overlap during the evaluation of `stmt`. These locals conflict
556 /// and must not be merged.
557 fn record_statement_conflicts(&mut self, stmt: &Statement<'_>) {
558 match &stmt.kind {
559 // While the left and right sides of an assignment must not overlap, we do not mark
560 // conflicts here as that would make this optimization useless. When we optimize, we
561 // eliminate the resulting self-assignments automatically.
562 StatementKind::Assign(_) => {}
563
564 StatementKind::LlvmInlineAsm(asm) => {
565 // Inputs and outputs must not overlap.
566 for (_, input) in &*asm.inputs {
567 if let Some(in_place) = input.place() {
568 if !in_place.is_indirect() {
569 for out_place in &*asm.outputs {
570 if !out_place.is_indirect() && !in_place.is_indirect() {
571 self.record_local_conflict(
572 in_place.local,
573 out_place.local,
574 "aliasing llvm_asm! operands",
575 );
576 }
577 }
578 }
579 }
580 }
581 }
582
583 StatementKind::SetDiscriminant { .. }
584 | StatementKind::StorageLive(..)
585 | StatementKind::StorageDead(..)
586 | StatementKind::Retag(..)
587 | StatementKind::FakeRead(..)
588 | StatementKind::AscribeUserType(..)
589 | StatementKind::Coverage(..)
590 | StatementKind::CopyNonOverlapping(..)
591 | StatementKind::Nop => {}
592 }
593 }
594
595 fn record_terminator_conflicts(&mut self, term: &Terminator<'_>) {
596 match &term.kind {
597 TerminatorKind::DropAndReplace {
598 place: dropped_place,
599 value,
600 target: _,
601 unwind: _,
602 } => {
603 if let Some(place) = value.place() {
604 if !place.is_indirect() && !dropped_place.is_indirect() {
605 self.record_local_conflict(
606 place.local,
607 dropped_place.local,
608 "DropAndReplace operand overlap",
609 );
610 }
611 }
612 }
613 TerminatorKind::Yield { value, resume: _, resume_arg, drop: _ } => {
614 if let Some(place) = value.place() {
615 if !place.is_indirect() && !resume_arg.is_indirect() {
616 self.record_local_conflict(
617 place.local,
618 resume_arg.local,
619 "Yield operand overlap",
620 );
621 }
622 }
623 }
624 TerminatorKind::Call {
625 func,
626 args,
627 destination: Some((dest_place, _)),
628 cleanup: _,
629 from_hir_call: _,
630 fn_span: _,
631 } => {
632 // No arguments may overlap with the destination.
633 for arg in args.iter().chain(Some(func)) {
634 if let Some(place) = arg.place() {
635 if !place.is_indirect() && !dest_place.is_indirect() {
636 self.record_local_conflict(
637 dest_place.local,
638 place.local,
639 "call dest/arg overlap",
640 );
641 }
642 }
643 }
644 }
645 TerminatorKind::InlineAsm {
646 template: _,
647 operands,
648 options: _,
649 line_spans: _,
650 destination: _,
651 } => {
652                 // The intended semantics here aren't documented; we just assume that nothing that
653 // could be written to by the assembly may overlap with any other operands.
654 for op in operands {
655 match op {
656 InlineAsmOperand::Out { reg: _, late: _, place: Some(dest_place) }
657 | InlineAsmOperand::InOut {
658 reg: _,
659 late: _,
660 in_value: _,
661 out_place: Some(dest_place),
662 } => {
663 // For output place `place`, add all places accessed by the inline asm.
664 for op in operands {
665 match op {
666 InlineAsmOperand::In { reg: _, value } => {
667 if let Some(p) = value.place() {
668 if !p.is_indirect() && !dest_place.is_indirect() {
669 self.record_local_conflict(
670 p.local,
671 dest_place.local,
672 "asm! operand overlap",
673 );
674 }
675 }
676 }
677 InlineAsmOperand::Out {
678 reg: _,
679 late: _,
680 place: Some(place),
681 } => {
682 if !place.is_indirect() && !dest_place.is_indirect() {
683 self.record_local_conflict(
684 place.local,
685 dest_place.local,
686 "asm! operand overlap",
687 );
688 }
689 }
690 InlineAsmOperand::InOut {
691 reg: _,
692 late: _,
693 in_value,
694 out_place,
695 } => {
696 if let Some(place) = in_value.place() {
697 if !place.is_indirect() && !dest_place.is_indirect() {
698 self.record_local_conflict(
699 place.local,
700 dest_place.local,
701 "asm! operand overlap",
702 );
703 }
704 }
705
706 if let Some(place) = out_place {
707 if !place.is_indirect() && !dest_place.is_indirect() {
708 self.record_local_conflict(
709 place.local,
710 dest_place.local,
711 "asm! operand overlap",
712 );
713 }
714 }
715 }
716 InlineAsmOperand::Out { reg: _, late: _, place: None }
717 | InlineAsmOperand::Const { value: _ }
718 | InlineAsmOperand::SymFn { value: _ }
719 | InlineAsmOperand::SymStatic { def_id: _ } => {}
720 }
721 }
722 }
723 InlineAsmOperand::Const { value } => {
724 assert!(value.place().is_none());
725 }
726 InlineAsmOperand::InOut {
727 reg: _,
728 late: _,
729 in_value: _,
730 out_place: None,
731 }
732 | InlineAsmOperand::In { reg: _, value: _ }
733 | InlineAsmOperand::Out { reg: _, late: _, place: None }
734 | InlineAsmOperand::SymFn { value: _ }
735 | InlineAsmOperand::SymStatic { def_id: _ } => {}
736 }
737 }
738 }
739
740 TerminatorKind::Goto { .. }
741 | TerminatorKind::Call { destination: None, .. }
742 | TerminatorKind::SwitchInt { .. }
743 | TerminatorKind::Resume
744 | TerminatorKind::Abort
745 | TerminatorKind::Return
746 | TerminatorKind::Unreachable
747 | TerminatorKind::Drop { .. }
748 | TerminatorKind::Assert { .. }
749 | TerminatorKind::GeneratorDrop
750 | TerminatorKind::FalseEdge { .. }
751 | TerminatorKind::FalseUnwind { .. } => {}
752 }
753 }
754
755 /// Checks whether `a` and `b` may be merged. Returns `false` if there's a conflict.
756 fn can_unify(&mut self, a: Local, b: Local) -> bool {
757 // After some locals have been unified, their conflicts are only tracked in the root key,
758 // so look that up.
759 let a = self.unified_locals.find(a).0;
760 let b = self.unified_locals.find(b).0;
761
762 if a == b {
763 // Already merged (part of the same connected component).
764 return false;
765 }
766
767 if self.matrix.contains(a, b) {
768 // Conflict (derived via dataflow, intra-statement conflicts, or inherited from another
769 // local during unification).
770 return false;
771 }
772
773 true
774 }
775
776 /// Merges the conflicts of `a` and `b`, so that each one inherits all conflicts of the other.
777 ///
778 /// `can_unify` must have returned `true` for the same locals, or this may panic or lead to
779 /// miscompiles.
780 ///
781 /// This is called when the pass makes the decision to unify `a` and `b` (or parts of `a` and
782 /// `b`) and is needed to ensure that future unification decisions take potentially newly
783 /// introduced conflicts into account.
784 ///
785 /// For an example, assume we have locals `_0`, `_1`, `_2`, and `_3`. There are these conflicts:
786 ///
787 /// * `_0` <-> `_1`
788 /// * `_1` <-> `_2`
789 /// * `_3` <-> `_0`
790 ///
791 /// We then decide to merge `_2` with `_3` since they don't conflict. Then we decide to merge
792     /// `_2` with `_0`, which also has no conflict in the above list. However, `_2` has been unified
793     /// with `_3`, which does conflict with `_0`.
794 fn unify(&mut self, a: Local, b: Local) {
795 trace!("unify({:?}, {:?})", a, b);
796
797 // Get the root local of the connected components. The root local stores the conflicts of
798 // all locals in the connected component (and *is stored* as the conflicting local of other
799 // locals).
800 let a = self.unified_locals.find(a).0;
801 let b = self.unified_locals.find(b).0;
802 assert_ne!(a, b);
803
804 trace!("roots: a={:?}, b={:?}", a, b);
805 trace!("{:?} conflicts: {:?}", a, self.matrix.iter(a).format(", "));
806 trace!("{:?} conflicts: {:?}", b, self.matrix.iter(b).format(", "));
807
808 self.unified_locals.union(a, b);
809
810 let root = self.unified_locals.find(a).0;
811 assert!(root == a || root == b);
812
813 // Make all locals that conflict with `a` also conflict with `b`, and vice versa.
814 self.unify_cache.clear();
815 for conflicts_with_a in self.matrix.iter(a) {
816 self.unify_cache.insert(conflicts_with_a);
817 }
818 for conflicts_with_b in self.matrix.iter(b) {
819 self.unify_cache.insert(conflicts_with_b);
820 }
821 for conflicts_with_a_or_b in self.unify_cache.iter() {
822 // Set both `a` and `b` for this local's row.
823 self.matrix.insert(conflicts_with_a_or_b, a);
824 self.matrix.insert(conflicts_with_a_or_b, b);
825 }
826
827 // Write the locals `a` conflicts with to `b`'s row.
828 self.matrix.union_rows(a, b);
829 // Write the locals `b` conflicts with to `a`'s row.
830 self.matrix.union_rows(b, a);
831 }
832 }
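The pitfall documented on `unify` (merging `_2` with `_3`, then discovering that `_2`'s component now conflicts with `_0`) can be reproduced with a freestanding toy: a minimal union-find plus a symmetric conflict set, standing in for `InPlaceUnificationTable` and the `BitMatrix`. This is illustrative only, not the real data structures.

```rust
use std::collections::HashSet;

// Hypothetical mini `Conflicts`: union-find with path halving, plus a
// symmetric conflict relation that merged components inherit from each other.
struct MiniConflicts {
    parent: Vec<usize>,
    conflicts: HashSet<(usize, usize)>, // stored in both directions
}

impl MiniConflicts {
    fn new(n: usize) -> Self {
        Self { parent: (0..n).collect(), conflicts: HashSet::new() }
    }
    fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            self.parent[x] = self.parent[self.parent[x]]; // path halving
            x = self.parent[x];
        }
        x
    }
    fn record_conflict(&mut self, a: usize, b: usize) {
        self.conflicts.insert((a, b));
        self.conflicts.insert((b, a));
    }
    fn can_unify(&mut self, a: usize, b: usize) -> bool {
        let (a, b) = (self.find(a), self.find(b));
        a != b && !self.conflicts.contains(&(a, b))
    }
    fn unify(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        assert_ne!(ra, rb);
        // Each root inherits the other's conflicts.
        let merged: Vec<usize> = self
            .conflicts
            .iter()
            .filter(|(x, _)| *x == ra || *x == rb)
            .map(|&(_, y)| y)
            .collect();
        for y in merged {
            self.record_conflict(ra, y);
            self.record_conflict(rb, y);
        }
        self.parent[rb] = ra;
    }
}

fn main() {
    // The doc-comment example: _0 <-> _1, _1 <-> _2, _3 <-> _0.
    let mut c = MiniConflicts::new(4);
    c.record_conflict(0, 1);
    c.record_conflict(1, 2);
    c.record_conflict(3, 0);
    assert!(c.can_unify(2, 3));
    c.unify(2, 3);
    // `_2`'s component inherited `_3`'s conflict with `_0`:
    assert!(!c.can_unify(2, 0));
    println!("ok");
}
```

Without the inheritance step in `unify`, the second `can_unify` check would wrongly succeed and the pass could merge two locals that conflict.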
833
834 /// A `dest = {move} src;` statement at `loc`.
835 ///
836 /// We want to consider merging `dest` and `src` due to this assignment.
837 #[derive(Debug, Copy, Clone)]
838 struct CandidateAssignment<'tcx> {
839 /// Does not contain indirection or indexing (so the only local it contains is the place base).
840 dest: Place<'tcx>,
841 src: Local,
842 loc: Location,
843 }
844
845 /// Scans the MIR for assignments between locals that we might want to consider merging.
846 ///
847 /// This will filter out assignments that do not match the right form (as described in the top-level
848 /// comment) and also throw out assignments that involve a local that has its address taken or is
849 /// otherwise ineligible (e.g. locals used as array indices are ignored because we cannot propagate
850 /// arbitrary places into array indices).
851 fn find_candidates<'a, 'tcx>(
852 tcx: TyCtxt<'tcx>,
853 body: &'a Body<'tcx>,
854 ) -> Vec<CandidateAssignment<'tcx>> {
855 let mut visitor = FindAssignments {
856 tcx,
857 body,
858 candidates: Vec::new(),
859 ever_borrowed_locals: ever_borrowed_locals(body),
860 locals_used_as_array_index: locals_used_as_array_index(body),
861 };
862 visitor.visit_body(body);
863 visitor.candidates
864 }
865
866 struct FindAssignments<'a, 'tcx> {
867 tcx: TyCtxt<'tcx>,
868 body: &'a Body<'tcx>,
869 candidates: Vec<CandidateAssignment<'tcx>>,
870 ever_borrowed_locals: BitSet<Local>,
871 locals_used_as_array_index: BitSet<Local>,
872 }
873
874 impl<'a, 'tcx> Visitor<'tcx> for FindAssignments<'a, 'tcx> {
875 fn visit_statement(&mut self, statement: &Statement<'tcx>, location: Location) {
876 if let StatementKind::Assign(box (
877 dest,
878 Rvalue::Use(Operand::Copy(src) | Operand::Move(src)),
879 )) = &statement.kind
880 {
881 // `dest` must not have pointer indirection.
882 if dest.is_indirect() {
883 return;
884 }
885
886 // `src` must be a plain local.
887 if !src.projection.is_empty() {
888 return;
889 }
890
891 // Since we want to replace `src` with `dest`, `src` must not be required.
892 if is_local_required(src.local, self.body) {
893 return;
894 }
895
896             // Can't optimize if either local ever has its address taken (this can introduce
897             // aliasing).
898 // FIXME: This can be smarter and take `StorageDead` into account (which
899 // invalidates borrows).
900 if self.ever_borrowed_locals.contains(dest.local)
901 || self.ever_borrowed_locals.contains(src.local)
902 {
903 return;
904 }
905
906 assert_ne!(dest.local, src.local, "self-assignments are UB");
907
908 // We can't replace locals occurring in `PlaceElem::Index` for now.
909 if self.locals_used_as_array_index.contains(src.local) {
910 return;
911 }
912
913 // Handle the "subtle case" described above by rejecting any `dest` that is or
914 // projects through a union.
915 let is_union = |ty: Ty<'_>| {
916 if let ty::Adt(def, _) = ty.kind() {
917 if def.is_union() {
918 return true;
919 }
920 }
921
922 false
923 };
924 let mut place_ty = PlaceTy::from_ty(self.body.local_decls[dest.local].ty);
925 if is_union(place_ty.ty) {
926 return;
927 }
928 for elem in dest.projection {
929 if let PlaceElem::Index(_) = elem {
930 // `dest` contains an indexing projection.
931 return;
932 }
933
934 place_ty = place_ty.projection_ty(self.tcx, elem);
935 if is_union(place_ty.ty) {
936 return;
937 }
938 }
939
940 self.candidates.push(CandidateAssignment {
941 dest: *dest,
942 src: src.local,
943 loc: location,
944 });
945 }
946 }
947 }
948
949 /// Some locals are part of the function's interface and can not be removed.
950 ///
951 /// Note that these locals *can* still be merged with non-required locals by removing that other
952 /// local.
953 fn is_local_required(local: Local, body: &Body<'_>) -> bool {
954 match body.local_kind(local) {
955 LocalKind::Arg | LocalKind::ReturnPointer => true,
956 LocalKind::Var | LocalKind::Temp => false,
957 }
958 }
959
960 /// Walks MIR to find all locals that have their address taken anywhere.
961 fn ever_borrowed_locals(body: &Body<'_>) -> BitSet<Local> {
962 let mut visitor = BorrowCollector { locals: BitSet::new_empty(body.local_decls.len()) };
963 visitor.visit_body(body);
964 visitor.locals
965 }
966
967 struct BorrowCollector {
968 locals: BitSet<Local>,
969 }
970
971 impl<'tcx> Visitor<'tcx> for BorrowCollector {
972 fn visit_rvalue(&mut self, rvalue: &Rvalue<'tcx>, location: Location) {
973 self.super_rvalue(rvalue, location);
974
975 match rvalue {
976 Rvalue::AddressOf(_, borrowed_place) | Rvalue::Ref(_, _, borrowed_place) => {
977 if !borrowed_place.is_indirect() {
978 self.locals.insert(borrowed_place.local);
979 }
980 }
981
982 Rvalue::Cast(..)
983 | Rvalue::Use(..)
984 | Rvalue::Repeat(..)
985 | Rvalue::Len(..)
986 | Rvalue::BinaryOp(..)
987 | Rvalue::CheckedBinaryOp(..)
988 | Rvalue::NullaryOp(..)
989 | Rvalue::UnaryOp(..)
990 | Rvalue::Discriminant(..)
991 | Rvalue::Aggregate(..)
992 | Rvalue::ThreadLocalRef(..) => {}
993 }
994 }
995
996 fn visit_terminator(&mut self, terminator: &Terminator<'tcx>, location: Location) {
997 self.super_terminator(terminator, location);
998
999 match terminator.kind {
1000 TerminatorKind::Drop { place: dropped_place, .. }
1001 | TerminatorKind::DropAndReplace { place: dropped_place, .. } => {
1002 self.locals.insert(dropped_place.local);
1003 }
1004
1005 TerminatorKind::Abort
1006 | TerminatorKind::Assert { .. }
1007 | TerminatorKind::Call { .. }
1008 | TerminatorKind::FalseEdge { .. }
1009 | TerminatorKind::FalseUnwind { .. }
1010 | TerminatorKind::GeneratorDrop
1011 | TerminatorKind::Goto { .. }
1012 | TerminatorKind::Resume
1013 | TerminatorKind::Return
1014 | TerminatorKind::SwitchInt { .. }
1015 | TerminatorKind::Unreachable
1016 | TerminatorKind::Yield { .. }
1017 | TerminatorKind::InlineAsm { .. } => {}
1018 }
1019 }
1020 }
1021
1022 /// `PlaceElem::Index` only stores a `Local`, so we can't replace that with a full `Place`.
1023 ///
1024 /// Collect locals used as indices so we don't generate candidates that are impossible to apply
1025 /// later.
1026 fn locals_used_as_array_index(body: &Body<'_>) -> BitSet<Local> {
1027 let mut visitor = IndexCollector { locals: BitSet::new_empty(body.local_decls.len()) };
1028 visitor.visit_body(body);
1029 visitor.locals
1030 }
1031
1032 struct IndexCollector {
1033 locals: BitSet<Local>,
1034 }
1035
1036 impl<'tcx> Visitor<'tcx> for IndexCollector {
1037 fn visit_projection_elem(
1038 &mut self,
1039 local: Local,
1040 proj_base: &[PlaceElem<'tcx>],
1041 elem: PlaceElem<'tcx>,
1042 context: PlaceContext,
1043 location: Location,
1044 ) {
1045 if let PlaceElem::Index(i) = elem {
1046 self.locals.insert(i);
1047 }
1048 self.super_projection_elem(local, proj_base, elem, context, location);
1049 }
1050 }