The *balancer* can optimize the placement of PGs across OSDs in
order to achieve a balanced distribution, either automatically or in a
supervised fashion.

Status
------

The current status of the balancer can be checked at any time with::

  ceph balancer status

Automatic balancing
-------------------

The automatic balancing feature is enabled by default in ``upmap``
mode. Please refer to :ref:`upmap` for more details. The balancer can be
turned off with::

  ceph balancer off

The balancer mode can be changed to ``crush-compat`` mode, which is
backward compatible with older clients, and will make small changes to
the data distribution over time to ensure that OSDs are equally utilized.

No adjustments will be made to the PG distribution if the cluster is
degraded (e.g., because an OSD has failed and the system has not yet
healed itself).

When the cluster is healthy, the balancer will throttle its changes
such that the percentage of PGs that are misplaced (i.e., that need to
be moved) is below a threshold of (by default) 5%. The
``target_max_misplaced_ratio`` threshold can be adjusted with::

  ceph config set mgr target_max_misplaced_ratio .07   # 7%

Set the number of seconds to sleep in between runs of the automatic balancer::

  ceph config set mgr mgr/balancer/sleep_interval 60

Set the time of day to begin automatic balancing in HHMM format::

  ceph config set mgr mgr/balancer/begin_time 0000

Set the time of day to finish automatic balancing in HHMM format::

  ceph config set mgr mgr/balancer/end_time 2400

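
For example, to confine automatic balancing to a quiet overnight window (the
specific times here are illustrative, not a recommendation)::

  ceph config set mgr mgr/balancer/begin_time 0100
  ceph config set mgr mgr/balancer/end_time 0500
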
Restrict automatic balancing to this day of the week or later.
Uses the same conventions as crontab, 0 or 7 is Sunday, 1 is Monday, and so on::

  ceph config set mgr mgr/balancer/begin_weekday 0

Restrict automatic balancing to this day of the week or earlier.
Uses the same conventions as crontab, 0 or 7 is Sunday, 1 is Monday, and so on::

  ceph config set mgr mgr/balancer/end_weekday 7

Pool IDs to which the automatic balancing will be limited.
The default for this is an empty string, meaning all pools will be balanced.
The numeric pool IDs can be obtained with the :command:`ceph osd pool ls detail` command::

  ceph config set mgr mgr/balancer/pool_ids 1,2,3

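
For example, to limit balancing to two specific pools, first look up their
numeric IDs and then set the option (the pool names and IDs below are
hypothetical)::

  ceph osd pool ls detail
  # e.g. the output might include lines such as:
  #   pool 1 'rbd' replicated size 3 ...
  #   pool 5 'cephfs_data' replicated size 3 ...
  ceph config set mgr mgr/balancer/pool_ids 1,5
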
Modes
-----

There are currently two supported balancer modes:

#. **crush-compat**. The CRUSH compat mode uses the compat weight-set
   feature (introduced in Luminous) to manage an alternative set of
   weights for devices in the CRUSH hierarchy. The normal weights
   should remain set to the size of the device to reflect the target
   amount of data that we want to store on the device. The balancer
   then optimizes the weight-set values, adjusting them up or down in
   small increments, in order to achieve a distribution that matches
   the target distribution as closely as possible. (Because PG
   placement is a pseudorandom process, there is a natural amount of
   variation in the placement; by optimizing the weights we
   counteract that natural variation.)

   Notably, this mode is *fully backwards compatible* with older
   clients: when an OSDMap and CRUSH map are shared with older clients,
   we present the optimized weights as the "real" weights.

   The primary restriction of this mode is that the balancer cannot
   handle multiple CRUSH hierarchies with different placement rules if
   the subtrees of the hierarchy share any OSDs. (This is normally
   not the case, and is generally not a recommended configuration
   because it is hard to manage the space utilization on the shared
   OSDs.)

#. **upmap**. Starting with Luminous, the OSDMap can store explicit
   mappings for individual OSDs as exceptions to the normal CRUSH
   placement calculation. These ``upmap`` entries provide fine-grained
   control over the PG mapping. This mode will optimize the
   placement of individual PGs in order to achieve a balanced
   distribution. In most cases, this distribution is "perfect", with
   an equal number of PGs on each OSD (+/-1 PG, since they might not
   divide evenly).

   Note that using upmap requires that all clients be Luminous or newer.

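
Before relying on ``upmap`` mode, you can check which client releases are
currently connected and, optionally, have the cluster refuse connections from
pre-Luminous clients (both commands are part of the standard Ceph CLI)::

  ceph features
  ceph osd set-require-min-compat-client luminous
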
The default mode is ``upmap``. The mode can be adjusted with::

  ceph balancer mode crush-compat

Supervised optimization
-----------------------

The balancer operation is broken into a few distinct phases:

#. building a *plan*
#. evaluating the quality of the data distribution, either for the
   current PG distribution, or the PG distribution that would result
   after executing a *plan*
#. executing the *plan*

To evaluate and score the current distribution::

  ceph balancer eval

You can also evaluate the distribution for a single pool with::

  ceph balancer eval <pool-name>

Greater detail for the evaluation can be seen with::

  ceph balancer eval-verbose ...

The balancer can generate a plan, using the currently configured mode, with::

  ceph balancer optimize <plan-name>

The name is provided by the user and can be any useful identifying string. The contents of a plan can be seen with::

  ceph balancer show <plan-name>

All plans can be shown with::

  ceph balancer ls

Old plans can be discarded with::

  ceph balancer rm <plan-name>

Currently recorded plans are shown as part of the status command::

  ceph balancer status

The quality of the distribution that would result after executing a plan can be calculated with::

  ceph balancer eval <plan-name>

Assuming the plan is expected to improve the distribution (i.e., it has a lower score than the current cluster state), the user can execute that plan with::

  ceph balancer execute <plan-name>
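
Putting the phases together, a typical supervised session might look like the
following (the plan name ``myplan`` is illustrative; lower scores indicate a
better distribution)::

  ceph balancer eval                # score the current distribution
  ceph balancer optimize myplan     # build a plan using the current mode
  ceph balancer show myplan         # inspect the proposed changes
  ceph balancer eval myplan         # score the distribution the plan would produce
  ceph balancer execute myplan      # apply the plan if the score improved
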