.. _balancer:

Balancer Module
===============

The *balancer* can optimize the allocation of placement groups (PGs) across
OSDs in order to achieve a balanced distribution. The balancer can operate
either automatically or in a supervised fashion.


Status
------

To check the current status of the balancer, run the following command:

.. prompt:: bash $

   ceph balancer status

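
The status is reported as a small JSON object. On a healthy cluster running in
the default mode, it may look roughly like the following (the fields, wording,
and values shown here are illustrative and vary by release and cluster state):

::

    {
        "active": true,
        "last_optimize_duration": "0:00:00.001174",
        "mode": "upmap",
        "optimize_result": "Unable to find further optimization, ...",
        "plans": []
    }
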

Automatic balancing
-------------------

When the balancer is in ``upmap`` mode, the automatic balancing feature is
enabled by default. For more details, see :ref:`upmap`. To disable the
balancer, run the following command:

.. prompt:: bash $

   ceph balancer off

The balancer mode can be changed from ``upmap`` to ``crush-compat``, which is
backward compatible with older clients. In ``crush-compat`` mode, the balancer
automatically makes small changes to the data distribution in order to ensure
that OSDs are utilized equally.

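
If the balancer has been turned off, automatic balancing can be re-enabled at
any time, and the status command confirms whether the module is active. For
example:

.. prompt:: bash $

   ceph balancer on
   ceph balancer status
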

Throttling
----------

If the cluster is degraded (that is, if an OSD has failed and the system hasn't
healed itself yet), then the balancer will not make any adjustments to the PG
distribution.

When the cluster is healthy, the balancer will incrementally move a small
fraction of unbalanced PGs in order to improve distribution. This fraction is
capped by the ``target_max_misplaced_ratio`` threshold, which defaults to 5%.
To adjust this threshold, run the following command:

.. prompt:: bash $

   ceph config set mgr target_max_misplaced_ratio .07   # 7%

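
To check the value currently in effect, the general-purpose ``ceph config get``
command can be used; this reads the same setting that the command above writes:

.. prompt:: bash $

   ceph config get mgr target_max_misplaced_ratio
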
The balancer sleeps between runs. To set the number of seconds of this sleep
interval, run the following command:

.. prompt:: bash $

   ceph config set mgr mgr/balancer/sleep_interval 60

To set the time of day (in HHMM format) at which automatic balancing begins,
run the following command:

.. prompt:: bash $

   ceph config set mgr mgr/balancer/begin_time 0000

To set the time of day (in HHMM format) at which automatic balancing ends, run
the following command:

.. prompt:: bash $

   ceph config set mgr mgr/balancer/end_time 2359

Automatic balancing can be restricted to certain days of the week. To restrict
it to a specific day of the week or later (as with crontab, ``0`` is Sunday,
``1`` is Monday, and so on), run the following command:

.. prompt:: bash $

   ceph config set mgr mgr/balancer/begin_weekday 0

To restrict automatic balancing to a specific day of the week or earlier
(again, ``0`` is Sunday, ``1`` is Monday, and so on), run the following
command:

.. prompt:: bash $

   ceph config set mgr mgr/balancer/end_weekday 6

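
Taken together, these settings define a window in which automatic balancing is
allowed to run. For example, to confine balancing to the early morning hours on
weekdays, commands along the following lines could be used (the values are
illustrative, and the weekday bounds follow the semantics described above):

.. prompt:: bash $

   ceph config set mgr mgr/balancer/begin_time 0100
   ceph config set mgr mgr/balancer/end_time 0500
   ceph config set mgr mgr/balancer/begin_weekday 1
   ceph config set mgr mgr/balancer/end_weekday 5
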
Automatic balancing can be restricted to certain pools. By default, the value
of this setting is an empty string, so that all pools are automatically
balanced. To restrict automatic balancing to specific pools, retrieve their
numeric pool IDs (by running the :command:`ceph osd pool ls detail` command),
and then run the following command:

.. prompt:: bash $

   ceph config set mgr mgr/balancer/pool_ids 1,2,3

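
For example, if ``ceph osd pool ls detail`` lists pools whose entries begin
with ``pool 1 'rbd'`` and ``pool 3 'cephfs_data'`` (the pool names and IDs here
are illustrative), the balancer can be limited to those two pools as follows:

.. prompt:: bash $

   ceph config set mgr mgr/balancer/pool_ids 1,3
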

Modes
-----

There are two supported balancer modes:

#. **crush-compat**. This mode uses the compat weight-set feature (introduced
   in Luminous) to manage an alternative set of weights for devices in the
   CRUSH hierarchy. When the balancer is operating in this mode, the normal
   weights should remain set to the size of the device in order to reflect the
   target amount of data intended to be stored on the device. The balancer will
   then optimize the weight-set values, adjusting them up or down in small
   increments, in order to achieve a distribution that matches the target
   distribution as closely as possible. (Because PG placement is a pseudorandom
   process, it is subject to a natural amount of variation; optimizing the
   weights serves to counteract that natural variation.)

   Note that this mode is *fully backward compatible* with older clients: when
   an OSDMap and CRUSH map are shared with older clients, Ceph presents the
   optimized weights as the "real" weights.

   The primary limitation of this mode is that the balancer cannot handle
   multiple CRUSH hierarchies with different placement rules if the subtrees of
   the hierarchy share any OSDs. (Such sharing of OSDs is not typical and,
   because of the difficulty of managing the space utilization on the shared
   OSDs, is generally not recommended.)

#. **upmap**. In Luminous and later releases, the OSDMap can store explicit
   mappings of individual PGs to specific OSDs as exceptions to the normal
   CRUSH placement calculation. These ``upmap`` entries provide fine-grained
   control over the PG mapping (an example of listing them follows this list).
   This balancer mode optimizes the placement of individual PGs in order to
   achieve a balanced distribution. In most cases, the resulting distribution
   is nearly perfect: that is, there is an equal number of PGs on each OSD
   (±1 PG, since the total number might not divide evenly).

   To use ``upmap``, all clients must be Luminous or newer.

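
The exception entries created in ``upmap`` mode are stored in the OSDMap and
can be inspected directly. One way to list them is to filter the OSDMap dump;
the PG and OSD IDs shown below are illustrative:

.. prompt:: bash $

   ceph osd dump | grep pg_upmap

::

    pg_upmap_items 1.7 [3,6]
    pg_upmap_items 1.1a [0,4]
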
The default mode is ``upmap``. The mode can be changed to ``crush-compat`` by
running the following command:

.. prompt:: bash $

   ceph balancer mode crush-compat

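
To switch back to the default mode, set the mode to ``upmap`` again. Because
``upmap`` requires that all clients be Luminous or newer, the cluster must also
require a minimum client release of at least Luminous; if that requirement is
not already in place, it can be set as shown in this sketch:

.. prompt:: bash $

   ceph osd set-require-min-compat-client luminous
   ceph balancer mode upmap
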
Supervised optimization
-----------------------

Supervised use of the balancer can be understood in terms of three distinct
phases:

#. building a plan
#. evaluating the quality of the data distribution, either for the current PG
   distribution or for the PG distribution that would result after executing a
   plan
#. executing the plan

To evaluate the current distribution, run the following command:

.. prompt:: bash $

   ceph balancer eval

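
The evaluation is summarized as a score, where lower values indicate a better
balanced distribution. The output is a single line resembling the following
(the score shown here is illustrative):

::

    current cluster score 0.023216 (lower is better)
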
To evaluate the distribution for a single pool, run the following command:

.. prompt:: bash $

   ceph balancer eval <pool-name>

To see the evaluation in greater detail, run the following command:

.. prompt:: bash $

   ceph balancer eval-verbose ...

To instruct the balancer to generate a plan (using the currently configured
mode), make up a name (any useful identifying string) for the plan, and run
the following command:

.. prompt:: bash $

   ceph balancer optimize <plan-name>

To see the contents of a plan, run the following command:

.. prompt:: bash $

   ceph balancer show <plan-name>

To display all plans, run the following command:

.. prompt:: bash $

   ceph balancer ls

To discard an old plan, run the following command:

.. prompt:: bash $

   ceph balancer rm <plan-name>

To see currently recorded plans, examine the output of the following status
command:

.. prompt:: bash $

   ceph balancer status

To evaluate the distribution that would result from executing a specific plan,
run the following command:

.. prompt:: bash $

   ceph balancer eval <plan-name>

If a plan is expected to improve the distribution (that is, the plan's score is
lower than the current cluster state's score), you can execute that plan by
running the following command:

.. prompt:: bash $

   ceph balancer execute <plan-name>
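
Putting these steps together, a typical supervised session looks like the
following sketch. The plan name ``nightly-rebalance`` is arbitrary and used
here only for illustration:

.. prompt:: bash $

   ceph balancer off                         # optionally pause automatic balancing
   ceph balancer eval                        # score the current distribution
   ceph balancer optimize nightly-rebalance  # build a plan with the current mode
   ceph balancer show nightly-rebalance      # inspect the proposed changes
   ceph balancer eval nightly-rebalance      # score the distribution the plan would produce
   ceph balancer execute nightly-rebalance   # apply the plan if its score is lower
   ceph balancer rm nightly-rebalance        # discard the plan when it is no longer needed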