Balancer plugin
===============

The *balancer* plugin can optimize the placement of PGs across OSDs in
order to achieve a balanced distribution, either automatically or in a
supervised fashion.

Enabling
--------

The *balancer* module is enabled with::

  ceph mgr module enable balancer

(It is enabled by default.)
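
If in doubt, the list of enabled manager modules can be checked with
the standard ``ceph mgr module ls`` command; the ``balancer`` entry
should appear among the enabled modules::

  ceph mgr module ls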

Status
------

The current status of the balancer can be checked at any time with::

  ceph balancer status


Automatic balancing
-------------------

Automatic balancing can be enabled, using the default settings, with::

  ceph balancer on

The balancer can be turned back off again with::

  ceph balancer off

This will use the ``crush-compat`` mode, which is backward compatible
with older clients, and will make small changes to the data
distribution over time to ensure that OSDs are equally utilized.
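
For example, to use a different mode for automatic balancing, the mode
can be set before the balancer is turned on (both commands are covered
below); this is only an illustrative sequence::

  ceph balancer mode upmap
  ceph balancer on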


Throttling
----------

No adjustments will be made to the PG distribution if the cluster is
degraded (e.g., because an OSD has failed and the system has not yet
healed itself).

When the cluster is healthy, the balancer will throttle its changes
such that the percentage of PGs that are misplaced (i.e., that need to
be moved) is below a threshold of (by default) 5%. The
``max_misplaced`` threshold can be adjusted with::

  ceph config-key set mgr/balancer/max_misplaced .07   # 7%

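The value currently stored for the threshold can be read back with the
corresponding ``config-key`` get command::

  ceph config-key get mgr/balancer/max_misplaced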

Modes
-----

There are currently two supported balancer modes:

#. **crush-compat**. The CRUSH compat mode uses the compat weight-set
   feature (introduced in Luminous) to manage an alternative set of
   weights for devices in the CRUSH hierarchy. The normal weights
   should remain set to the size of the device to reflect the target
   amount of data that we want to store on the device. The balancer
   then optimizes the weight-set values, adjusting them up or down in
   small increments, in order to achieve a distribution that matches
   the target distribution as closely as possible. (Because PG
   placement is a pseudorandom process, there is a natural amount of
   variation in the placement; by optimizing the weights we
   counteract that natural variation.)

   Notably, this mode is *fully backwards compatible* with older
   clients: when an OSDMap and CRUSH map are shared with older clients,
   we present the optimized weights as the "real" weights.

   The primary restriction of this mode is that the balancer cannot
   handle multiple CRUSH hierarchies with different placement rules if
   the subtrees of the hierarchy share any OSDs. (This is normally
   not the case, and is generally not a recommended configuration
   because it is hard to manage the space utilization on the shared
   OSDs.)

#. **upmap**. Starting with Luminous, the OSDMap can store explicit
   mappings for individual OSDs as exceptions to the normal CRUSH
   placement calculation. These ``upmap`` entries provide fine-grained
   control over the PG mapping. This balancer mode will optimize the
   placement of individual PGs in order to achieve a balanced
   distribution. In most cases, this distribution is "perfect," with
   an equal number of PGs on each OSD (+/-1 PG, since they might not
   divide evenly).

   Note that using upmap requires that all clients be Luminous or
   newer; see the example after this list.

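To enforce that client requirement, the minimum client compatibility
can be raised with the standard ``set-require-min-compat-client``
command; note that this will refuse to proceed if pre-Luminous clients
or daemons are still connected to the monitors::

  ceph osd set-require-min-compat-client luminous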

The default mode is ``crush-compat``. The mode can be adjusted with::

  ceph balancer mode upmap

or::

  ceph balancer mode crush-compat

Supervised optimization
-----------------------

The balancer operation is broken into a few distinct phases:

#. building a *plan*
#. evaluating the quality of the data distribution, either for the
   current PG distribution or for the PG distribution that would
   result after executing a *plan*
#. executing the *plan*

To evaluate and score the current distribution::

  ceph balancer eval

You can also evaluate the distribution for a single pool with::

  ceph balancer eval <pool-name>

Greater detail for the evaluation can be seen with::

  ceph balancer eval-verbose ...

The balancer can generate a plan, using the currently configured mode,
with::

  ceph balancer optimize <plan-name>

The name is provided by the user and can be any useful identifying
string. The contents of a plan can be seen with::

  ceph balancer show <plan-name>

Old plans can be discarded with::

  ceph balancer rm <plan-name>

Currently recorded plans are shown as part of the status command::

  ceph balancer status

The quality of the distribution that would result after executing a
plan can be calculated with::

  ceph balancer eval <plan-name>

Assuming the plan is expected to improve the distribution (i.e., it has
a lower score than the current cluster state), the user can execute
that plan with::

  ceph balancer execute <plan-name>
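
Putting these commands together, a typical supervised session might
look like the following; the plan name ``myplan`` is only an
illustration::

  ceph balancer eval             # score the current distribution
  ceph balancer optimize myplan  # build a plan with the current mode
  ceph balancer show myplan      # inspect the proposed changes
  ceph balancer eval myplan      # score the distribution the plan would produce
  ceph balancer execute myplan   # apply the plan if the score improved
  ceph balancer rm myplan        # discard the plan afterwards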