.. _upmap:

=======================================
Using pg-upmap
=======================================

In Luminous v12.2.z and later releases, there is a *pg-upmap* exception table
in the OSDMap that allows the cluster to explicitly map specific PGs to
specific OSDs. This allows the cluster to fine-tune the data distribution to,
in most cases, uniformly distribute PGs across OSDs.

However, there is an important caveat: this feature requires all clients to
understand the new *pg-upmap* structure in the OSDMap.

Online Optimization
===================

Enabling
--------

In order to use ``pg-upmap``, the cluster cannot have any pre-Luminous clients.
By default, new clusters enable the *balancer module*, which makes use of
``pg-upmap``. If you want to use a different balancer or you want to make your
own custom ``pg-upmap`` entries, you might want to turn off the balancer in
order to avoid conflict:

.. prompt:: bash $

   ceph balancer off
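
If you choose to manage entries by hand, exception-table entries can be
created and removed directly with ``ceph osd pg-upmap-items`` and
``ceph osd rm-pg-upmap-items``. The following is only an illustrative sketch:
the PG id ``1.7`` and the OSD ids ``3`` and ``12`` are placeholders, not
values from a real cluster. Each ``<from-osd> <to-osd>`` pair remaps that PG
from the first OSD to the second, and removing the entry lets CRUSH decide
placement again:

.. prompt:: bash $

   ceph osd pg-upmap-items 1.7 3 12
   ceph osd rm-pg-upmap-items 1.7

Existing exception-table entries appear in the output of ``ceph osd dump``.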

To allow use of the new feature on an existing cluster, you must restrict the
cluster to supporting only Luminous (and newer) clients. To do so, run the
following command:

.. prompt:: bash $

   ceph osd set-require-min-compat-client luminous

This command will fail if any pre-Luminous clients or daemons are connected to
the monitors. To see which client versions are in use, run the following
command:

.. prompt:: bash $

   ceph features
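
If ``ceph features`` shows only connections that you are certain can be
disregarded, the failing check can be overridden. This sketch assumes that
your release accepts the ``--yes-i-really-mean-it`` override for this command
(confirm with ``ceph osd set-require-min-compat-client -h``); be aware that
any genuinely pre-Luminous client will then be unable to connect:

.. prompt:: bash $

   ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it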

Balancer Module
---------------

The `balancer` module for ``ceph-mgr`` will automatically balance the number of
PGs per OSD. See :ref:`balancer`.
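
For reference, a minimal way to hand balancing over to the module in ``upmap``
mode looks like the following; this is a sketch of the common case, and the
:ref:`balancer` page remains the authoritative reference for the module's
modes and options:

.. prompt:: bash $

   ceph balancer mode upmap
   ceph balancer on
   ceph balancer status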

Offline Optimization
====================

Upmap entries are updated with an offline optimizer that is built into the
:ref:`osdmaptool`.

#. Grab the latest copy of your osdmap:

   .. prompt:: bash $

      ceph osd getmap -o om

#. Run the optimizer:

   .. prompt:: bash $

      osdmaptool om --upmap out.txt [--upmap-pool <pool>] \
      [--upmap-max <max-optimizations>] \
      [--upmap-deviation <max-deviation>] \
      [--upmap-active]

   It is highly recommended that optimization be done for each pool
   individually, or for sets of similarly utilized pools. You can specify the
   ``--upmap-pool`` option multiple times. "Similarly utilized pools" means
   pools that are mapped to the same devices and that store the same kind of
   data (for example, RBD image pools are considered to be similarly utilized;
   an RGW index pool and an RGW data pool are not considered to be similarly
   utilized).

   The ``max-optimizations`` value determines the maximum number of upmap
   entries to identify. The default is `10` (as is the case with the
   ``ceph-mgr`` balancer module), but you should use a larger number if you are
   doing offline optimization. If the tool cannot find any additional changes
   to make (that is, if the pool distribution is perfect), it will stop early.

   The ``max-deviation`` value defaults to `5`. If an OSD's PG count varies
   from the computed target number by no more than this amount, the OSD will
   be considered perfectly balanced.

   The ``--upmap-active`` option simulates the behavior of the active balancer
   in upmap mode. It keeps cycling until the OSDs are balanced and reports how
   many rounds have occurred and how long each round takes. The elapsed time
   per round indicates the CPU load that ``ceph-mgr`` will incur when it
   computes the next optimization plan.

#. Apply the changes:

   .. prompt:: bash $

      source out.txt

   In the above example, the proposed changes are written to the output file
   ``out.txt``. The commands in this procedure are normal Ceph CLI commands
   that can be run to apply the changes to the cluster.

The above steps can be repeated as many times as necessary to achieve a perfect
distribution of PGs for each set of pools; a complete worked example is shown
below.
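
As an illustration of the full workflow, the following sketch balances two
hypothetical pools named ``rbd-ssd`` and ``rbd-hdd``; the pool names and the
option values are examples only and should be adapted to your cluster:

.. prompt:: bash $

   ceph osd getmap -o om
   osdmaptool om --upmap out.txt --upmap-pool rbd-ssd --upmap-pool rbd-hdd \
       --upmap-max 100 --upmap-deviation 1
   source out.txt

Repeating this sequence with a freshly fetched osdmap until ``out.txt`` comes
back empty corresponds to the "repeat as many times as necessary" advice
above.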

To see some (gory) details about what the tool is doing, you can pass
``--debug-osd 10`` to ``osdmaptool``. To see even more details, pass
``--debug-crush 10`` to ``osdmaptool``.