.. _upmap:

=======================================
Using pg-upmap
=======================================

In Luminous v12.2.z and later releases, there is a *pg-upmap* exception table
in the OSDMap that allows the cluster to explicitly map specific PGs to
specific OSDs. This allows the cluster to fine-tune the data distribution to,
in most cases, uniformly distribute PGs across OSDs.
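
Individual exception-table entries can also be added or removed by hand. As a
sketch (the PG and OSD IDs below are made up, and the commands assume a running
cluster that already permits ``pg-upmap``):

```shell
# Remap PG 2.7 so that the replica CRUSH placed on osd.0 is stored on
# osd.9 instead (arguments are <pgid> followed by <from-osd> <to-osd> pairs):
ceph osd pg-upmap-items 2.7 0 9

# The exception entries are visible in the OSDMap:
ceph osd dump | grep pg_upmap_items

# Drop the exception for that PG again:
ceph osd rm-pg-upmap-items 2.7
```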

However, there is an important caveat when it comes to this new feature: it
requires all clients to understand the new *pg-upmap* structure in the OSDMap.

Online Optimization
===================

Enabling
--------

In order to use ``pg-upmap``, the cluster cannot have any pre-Luminous clients.
By default, new clusters enable the *balancer module*, which makes use of
``pg-upmap``. If you want to use a different balancer or you want to make your
own custom ``pg-upmap`` entries, you might want to turn off the balancer in
order to avoid conflict:

.. prompt:: bash $

   ceph balancer off

To allow use of the new feature on an existing cluster, you must restrict the
cluster to supporting only Luminous (and newer) clients. To do so, run the
following command:

.. prompt:: bash $

   ceph osd set-require-min-compat-client luminous

This command will fail if any pre-Luminous clients or daemons are connected to
the monitors. To see which client versions are in use, run the following
command:

.. prompt:: bash $

   ceph features

Balancer Module
---------------

The `balancer` module for ``ceph-mgr`` will automatically balance the number of
PGs per OSD. See :ref:`balancer`.

Offline Optimization
====================

Upmap entries are updated with an offline optimizer that is built into the
:ref:`osdmaptool`.

#. Grab the latest copy of your osdmap:

   .. prompt:: bash $

      ceph osd getmap -o om

#. Run the optimizer:

   .. prompt:: bash $

      osdmaptool om --upmap out.txt [--upmap-pool <pool>] \
                    [--upmap-max <max-optimizations>] \
                    [--upmap-deviation <max-deviation>] \
                    [--upmap-active]

   It is highly recommended that optimization be done for each pool
   individually, or for sets of similarly utilized pools. You can specify the
   ``--upmap-pool`` option multiple times. "Similarly utilized pools" means
   pools that are mapped to the same devices and that store the same kind of
   data (for example, RBD image pools are considered to be similarly utilized;
   an RGW index pool and an RGW data pool are not considered to be similarly
   utilized).

   The ``max-optimizations`` value determines the maximum number of upmap
   entries to identify. The default is `10` (as is the case with the
   ``ceph-mgr`` balancer module), but you should use a larger number if you are
   doing offline optimization. If it cannot find any additional changes to
   make (that is, if the pool distribution is perfect), it will stop early.

   The ``max-deviation`` value defaults to `5`. If an OSD's PG count varies
   from the computed target number by no more than this amount, it will be
   considered perfect.

   The ``--upmap-active`` option simulates the behavior of the active balancer
   in upmap mode. It keeps cycling until the OSDs are balanced and reports how
   many rounds have occurred and how long each round takes. The elapsed time
   for rounds indicates the CPU load that ``ceph-mgr`` consumes when it computes
   the next optimization plan.

#. Apply the changes:

   .. prompt:: bash $

      source out.txt

   In the above example, the proposed changes are written to the output file
   ``out.txt``. The commands in this procedure are normal Ceph CLI commands
   that can be run in order to apply the changes to the cluster.
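
The ``out.txt`` file produced by the optimizer is an ordinary shell script: one
``ceph osd pg-upmap-items`` command per optimized PG, which is why it can be
applied with ``source``. As a sketch (the PG and OSD IDs below are made up; a
real file will reference your own pools and OSDs), you can inspect and count
the proposed entries before applying them:

```shell
# Hypothetical out.txt contents, for illustration only. Each line remaps
# one PG: here PG 1.7 moves the replica currently on osd.38 to osd.49.
cat > out.txt <<'EOF'
ceph osd pg-upmap-items 1.7 38 49
ceph osd pg-upmap-items 1.12 4 33
EOF

# Count the proposed upmap entries before applying them with `source`:
grep -c '^ceph osd pg-upmap-items' out.txt
```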

The above steps can be repeated as many times as necessary to achieve a perfect
distribution of PGs for each set of pools.
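
To illustrate the ``max-deviation`` threshold with assumed numbers (these
figures are made up, not taken from a real cluster):

```shell
# Assumed figures: a pool with 256 PGs and replica size 3 across 96 OSDs.
pgs=256 size=3 osds=96 deviation=5

# Each OSD's rough target share of PG instances (integer division):
target=$(( pgs * size / osds ))   # 768 / 96 = 8

# With the default max-deviation of 5, an OSD holding between target-5 and
# target+5 PG instances counts as perfectly balanced and is left alone.
echo "target=${target} acceptable=$(( target - deviation ))-$(( target + deviation ))"
```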

To see some (gory) details about what the tool is doing, you can pass
``--debug-osd 10`` to ``osdmaptool``. To see even more details, pass
``--debug-crush 10`` to ``osdmaptool``.