======================================================
 osdmaptool -- ceph osd cluster map manipulation tool
======================================================

.. program:: osdmaptool

Synopsis
========

| **osdmaptool** *mapfilename* [--print] [--createsimple *numosd*
  [--pg-bits *bitsperosd* ] ] [--clobber]
| **osdmaptool** *mapfilename* [--import-crush *crushmap*]
| **osdmaptool** *mapfilename* [--export-crush *crushmap*]
| **osdmaptool** *mapfilename* [--upmap *file*] [--upmap-max *max-optimizations*]
  [--upmap-deviation *max-deviation*] [--upmap-pool *poolname*]
  [--save] [--upmap-active]
| **osdmaptool** *mapfilename* [--upmap-cleanup] [--upmap *file*]

Description
===========

**osdmaptool** is a utility that lets you create, view, and manipulate
OSD cluster maps from the Ceph distributed storage system. Notably, it
lets you extract the embedded CRUSH map or import a new CRUSH map.
It can also simulate the upmap balancer mode so that you can estimate
what changes are needed to balance your PGs.

Options
=======

.. option:: --print

   Prints a plaintext dump of the map, after any modifications are made.

.. option:: --dump <format>

   Displays the map in plain text when <format> is 'plain'. If the specified
   format is not supported, the map is displayed as 'json'. This is an
   alternative to the print option.

.. option:: --clobber

   Allows osdmaptool to overwrite mapfilename if changes are made.

.. option:: --import-crush mapfile

   Loads the CRUSH map from mapfile and embeds it in the OSD map.

.. option:: --export-crush mapfile

   Extracts the CRUSH map from the OSD map and writes it to mapfile.

.. option:: --createsimple numosd [--pg-bits bitsperosd] [--pgp-bits bits]

   Creates a relatively generic OSD map with numosd devices.
   If --pg-bits is specified, the initial placement group counts are
   set with bitsperosd bits per OSD. That is, the pg_num map
   attribute is set to numosd shifted left by bitsperosd.
   If --pgp-bits is specified, the pgp_num map attribute is
   set to numosd shifted left by bits.

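
The shift arithmetic described for ``--createsimple`` can be sketched in a few
lines of Python. This is only an illustration that assumes "shifted" means a
left bit shift; the helper name ``simple_pg_num`` is hypothetical and is not
part of osdmaptool.

.. code-block:: python

   def simple_pg_num(numosd: int, bits: int) -> int:
       """Return numosd shifted left by bits (assumed meaning of "shifted")."""
       if numosd < 0 or bits < 0:
           raise ValueError("numosd and bits must be non-negative")
       return numosd << bits

   # 16 OSDs with 6 bits per OSD: 16 * 2**6 placement groups.
   print(simple_pg_num(16, 6))

Under that assumption, each additional bit doubles the resulting count.
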
.. option:: --create-from-conf

   Creates an OSD map with default configurations.

.. option:: --test-map-pgs [--pool poolid] [--range-first <first> --range-last <last>]

   Prints the mappings from placement groups to OSDs.
   If a range is specified, osdmaptool iterates from first to last over the
   files in the directory given as its argument.
   For example: **osdmaptool --test-map-pgs --range-first 0 --range-last 2 osdmap_dir**.
   This iterates through the files named 0, 1, and 2 in osdmap_dir.

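
As a rough sketch of how such a range selects files, the following Python
snippet lists the paths that an inclusive range would cover. The function name
``range_files`` is hypothetical; the inclusive upper bound is inferred from the
example above, where a range of 0 to 2 covers the files 0, 1, and 2.

.. code-block:: python

   from pathlib import PurePosixPath

   def range_files(directory: str, first: int, last: int) -> list[str]:
       """List the file paths covered by an inclusive first..last range."""
       return [str(PurePosixPath(directory) / str(n))
               for n in range(first, last + 1)]

   print(range_files("osdmap_dir", 0, 2))
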
.. option:: --test-map-pgs-dump [--pool poolid] [--range-first <first> --range-last <last>]

   Prints a summary of all placement groups and the mappings from them to the mapped OSDs.
   If a range is specified, osdmaptool iterates from first to last over the
   files in the directory given as its argument.
   For example: **osdmaptool --test-map-pgs-dump --range-first 0 --range-last 2 osdmap_dir**.
   This iterates through the files named 0, 1, and 2 in osdmap_dir.

.. option:: --test-map-pgs-dump-all [--pool poolid] [--range-first <first> --range-last <last>]

   Prints a summary of all placement groups and the mappings
   from them to all the OSDs.
   If a range is specified, osdmaptool iterates from first to last over the
   files in the directory given as its argument.
   For example: **osdmaptool --test-map-pgs-dump-all --range-first 0 --range-last 2 osdmap_dir**.
   This iterates through the files named 0, 1, and 2 in osdmap_dir.

.. option:: --test-random

   Does a random mapping of placement groups to the OSDs.

.. option:: --test-map-pg <pgid>

   Maps a particular placement group (specified by pgid) to the OSDs.

.. option:: --test-map-object <objectname> [--pool <poolid>]

   Maps the placement group of a particular object (specified by objectname) to the OSDs.

.. option:: --test-crush [--range-first <first> --range-last <last>]

   Maps placement groups to acting OSDs.
   If a range is specified, osdmaptool iterates from first to last over the
   files in the directory given as its argument.
   For example: **osdmaptool --test-crush --range-first 0 --range-last 2 osdmap_dir**.
   This iterates through the files named 0, 1, and 2 in osdmap_dir.

.. option:: --mark-up-in

   Marks OSDs up and in (but does not persist the change).

.. option:: --mark-out <osdid>

   Marks an OSD as out (but does not persist the change).

.. option:: --mark-up <osdid>

   Marks an OSD as up (but does not persist the change).

.. option:: --mark-in <osdid>

   Marks an OSD as in (but does not persist the change).

.. option:: --tree

   Displays a hierarchical tree of the map.

.. option:: --clear-temp

   Clears pg_temp and primary_temp variables.

.. option:: --clean-temps

.. option:: --with-default-pool

   Includes the default pool when creating the map.

.. option:: --upmap-cleanup <file>

   Cleans up pg_upmap[_items] entries, writing commands to <file> [default: - for stdout].

.. option:: --upmap <file>

   Calculates pg upmap entries to balance the pg layout, writing commands to <file> [default: - for stdout].

.. option:: --upmap-max <max-optimizations>

   Sets the maximum number of upmap entries to calculate [default: 10].

.. option:: --upmap-deviation <max-deviation>

   Sets the maximum deviation from the target [default: 5].

.. option:: --upmap-pool <poolname>

   Restricts upmap balancing to one pool. The option can be repeated for multiple pools.

.. option:: --upmap-active

   Acts like an active balancer: keeps applying changes until the map is balanced.

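
To show how the upmap options combine on one command line, here is a small
Python sketch that assembles an argument list. The helper ``upmap_argv`` is
hypothetical and only uses the flags and defaults documented above; it does not
run osdmaptool itself.

.. code-block:: python

   def upmap_argv(mapfile, out="-", pools=(), max_opt=10, deviation=5, active=False):
       """Build an osdmaptool argument list from the documented upmap options."""
       argv = ["osdmaptool", mapfile, "--upmap", out,
               "--upmap-max", str(max_opt),
               "--upmap-deviation", str(deviation)]
       for pool in pools:
           # --upmap-pool may be repeated, once per pool.
           argv += ["--upmap-pool", pool]
       if active:
           argv.append("--upmap-active")
       return argv

   print(" ".join(upmap_argv("osdmap", "upmaps.out",
                             pools=["data", "metadata"], active=True)))

On a host where osdmaptool is installed, a list like this could be passed to
``subprocess.run``.
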
.. option:: --adjust-crush-weight <osdid:weight>[,<osdid:weight>,<...>]

   Changes the CRUSH weight of <osdid>.

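
The argument format of ``--adjust-crush-weight`` is a comma-separated list of
``osdid:weight`` pairs. The following Python sketch illustrates that format
only; ``parse_weights`` is a hypothetical helper and does not reflect how
osdmaptool parses the option internally.

.. code-block:: python

   def parse_weights(spec: str) -> dict[int, float]:
       """Split "osdid:weight[,osdid:weight...]" into an {osdid: weight} dict."""
       weights = {}
       for pair in spec.split(","):
           osdid, weight = pair.split(":")
           weights[int(osdid)] = float(weight)
       return weights

   print(parse_weights("0:1.5,3:0.75"))
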
.. option:: --save

   Writes the modified osdmap with upmap or crush-adjust changes.

.. option:: --read <file>

   Calculates pg upmap entries to balance pg primaries.

.. option:: --read-pool <poolname>

   Specifies which pool the read balancer should adjust.

.. option:: --vstart

   Prefixes upmap and read output with './bin/'.

Example
=======

To create a simple map with 16 devices::

    osdmaptool --createsimple 16 osdmap --clobber

To view the result::

    osdmaptool --print osdmap

To view the mappings of placement groups for pool 1::

    osdmaptool osdmap --test-map-pgs-dump --pool 1

    #osd    count   first   primary c wt    wt
     avg 8 stddev 0 (0x) (expected 2.3094 0.288675x))

In this output:

#. Pool 1 has 8 placement groups, and two tables follow.
#. The first table lists placement groups, one per row. Its columns include:

   * the placement group ID.

#. The second table lists all OSDs, one per row. Its columns are:

   * the number of placement groups mapped to this OSD,
   * the number of placement groups in which this OSD is first in the acting set,
   * the number of placement groups for which this OSD is the primary,
   * the CRUSH weight of this OSD, and
   * the weight of this OSD.

#. Statistics on the number of placement groups held by the 3 OSDs:

   * average, stddev, stddev/average, expected stddev, and expected stddev/average.

#. The number of placement groups mapped to n OSDs. In this case, all 8 placement
   groups map to 3 different OSDs.

In a less-balanced cluster, the placement group distribution statistics could
look like the following, where the standard deviation is 1.41421::

    #osd    count   first   primary c wt        wt
    osd.0   33      9       9       0.0145874   1
    osd.1   34      14      14      0.0145874   1
    osd.2   31      7       7       0.0145874   1
    osd.3   31      13      13      0.0145874   1
    osd.4   30      14      14      0.0145874   1
    osd.5   33      7       7       0.0145874   1

     avg 32 stddev 1.41421 (0.0441942x) (expected 5.16398 0.161374x))

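
The summary figures can be checked by hand from the ``count`` column. The
Python sketch below assumes a population standard deviation, and it assumes
that the "expected" value is ``sqrt(avg * (1 - 1/n))`` for n OSDs. That
formula is inferred from the sample numbers shown here, not taken from the
osdmaptool source, and ``pg_stats`` is a hypothetical helper.

.. code-block:: python

   import math

   def pg_stats(counts):
       """Return (avg, stddev, expected stddev) for per-OSD PG counts."""
       n = len(counts)
       avg = sum(counts) / n
       # Population standard deviation of the per-OSD counts.
       stddev = math.sqrt(sum((c - avg) ** 2 for c in counts) / n)
       # Assumed model: spread expected if PGs landed on OSDs at random.
       expected = math.sqrt(avg * (1 - 1 / n))
       return avg, stddev, expected

   avg, stddev, expected = pg_stats([33, 34, 31, 31, 30, 33])
   print(f"avg {avg:g} stddev {stddev:g} ({stddev / avg:g}x) "
         f"(expected {expected:g} {expected / avg:g}x)")

With the six counts above, these assumptions give the same average, standard
deviation, and expected values as the sample output. A measured stddev well
below the expected value suggests a distribution more even than random
placement would give.
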
To simulate the active balancer in upmap mode::

    osdmaptool --upmap upmaps.out --upmap-active --upmap-deviation 6 --upmap-max 11 osdmap

    osdmaptool: osdmap file 'osdmap'
    writing upmap command output to: upmaps.out
    checking for upmap cleanups
    upmap, max-count 11, max deviation 6
    pools movies photos metadata data
    prepared 11/11 changes
    Time elapsed 0.00310404 secs
    pools movies photos metadata data
    prepared 11/11 changes
    Time elapsed 0.00283402 secs
    pools data metadata movies photos
    prepared 11/11 changes
    Time elapsed 0.003122 secs
    pools photos metadata data movies
    prepared 11/11 changes
    Time elapsed 0.00324372 secs
    pools movies metadata data photos
    prepared 1/11 changes
    Time elapsed 0.00222609 secs
    pools data movies photos metadata
    prepared 0/11 changes
    Time elapsed 0.00209916 secs
    Unable to find further optimization, or distribution is already perfect
    Total time elapsed 0.0167765 secs, 5 rounds

To simulate the active balancer in read mode, first make sure capacity is balanced
by running the balancer in upmap mode. Then, balance the reads on a replicated pool with::

    osdmaptool osdmap --read read.out --read-pool <pool name>

    ./bin/osdmaptool: osdmap file 'om'
    writing upmap command output to: read.out

    ---------- BEFORE ------------
    osd.0 | primary affinity: 1 | number of prims: 3
    osd.1 | primary affinity: 1 | number of prims: 10
    osd.2 | primary affinity: 1 | number of prims: 3

    read_balance_score of 'cephfs.a.meta': 1.88

    ---------- AFTER ------------
    osd.0 | primary affinity: 1 | number of prims: 5
    osd.1 | primary affinity: 1 | number of prims: 5
    osd.2 | primary affinity: 1 | number of prims: 6

    read_balance_score of 'cephfs.a.meta': 1.13

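
One way to read the scores in this sample is as the ratio of the busiest OSD's
primary count to the average primary count. The Python sketch below rests on
that assumption, which is inferred only from the numbers shown here with every
primary affinity equal to 1. The actual score may account for other factors,
and ``primary_ratio`` is a hypothetical helper.

.. code-block:: python

   def primary_ratio(primaries):
       """Max primaries per OSD divided by the average (assumed score model)."""
       return max(primaries) * len(primaries) / sum(primaries)

   before = primary_ratio([3, 10, 3])  # 30 / 16 = 1.875
   after = primary_ratio([5, 5, 6])    # 18 / 16 = 1.125
   print(before, after)

These ratios are close to the reported scores of 1.88 and 1.13. Under this
reading, a value of 1 would mean that every OSD serves the same number of
primaries.
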
Availability
============

**osdmaptool** is part of Ceph, a massively scalable, open-source, distributed storage system. Please
refer to the Ceph documentation at https://docs.ceph.com for more
information.

See also
========

:doc:`ceph <ceph>`\(8),
:doc:`crushtool <crushtool>`\(8)