=======
Bobtail
=======

Bobtail is the second stable release of Ceph. It is named after the
bobtail squid (order Sepiolida), a group of cephalopods closely related to cuttlefish.

v0.56.7 "bobtail"
=================

This bobtail update fixes a range of radosgw bugs (including an easily
triggered crash from multi-delete), a possible data corruption issue
with power failure on XFS, and several OSD problems, including a
memory "leak" that will affect aged clusters.

Notable changes
---------------

* ceph-fuse: create finisher flags after fork()
* debian: fix prerm/postinst hooks; do not restart daemons on upgrade
* librados: fix async aio completion wakeup (manifests as rbd hang)
* librados: fix hang when osd becomes full and then not full
* librados: fix locking for aio completion refcounting
* librbd python bindings: fix stripe_unit, stripe_count
* librbd: make image creation default configurable
* mon: fix validation of mds ids in mon commands
* osd: avoid excessive disk updates during peering
* osd: avoid excessive memory usage on scrub
* osd: avoid heartbeat failure/suicide when scrubbing
* osd: misc minor bug fixes
* osd: use fdatasync instead of sync_file_range (may avoid xfs power-loss corruption)
* rgw: escape prefix correctly when listing objects
* rgw: fix copy attrs
* rgw: fix crash on multi delete
* rgw: fix locking/crash when using ops log socket
* rgw: fix usage logging
* rgw: handle deep uri resources

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.7.txt>`.


v0.56.6 "bobtail"
=================

Notable changes
---------------

* rgw: fix garbage collection
* rpm: fix package dependencies

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.6.txt>`.


v0.56.5 "bobtail"
=================

Upgrading
---------

* ceph-disk[-prepare,-activate] behavior has changed in various ways.
  There should not be any compatibility issues, but chef users should
  be aware.

Notable changes
---------------

* mon: fix recording of quorum feature set (important for argonaut -> bobtail -> cuttlefish mon upgrades)
* osd: minor peering bug fixes
* osd: fix a few bugs when pools are renamed
* osd: fix occasionally corrupted pg stats
* osd: fix behavior when broken v0.56[.0] clients connect
* rbd: avoid FIEMAP ioctl on import (it is broken on some kernels)
* librbd: fixes for several request/reply ordering bugs
* librbd: only set STRIPINGV2 feature on new images when needed
* librbd: new async flush method to resolve qemu hangs (requires QEMU update as well)
* librbd: a few fixes to flatten
* ceph-disk: support for dm-crypt
* ceph-disk: many backports to allow bobtail deployments with ceph-deploy, chef
* sysvinit: do not stop starting daemons on first failure
* udev: fixed rules for redhat-based distros
* build fixes for raring

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.5.txt>`.

v0.56.4 "bobtail"
=================

Upgrading
---------

* There is a fix in the syntax for the output of 'ceph osd tree --format=json'.

* The MDS disk format has changed from prior releases *and* from v0.57. In particular,
  upgrades to v0.56.4 are safe, but you cannot move from v0.56.4 to v0.57 if you are using
  the MDS for CephFS; you must upgrade directly to v0.58 (or later) instead.

Notable changes
---------------

* mon: fix bug in bringup with IPv6
* reduce default memory utilization by internal logging (all daemons)
* rgw: fix for bucket removal
* rgw: reopen logs after log rotation
* rgw: fix multipart upload listing
* rgw: don't copy object when copied onto self
* osd: fix caps parsing for pools with - or _
* osd: allow pg log trimming when degraded, scrubbing, recovering (reducing memory consumption)
* osd: fix potential deadlock when 'journal aio = true'
* osd: various fixes for collection creation/removal, rename, temp collections
* osd: various fixes for PG split
* osd: deep-scrub omap key/value data
* osd: fix rare bug in journal replay
* osd: misc fixes for snapshot tracking
* osd: fix leak in recovery reservations on pool deletion
* osd: fix bug in connection management
* osd: fix for op ordering when rebalancing
* ceph-fuse: report file system size with correct units
* mds: get and set directory layout policies via virtual xattrs
* mds: on-disk format revision (see upgrading note above)
* mkcephfs, init-ceph: close potential security issues with predictable filenames

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.4.txt>`.

v0.56.3 "bobtail"
=================

This release has several bug fixes surrounding OSD stability. Most
significantly, an issue with OSDs being unresponsive shortly after
startup (and occasionally crashing due to an internal heartbeat check)
is resolved. Please upgrade.

Upgrading
---------

* A bug was fixed in which the OSDMap epoch for PGs without any IO
  requests was not recorded. If there are pools in the cluster that
  are completely idle (for example, the ``data`` and ``metadata``
  pools normally used by CephFS), and a large number of OSDMap epochs
  have elapsed since the ``ceph-osd`` daemon was last restarted, those
  maps will get reprocessed when the daemon restarts. This process
  can take a while if there are a lot of maps. A workaround is to
  'touch' any idle pools with IO prior to restarting the daemons after
  packages are upgraded::

   rados bench 10 write -t 1 -b 4096 -p {POOLNAME}

  This will typically generate enough IO to touch every PG in the pool
  without generating significant cluster load, and also cleans up any
  temporary objects it creates.
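
  If several pools are idle, a small shell loop can apply the same
  workaround to each of them before restarting the daemons; the pool
  names below are placeholders only::

   for pool in data metadata; do
     rados bench 10 write -t 1 -b 4096 -p ${pool}
   done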

Notable changes
---------------

* osd: flush peering work queue prior to start
* osd: persist osdmap epoch for idle PGs
* osd: fix and simplify connection handling for heartbeats
* osd: avoid crash on invalid admin command
* mon: fix rare races with monitor elections and commands
* mon: enforce that OSD reweights be between 0 and 1 (NOTE: not CRUSH weights)
* mon: approximate client, recovery bandwidth logging
* radosgw: fixed some XML formatting inconsistencies to conform to the Swift API
* radosgw: fix usage accounting bug; add repair tool
* radosgw: make fallback URI configurable (necessary on some web servers)
* librbd: fix handling for interrupted 'unprotect' operations
* mds, ceph-fuse: allow file and directory layouts to be modified via virtual xattrs

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.3.txt>`.

v0.56.2 "bobtail"
=================

This release has a wide range of bug fixes, stability improvements, and some performance improvements. Please upgrade.

Upgrading
---------

* The meaning of the 'osd scrub min interval' and 'osd scrub max
  interval' has changed slightly. The min interval used to be
  meaningless, while the max interval would only trigger a scrub if
  the load was sufficiently low. Now, the min interval option works
  the way the old max interval did (it will trigger a scrub after this
  amount of time if the load is low), while the max interval will
  force a scrub regardless of load. The default options have been
  adjusted accordingly. If you have customized these in ceph.conf,
  please review their values when upgrading.
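
  Both intervals are ordinary ceph.conf options in the ``[osd]``
  section; the values below are illustrative only, not the shipped
  defaults::

   [osd]
           osd scrub min interval = 86400    # scrub when load is low, roughly daily
           osd scrub max interval = 604800   # force a scrub at least weekly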

* CRUSH maps that are generated by default when calling ``ceph-mon
  --mkfs`` directly now distribute replicas across hosts instead of
  across OSDs. Any provisioning tools that are being used by Ceph may
  be affected, although probably for the better, as distributing across
  hosts is a much more commonly sought behavior. If you use
  ``mkcephfs`` to create the cluster, the default CRUSH rule is still
  inferred by the number of hosts and/or racks in the initial ceph.conf.

Notable changes
---------------

* osd: snapshot trimming fixes
* osd: scrub snapshot metadata
* osd: fix osdmap trimming
* osd: misc peering fixes
* osd: stop heartbeating with peers if internal threads are stuck/hung
* osd: PG removal is friendlier to other workloads
* osd: fix recovery start delay (was causing very slow recovery)
* osd: fix scheduling of explicitly requested scrubs
* osd: fix scrub interval config options
* osd: improve recovery vs client io tuning
* osd: improve 'slow request' warning detail for better diagnosis
* osd: default CRUSH map now distributes across hosts, not OSDs
* osd: fix crash on 32-bit hosts triggered by librbd clients
* librbd: fix error handling when talking to older OSDs
* mon: fix a few rare crashes
* ceph command: ability to easily adjust CRUSH tunables
* radosgw: object copy does not copy source ACLs
* rados command: fix omap command usage
* sysvinit script: set ulimit -n properly on remote hosts
* msgr: fix narrow race with message queuing
* fixed compilation on some old distros (e.g., RHEL 5.x)

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.2.txt>`.


v0.56.1 "bobtail"
=================

This release has two critical fixes. Please upgrade.

Upgrading
---------

* There is a protocol compatibility problem between v0.56 and any
  other version that is now fixed. If your radosgw or RBD clients are
  running v0.56, they will need to be upgraded too. If they are
  running a version prior to v0.56, they can be left as is.

Notable changes
---------------

* osd: fix commit sequence for XFS, ext4 (or any other non-btrfs) to prevent data loss on power cycle or kernel panic
* osd: fix compatibility for CALL operation
* osd: process old osdmaps prior to joining cluster (fixes slow startup)
* osd: fix a couple of recovery-related crashes
* osd: fix large io requests when journal is in (non-default) aio mode
* log: fix possible deadlock in logging code

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.1.txt>`.

v0.56 "bobtail"
===============

Bobtail is the second stable release of Ceph, named in honor of the
`Bobtail Squid <https://en.wikipedia.org/wiki/Bobtail_squid>`_.

Key features since v0.48 "argonaut"
-----------------------------------

* Object Storage Daemon (OSD): improved threading, small-io performance, and performance during recovery
* Object Storage Daemon (OSD): regular "deep" scrubbing of all stored data to detect latent disk errors
* RADOS Block Device (RBD): support for copy-on-write clones of images
* RADOS Block Device (RBD): better client-side caching
* RADOS Block Device (RBD): advisory image locking
* Rados Gateway (RGW): support for efficient usage logging/scraping (for billing purposes)
* Rados Gateway (RGW): expanded S3 and Swift API coverage (e.g., POST, multi-object delete)
* Rados Gateway (RGW): improved striping for large objects
* Rados Gateway (RGW): OpenStack Keystone integration
* RPM packages for Fedora, RHEL/CentOS, OpenSUSE, and SLES
* mkcephfs: support for automatically formatting and mounting XFS and ext4 (in addition to btrfs)

Upgrading
---------

Please refer to the document `Upgrading from Argonaut to Bobtail`_ for details.

.. _Upgrading from Argonaut to Bobtail: ../install/upgrading-ceph/#upgrading-from-argonaut-to-bobtail

* Cephx authentication is now enabled by default (since v0.55).
  Upgrading a cluster without adjusting the Ceph configuration will
  likely prevent the system from starting up on its own. We recommend
  first modifying the configuration to indicate that authentication is
  disabled, and only then upgrading to the latest version::

   auth client required = none
   auth service required = none
   auth cluster required = none

* Ceph daemons can be upgraded one-by-one while the cluster is online
  and in service.

* The ``ceph-osd`` daemons must be upgraded and restarted *before* any
  ``radosgw`` daemons are restarted, as they depend on some new
  ceph-osd functionality. (The ``ceph-mon``, ``ceph-osd``, and
  ``ceph-mds`` daemons can be upgraded and restarted in any order.)

* Once each individual daemon has been upgraded and restarted, it
  cannot be downgraded.

* The cluster of ``ceph-mon`` daemons will migrate to a new internal
  on-wire protocol once all daemons in the quorum have been upgraded.
  Upgrading only a majority of the nodes (e.g., two out of three) may
  expose the cluster to a situation where a single additional failure
  may compromise availability (because the non-upgraded daemon cannot
  participate in the new protocol). We recommend not waiting for an
  extended period of time between ``ceph-mon`` upgrades.

* The ops log and usage log for radosgw are now off by default. If
  you need these logs (e.g., for billing purposes), you must enable
  them explicitly. For logging of all operations to objects in the
  ``.log`` pool (see ``radosgw-admin log ...``)::

   rgw enable ops log = true

  For usage logging of aggregated bandwidth usage (see ``radosgw-admin
  usage ...``)::

   rgw enable usage log = true

* You should not create or use "format 2" RBD images until after all
  ``ceph-osd`` daemons have been upgraded. Note that "format 1" is
  still the default. You can use the new ``ceph osd ls`` and
  ``ceph tell osd.N version`` commands to double-check your cluster.
  ``ceph osd ls`` will give a list of all OSD IDs that are part of the
  cluster, and you can use that to write a simple shell loop to display
  all the OSD version strings::

   for i in $(ceph osd ls); do
     ceph tell osd.${i} version
   done


Compatibility changes
---------------------

* The 'ceph osd create [<uuid>]' command now rejects an argument that
  is not a UUID. (Previously it would take an optional integer
  OSD id.) The correct syntax has been 'ceph osd create [<uuid>]'
  since v0.47, but the older calling convention was being silently
  ignored.
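
  For example, to allocate a new OSD id keyed to a freshly generated
  UUID (a sketch; the shell substitution is just one convenient way to
  supply the UUID)::

   ceph osd create $(uuidgen)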

* The CRUSH map root nodes now have type ``root`` instead of type
  ``pool``. This avoids confusion with RADOS pools, which are not
  directly related. Any scripts or tools that use the ``ceph osd
  crush ...`` commands may need to be adjusted accordingly.
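
  In a decompiled CRUSH map (``crushtool -d``), the top-level bucket is
  now declared with the ``root`` type rather than ``pool``; the bucket
  name and contents below are only illustrative::

   root default {
           id -1
           alg straw
           hash 0
           item host1 weight 1.000
   }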

* The ``ceph osd pool create <poolname> <pgnum>`` command now requires
  the ``pgnum`` argument. Previously this was optional, and would
  default to 8, which was almost never a good number.
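
  For example, to create a pool with an explicit PG count (the pool
  name and the value 128 are placeholders; pick a pg_num suited to your
  cluster)::

   ceph osd pool create mypool 128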

* Degraded mode (when there are fewer than the desired number of replicas)
  is now more configurable on a per-pool basis, with the min_size
  parameter. By default, with min_size 0, this allows I/O to objects
  with N - floor(N/2) replicas, where N is the total number of
  expected copies. Argonaut behavior was equivalent to having min_size
  = 1, so I/O would always be possible if any completely up to date
  copy remained. min_size = 1 could result in lower overall
  availability in certain cases, such as flapping network partitions.
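
  As a worked example, for a pool with three expected copies (N = 3)
  the default threshold is 3 - floor(3/2) = 2, so I/O continues only
  while at least two replicas are available. To fall back to the
  Argonaut-style behavior on a particular pool (the pool name is a
  placeholder)::

   ceph osd pool set mypool min_size 1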

* The sysvinit start/stop script now defaults to adjusting the max
  open files ulimit to 16384. On most systems the default is 1024, so
  this is an increase and won't break anything. If some system has a
  higher initial value, however, this change will lower the limit.
  The value can be adjusted explicitly by adding an entry to the
  ``ceph.conf`` file in the appropriate section. For example::

   [global]
           max open files = 32768

* 'rbd lock list' and 'rbd showmapped' no longer use tabs as
  separators in their output.

* There is a configurable limit on the number of PGs when creating a new
  pool, to prevent a user from accidentally specifying a ridiculous
  number for pg_num. It can be adjusted via the 'mon max pool pg num'
  option on the monitor, and defaults to 65536 (the current max
  supported by the Linux kernel client).
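
  If you genuinely need larger pools, the cap can be raised in
  ceph.conf on the monitors; the value shown here is only an example::

   [mon]
           mon max pool pg num = 131072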

* The osd capabilities associated with a rados user have changed
  syntax since 0.48 argonaut. The new format is mostly backwards
  compatible, but there are two backwards-incompatible changes:

  * specifying a list of pools in one grant, i.e.
    'allow r pool=foo,bar' is now done in separate grants, i.e.
    'allow r pool=foo, allow r pool=bar'.

  * restricting pool access by pool owner ('allow r uid=foo') is
    removed. This feature was not very useful and was unused in practice.

  The new format is documented in the ceph-authtool man page.
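
  As a sketch of updating an existing keyring to the new grammar (the
  keyring path and client name are placeholders, and the keyring is
  assumed to already exist)::

   # old style was a single grant: --cap osd 'allow r pool=foo,bar'
   ceph-authtool /etc/ceph/keyring -n client.foo \
     --cap mon 'allow r' \
     --cap osd 'allow r pool=foo, allow r pool=bar'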

* 'rbd cp' and 'rbd rename' use rbd as the default destination pool,
  regardless of what pool the source image is in. Previously they
  would default to the same pool as the source image.

* 'rbd export' no longer prints a message for each object written. It
  just reports percent complete like other long-lasting operations.

* 'ceph osd tree' now uses 4 decimal places for weight so the output is
  nicer for humans.

* Several monitor operations are now idempotent:

  * ceph osd pool create
  * ceph osd pool delete
  * ceph osd pool mksnap
  * ceph osd rm
  * ceph pg <pgid> revert

Notable changes
---------------

* auth: enable cephx by default
* auth: expanded authentication settings for greater flexibility
* auth: sign messages when using cephx
* build fixes for Fedora 18, CentOS/RHEL 6
* ceph: new 'osd ls' and 'osd tell <osd.N> version' commands
* ceph-debugpack: misc improvements
* ceph-disk-prepare: creates and labels GPT partitions
* ceph-disk-prepare: support for external journals, default mount/mkfs options, etc.
* ceph-fuse/libcephfs: many misc fixes, admin socket debugging
* ceph-fuse: fix handling for .. in root directory
* ceph-fuse: many fixes (including memory leaks, hangs)
* ceph-fuse: mount helper (mount.fuse.ceph) for use with /etc/fstab
* ceph.spec: misc packaging fixes
* common: thread pool sizes can now be adjusted at runtime
* config: $pid is now available as a metavariable
* crush: default root of tree type is now 'root' instead of 'pool' (to avoid confusion wrt rados pools)
* crush: fixed retry behavior with chooseleaf via tunable
* crush: tunables documented; feature bit now present and enforced
* libcephfs: java wrapper
* librados: several bug fixes (rare races, locking errors)
* librados: some locking fixes
* librados: watch/notify fixes, misc memory leaks
* librbd: a few fixes to 'discard' support
* librbd: fine-grained striping feature
* librbd: fixed memory leaks
* librbd: fully functional and documented image cloning
* librbd: image (advisory) locking
* librbd: improved caching (of object non-existence)
* librbd: 'flatten' command to sever clone parent relationship
* librbd: 'protect'/'unprotect' commands to prevent clone parent from being deleted
* librbd: clip requests past end-of-image
* librbd: fixes an issue with some Windows guests running in QEMU (remove floating point usage)
* log: fix in-memory buffering behavior (to only write log messages on crash)
* mds: fix ino release on abort session close, relative getattr path, mds shutdown, other misc items
* mds: misc fixes
* mkcephfs: fix for default keyring, osd data/journal locations
* mkcephfs: support for formatting xfs, ext4 (as well as btrfs)
* init: support for automatically mounting xfs and ext4 osd data directories
* mon, radosgw, ceph-fuse: fixed memory leaks
* mon: improved ENOSPC, fs error checking
* mon: less-destructive ceph-mon --mkfs behavior
* mon: misc fixes
* mon: more informative info about stuck PGs in 'health detail'
* mon: information about recovery and backfill in 'pg <pgid> query'
* mon: new 'osd crush create-or-move ...' command
* mon: new 'osd crush move ...' command lets you rearrange your CRUSH hierarchy
* mon: optionally dump 'osd tree' in json
* mon: configurable cap on maximum osd number (mon max osd)
* mon: many bug fixes (various races causing ceph-mon crashes)
* mon: new on-disk metadata to facilitate future mon changes (post-bobtail)
* mon: election bug fixes
* mon: throttle client messages (limit memory consumption)
* mon: throttle osd flapping based on osd history (limits osdmap 'thrashing' on overloaded or unhappy clusters)
* mon: 'report' command for dumping detailed cluster status (e.g., for use when reporting bugs)
* mon: osdmap flags like noup, noin now cause a health warning
* msgr: improved failure handling code
* msgr: many bug fixes
* osd, mon: honor new 'nobackfill' and 'norecover' osdmap flags
* osd, mon: use feature bits to lock out clients lacking CRUSH tunables when they are in use
* osd: backfill reservation framework (to avoid flooding new osds with backfill data)
* osd: backfill target reservations (improve performance during recovery)
* osd: better tracking of recent slow operations
* osd: capability grammar improvements, bug fixes
* osd: client vs recovery io prioritization
* osd: crush performance improvements
* osd: default journal size to 5 GB
* osd: experimental support for PG "splitting" (pg_num adjustment for existing pools)
* osd: fix memory leak on certain error paths
* osd: fixed detection of EIO errors from fs on read
* osd: major refactor of PG peering and threading
* osd: many bug fixes
* osd: more/better dump info about in-progress operations
* osd: new caps structure (see compatibility notes)
* osd: new 'deep scrub' will compare object content across replicas (once per week by default)
* osd: new 'lock' rados class for generic object locking
* osd: optional 'min' pg size
* osd: recovery reservations
* osd: scrub efficiency improvement
* osd: several out of order reply bug fixes
* osd: several rare peering cases fixed
* osd: some performance improvements related to request queuing
* osd: use entire device if journal is a block device
* osd: use syncfs(2) when kernel supports it, even if glibc does not
* osd: various fixes for out-of-order op replies
* rados: ability to copy, rename pools
* rados: bench command now cleans up after itself
* rados: 'cppool' command to copy rados pools
* rados: 'rm' now accepts a list of objects to be removed
* radosgw: POST support
* radosgw: REST API for managing usage stats
* radosgw: fix bug in bucket stat updates
* radosgw: fix copy-object vs attributes
* radosgw: fix range header for large objects, ETag quoting, GMT dates, other compatibility fixes
* radosgw: improved garbage collection framework
* radosgw: many small fixes, cleanups
* radosgw: openstack keystone integration
* radosgw: stripe large (non-multipart) objects
* radosgw: support for multi-object deletes
* radosgw: support for swift manifest objects
* radosgw: vanity bucket dns names
* radosgw: various API compatibility fixes
* rbd: import from stdin, export to stdout
* rbd: new 'ls -l' option to view images with metadata
* rbd: use generic id and keyring options for 'rbd map'
* rbd: don't issue usage on errors
* udev: fix symlink creation for rbd images containing partitions
* upstart: job files for all daemon types (not enabled by default)
* wireshark: ceph protocol dissector patch updated


v0.54
=====

Upgrading
---------

* The osd capabilities associated with a rados user have changed
  syntax since 0.48 argonaut. The new format is mostly backwards
  compatible, but there are two backwards-incompatible changes:

  * specifying a list of pools in one grant, i.e.
    'allow r pool=foo,bar' is now done in separate grants, i.e.
    'allow r pool=foo, allow r pool=bar'.

  * restricting pool access by pool owner ('allow r uid=foo') is
    removed. This feature was not very useful and was unused in practice.

  The new format is documented in the ceph-authtool man page.

* Bug fixes to the new osd capability format parsing properly validate
  the allowed operations. If an existing rados user gets permissions
  errors after upgrading, its capabilities were probably
  misconfigured. See the ceph-authtool man page for details on osd
  capabilities.

* 'rbd lock list' and 'rbd showmapped' no longer use tabs as
  separators in their output.