v0.56.7 "bobtail"
=================

This bobtail update fixes a range of radosgw bugs (including an easily
triggered crash from multi-delete), a possible data corruption issue
with power failure on XFS, and several OSD problems, including a
memory "leak" that will affect aged clusters.

Notable changes
---------------

* ceph-fuse: create finisher flags after fork()
* debian: fix prerm/postinst hooks; do not restart daemons on upgrade
* librados: fix async aio completion wakeup (manifests as rbd hang)
* librados: fix hang when osd becomes full and then not full
* librados: fix locking for aio completion refcounting
* librbd python bindings: fix stripe_unit, stripe_count
* librbd: make image creation default configurable
* mon: fix validation of mds ids in mon commands
* osd: avoid excessive disk updates during peering
* osd: avoid excessive memory usage on scrub
* osd: avoid heartbeat failure/suicide when scrubbing
* osd: misc minor bug fixes
* osd: use fdatasync instead of sync_file_range (may avoid xfs power-loss corruption)
* rgw: escape prefix correctly when listing objects
* rgw: fix copy attrs
* rgw: fix crash on multi delete
* rgw: fix locking/crash when using ops log socket
* rgw: fix usage logging
* rgw: handle deep uri resources

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.7.txt>`.


v0.56.6 "bobtail"
=================

Notable changes
---------------

* rgw: fix garbage collection
* rpm: fix package dependencies

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.6.txt>`.


v0.56.5 "bobtail"
=================

Upgrading
---------

* ceph-disk[-prepare,-activate] behavior has changed in various ways.
  There should not be any compatibility issues, but chef users should
  be aware.

Notable changes
---------------

* mon: fix recording of quorum feature set (important for argonaut -> bobtail -> cuttlefish mon upgrades)
* osd: minor peering bug fixes
* osd: fix a few bugs when pools are renamed
* osd: fix occasionally corrupted pg stats
* osd: fix behavior when broken v0.56[.0] clients connect
* rbd: avoid FIEMAP ioctl on import (it is broken on some kernels)
* librbd: fixes for several request/reply ordering bugs
* librbd: only set STRIPINGV2 feature on new images when needed
* librbd: new async flush method to resolve qemu hangs (requires QEMU update as well)
* librbd: a few fixes to flatten
* ceph-disk: support for dm-crypt
* ceph-disk: many backports to allow bobtail deployments with ceph-deploy, chef
* sysvinit: do not stop starting daemons on first failure
* udev: fixed rules for redhat-based distros
* build fixes for raring

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.5.txt>`.

v0.56.4 "bobtail"
=================

Upgrading
---------

* There is a fix in the syntax for the output of 'ceph osd tree --format=json'.

* The MDS disk format has changed from prior releases *and* from v0.57. In particular,
  upgrades to v0.56.4 are safe, but you cannot move from v0.56.4 to v0.57 if you are using
  the MDS for CephFS; you must upgrade directly to v0.58 (or later) instead.

Notable changes
---------------

* mon: fix bug in bringup with IPv6
* reduce default memory utilization by internal logging (all daemons)
* rgw: fix for bucket removal
* rgw: reopen logs after log rotation
* rgw: fix multipart upload listing
* rgw: don't copy object when copied onto self
* osd: fix caps parsing for pools with - or _
* osd: allow pg log trimming when degraded, scrubbing, recovering (reducing memory consumption)
* osd: fix potential deadlock when 'journal aio = true'
* osd: various fixes for collection creation/removal, rename, temp collections
* osd: various fixes for PG split
* osd: deep-scrub omap key/value data
* osd: fix rare bug in journal replay
* osd: misc fixes for snapshot tracking
* osd: fix leak in recovery reservations on pool deletion
* osd: fix bug in connection management
* osd: fix for op ordering when rebalancing
* ceph-fuse: report file system size with correct units
* mds: get and set directory layout policies via virtual xattrs
* mds: on-disk format revision (see upgrading note above)
* mkcephfs, init-ceph: close potential security issues with predictable filenames

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.4.txt>`.

v0.56.3 "bobtail"
=================

This release has several bug fixes surrounding OSD stability. Most
significantly, an issue with OSDs being unresponsive shortly after
startup (and occasionally crashing due to an internal heartbeat check)
is resolved. Please upgrade.

Upgrading
---------

* A bug was fixed in which the OSDMap epoch for PGs without any IO
  requests was not recorded. If there are pools in the cluster that
  are completely idle (for example, the ``data`` and ``metadata``
  pools normally used by CephFS), and a large number of OSDMap epochs
  have elapsed since the ``ceph-osd`` daemon was last restarted, those
  maps will get reprocessed when the daemon restarts. This process
  can take a while if there are a lot of maps. A workaround is to
  'touch' any idle pools with IO prior to restarting the daemons after
  packages are upgraded::

    rados bench 10 write -t 1 -b 4096 -p {POOLNAME}

  This will typically generate enough IO to touch every PG in the pool
  without generating significant cluster load, and also cleans up any
  temporary objects it creates.
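
  If several pools are idle, a small shell loop over ``rados lspools``
  can touch each one in turn (a sketch; it assumes the default client
  keyring has write access to every pool)::

    for pool in $(rados lspools); do
      rados bench 10 write -t 1 -b 4096 -p ${pool}
    done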

Notable changes
---------------

* osd: flush peering work queue prior to start
* osd: persist osdmap epoch for idle PGs
* osd: fix and simplify connection handling for heartbeats
* osd: avoid crash on invalid admin command
* mon: fix rare races with monitor elections and commands
* mon: enforce that OSD reweights be between 0 and 1 (NOTE: not CRUSH weights)
* mon: approximate client, recovery bandwidth logging
* radosgw: fixed some XML formatting to conform to a Swift API inconsistency
* radosgw: fix usage accounting bug; add repair tool
* radosgw: make fallback URI configurable (necessary on some web servers)
* librbd: fix handling for interrupted 'unprotect' operations
* mds, ceph-fuse: allow file and directory layouts to be modified via virtual xattrs

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.3.txt>`.


v0.56.2 "bobtail"
=================

This release has a wide range of bug fixes, stability improvements, and some performance improvements. Please upgrade.

Upgrading
---------

* The meaning of the 'osd scrub min interval' and 'osd scrub max
  interval' has changed slightly. The min interval used to be
  meaningless, while the max interval would only trigger a scrub if
  the load was sufficiently low. Now, the min interval option works
  the way the old max interval did (it will trigger a scrub after this
  amount of time if the load is low), while the max interval will
  force a scrub regardless of load. The default options have been
  adjusted accordingly. If you have customized these in ceph.conf,
  please review their values when upgrading.
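
  For example, a sketch of the new semantics in ``ceph.conf`` (the
  values here are illustrative, not recommendations)::

    [osd]
      osd scrub min interval = 86400    ; may scrub daily, if load is low
      osd scrub max interval = 604800   ; force a scrub weekly, regardless of load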

* CRUSH maps that are generated by default when calling ``ceph-mon
  --mkfs`` directly now distribute replicas across hosts instead of
  across OSDs. Any provisioning tools that are being used by Ceph may
  be affected, although probably for the better, as distributing across
  hosts is a much more commonly sought behavior. If you use
  ``mkcephfs`` to create the cluster, the default CRUSH rule is still
  inferred by the number of hosts and/or racks in the initial ceph.conf.
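
  For reference, host-level distribution corresponds to a CRUSH rule
  along these lines (a sketch of bobtail-era CRUSH syntax; the rule and
  root names are illustrative)::

    rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
    }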

Notable changes
---------------

* osd: snapshot trimming fixes
* osd: scrub snapshot metadata
* osd: fix osdmap trimming
* osd: misc peering fixes
* osd: stop heartbeating with peers if internal threads are stuck/hung
* osd: PG removal is friendlier to other workloads
* osd: fix recovery start delay (was causing very slow recovery)
* osd: fix scheduling of explicitly requested scrubs
* osd: fix scrub interval config options
* osd: improve recovery vs client io tuning
* osd: improve 'slow request' warning detail for better diagnosis
* osd: default CRUSH map now distributes across hosts, not OSDs
* osd: fix crash on 32-bit hosts triggered by librbd clients
* librbd: fix error handling when talking to older OSDs
* mon: fix a few rare crashes
* ceph command: ability to easily adjust CRUSH tunables
* radosgw: object copy does not copy source ACLs
* rados command: fix omap command usage
* sysvinit script: set ulimit -n properly on remote hosts
* msgr: fix narrow race with message queuing
* fixed compilation on some old distros (e.g., RHEL 5.x)

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.2.txt>`.


v0.56.1 "bobtail"
=================

This release has two critical fixes. Please upgrade.

Upgrading
---------

* There is a protocol compatibility problem between v0.56 and any
  other version that is now fixed. If your radosgw or RBD clients are
  running v0.56, they will need to be upgraded too. If they are
  running a version prior to v0.56, they can be left as is.

Notable changes
---------------

* osd: fix commit sequence for XFS, ext4 (or any other non-btrfs) to prevent data loss on power cycle or kernel panic
* osd: fix compatibility for CALL operation
* osd: process old osdmaps prior to joining cluster (fixes slow startup)
* osd: fix a couple of recovery-related crashes
* osd: fix large io requests when journal is in (non-default) aio mode
* log: fix possible deadlock in logging code

For more detailed information, see :download:`the complete changelog <../changelog/v0.56.1.txt>`.

v0.56 "bobtail"
===============

Bobtail is the second stable release of Ceph, named in honor of the
`Bobtail Squid`: https://en.wikipedia.org/wiki/Bobtail_squid.

Key features since v0.48 "argonaut"
-----------------------------------

* Object Storage Daemon (OSD): improved threading, small-io performance, and performance during recovery
* Object Storage Daemon (OSD): regular "deep" scrubbing of all stored data to detect latent disk errors
* RADOS Block Device (RBD): support for copy-on-write clones of images
* RADOS Block Device (RBD): better client-side caching
* RADOS Block Device (RBD): advisory image locking
* Rados Gateway (RGW): support for efficient usage logging/scraping (for billing purposes)
* Rados Gateway (RGW): expanded S3 and Swift API coverage (e.g., POST, multi-object delete)
* Rados Gateway (RGW): improved striping for large objects
* Rados Gateway (RGW): OpenStack Keystone integration
* RPM packages for Fedora, RHEL/CentOS, OpenSUSE, and SLES
* mkcephfs: support for automatically formatting and mounting XFS and ext4 (in addition to btrfs)

Upgrading
---------

Please refer to the document `Upgrading from Argonaut to Bobtail`_ for details.

.. _Upgrading from Argonaut to Bobtail: ../install/upgrading-ceph/#upgrading-from-argonaut-to-bobtail

* Cephx authentication is now enabled by default (since v0.55).
  Upgrading a cluster without adjusting the Ceph configuration will
  likely prevent the system from starting up on its own. We recommend
  first modifying the configuration to indicate that authentication is
  disabled, and only then upgrading to the latest version::

    auth client required = none
    auth service required = none
    auth cluster required = none

* Ceph daemons can be upgraded one-by-one while the cluster is online
  and in service.

* The ``ceph-osd`` daemons must be upgraded and restarted *before* any
  ``radosgw`` daemons are restarted, as they depend on some new
  ceph-osd functionality. (The ``ceph-mon``, ``ceph-osd``, and
  ``ceph-mds`` daemons can be upgraded and restarted in any order.)

* Once each individual daemon has been upgraded and restarted, it
  cannot be downgraded.

* The cluster of ``ceph-mon`` daemons will migrate to a new internal
  on-wire protocol once all daemons in the quorum have been upgraded.
  Upgrading only a majority of the nodes (e.g., two out of three) may
  expose the cluster to a situation where a single additional failure
  may compromise availability (because the non-upgraded daemon cannot
  participate in the new protocol). We recommend not waiting for an
  extended period of time between ``ceph-mon`` upgrades.

* The ops log and usage log for radosgw are now off by default. If
  you need these logs (e.g., for billing purposes), you must enable
  them explicitly. For logging of all operations to objects in the
  ``.log`` pool (see ``radosgw-admin log ...``)::

    rgw enable ops log = true

  For usage logging of aggregated bandwidth usage (see ``radosgw-admin
  usage ...``)::

    rgw enable usage log = true

* You should not create or use "format 2" RBD images until after all
  ``ceph-osd`` daemons have been upgraded. Note that "format 1" is
  still the default. You can use the new ``ceph osd ls`` and
  ``ceph tell osd.N version`` commands to double-check your cluster.
  ``ceph osd ls`` will give a list of all OSD IDs that are part of the
  cluster, and you can use that to write a simple shell loop to display
  all the OSD version strings: ::

    for i in $(ceph osd ls); do
      ceph tell osd.${i} version
    done


Compatibility changes
---------------------

* The 'ceph osd create [<uuid>]' command now rejects an argument that
  is not a UUID. (Previously it would take an optional integer
  OSD id.) The correct syntax has been 'ceph osd create [<uuid>]'
  since v0.47, but the older calling convention was being silently
  ignored.

* The CRUSH map root nodes now have type ``root`` instead of type
  ``pool``. This avoids confusion with RADOS pools, which are not
  directly related. Any scripts or tools that use the ``ceph osd
  crush ...`` commands may need to be adjusted accordingly.

* The ``ceph osd pool create <poolname> <pgnum>`` command now requires
  the ``pgnum`` argument. Previously this was optional, and would
  default to 8, which was almost never a good number.
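
  For example (the pool name and PG count are illustrative; choose
  pg_num based on the expected number of OSDs)::

    ceph osd pool create mypool 128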

* Degraded mode (when there are fewer than the desired number of replicas)
  is now more configurable on a per-pool basis, with the min_size
  parameter. By default, with min_size 0, this allows I/O to objects
  with N - floor(N/2) replicas, where N is the total number of
  expected copies. Argonaut behavior was equivalent to having min_size
  = 1, so I/O would always be possible if any completely up to date
  copy remained. min_size = 1 could result in lower overall
  availability in certain cases, such as flapping network partitions.
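
  To restore the argonaut-like behavior on a given pool, min_size can
  be set explicitly (a sketch; ``mypool`` is an illustrative name)::

    ceph osd pool set mypool min_size 1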

* The sysvinit start/stop script now defaults to adjusting the max
  open files ulimit to 16384. On most systems the default is 1024, so
  this is an increase and won't break anything. If some system has a
  higher initial value, however, this change will lower the limit.
  The value can be adjusted explicitly by adding an entry to the
  ``ceph.conf`` file in the appropriate section. For example::

    [global]
            max open files = 32768

* 'rbd lock list' and 'rbd showmapped' no longer use tabs as
  separators in their output.

* There is a configurable limit on the number of PGs when creating a new
  pool, to prevent a user from accidentally specifying a ridiculous
  number for pg_num. It can be adjusted via the 'mon max pool pg num'
  option on the monitor, and defaults to 65536 (the current max
  supported by the Linux kernel client).
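
  A sketch of raising the cap in ``ceph.conf`` (the value shown is
  illustrative)::

    [mon]
      mon max pool pg num = 131072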

* The osd capabilities associated with a rados user have changed
  syntax since 0.48 argonaut. The new format is mostly backwards
  compatible, but there are two backwards-incompatible changes:

  * specifying a list of pools in one grant, i.e.
    'allow r pool=foo,bar' is now done in separate grants, i.e.
    'allow r pool=foo, allow r pool=bar'.

  * restricting pool access by pool owner ('allow r uid=foo') is
    removed. This feature was not very useful and unused in practice.

  The new format is documented in the ceph-authtool man page.
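
  As an illustration of the new grammar, a keyring entry granting read
  access to two pools might be created along these lines (a sketch;
  the client name, pool names, and keyring path are illustrative)::

    ceph-authtool --create-keyring /tmp/client.example.keyring \
      --gen-key -n client.example \
      --cap mon 'allow r' \
      --cap osd 'allow r pool=foo, allow r pool=bar'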

* 'rbd cp' and 'rbd rename' use rbd as the default destination pool,
  regardless of what pool the source image is in. Previously they
  would default to the same pool as the source image.
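
  For example, 'rbd cp mypool/image1 image2' now creates 'rbd/image2';
  to keep the copy in the source pool, name it explicitly (a sketch
  with illustrative image names)::

    rbd cp mypool/image1 mypool/image2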

* 'rbd export' no longer prints a message for each object written. It
  just reports percent complete like other long-lasting operations.

* 'ceph osd tree' now uses 4 decimal places for weight so output is
  nicer for humans

* Several monitor operations are now idempotent:

  * ceph osd pool create
  * ceph osd pool delete
  * ceph osd pool mksnap
  * ceph osd rm
  * ceph pg <pgid> revert

Notable changes
---------------

* auth: enable cephx by default
* auth: expanded authentication settings for greater flexibility
* auth: sign messages when using cephx
* build fixes for Fedora 18, CentOS/RHEL 6
* ceph: new 'osd ls' and 'osd tell <osd.N> version' commands
* ceph-debugpack: misc improvements
* ceph-disk-prepare: creates and labels GPT partitions
* ceph-disk-prepare: support for external journals, default mount/mkfs options, etc.
* ceph-fuse/libcephfs: many misc fixes, admin socket debugging
* ceph-fuse: fix handling for .. in root directory
* ceph-fuse: many fixes (including memory leaks, hangs)
* ceph-fuse: mount helper (mount.fuse.ceph) for use with /etc/fstab
* ceph.spec: misc packaging fixes
* common: thread pool sizes can now be adjusted at runtime
* config: $pid is now available as a metavariable
* crush: default root of tree type is now 'root' instead of 'pool' (to avoid confusion with rados pools)
* crush: fixed retry behavior with chooseleaf via tunable
* crush: tunables documented; feature bit now present and enforced
* libcephfs: java wrapper
* librados: several bug fixes (rare races, locking errors)
* librados: some locking fixes
* librados: watch/notify fixes, misc memory leaks
* librbd: a few fixes to 'discard' support
* librbd: fine-grained striping feature
* librbd: fixed memory leaks
* librbd: fully functional and documented image cloning
* librbd: image (advisory) locking
* librbd: improved caching (of object non-existence)
* librbd: 'flatten' command to sever clone parent relationship
* librbd: 'protect'/'unprotect' commands to prevent clone parent from being deleted
* librbd: clip requests past end-of-image
* librbd: fixes an issue with some windows guests running in qemu (remove floating point usage)
* log: fix in-memory buffering behavior (to only write log messages on crash)
* mds: fix ino release on abort session close, relative getattr path, mds shutdown, other misc items
* mds: misc fixes
* mkcephfs: fix for default keyring, osd data/journal locations
* mkcephfs: support for formatting xfs, ext4 (as well as btrfs)
* init: support for automatically mounting xfs and ext4 osd data directories
* mon, radosgw, ceph-fuse: fixed memory leaks
* mon: improved ENOSPC, fs error checking
* mon: less-destructive ceph-mon --mkfs behavior
* mon: misc fixes
* mon: more informative info about stuck PGs in 'health detail'
* mon: information about recovery and backfill in 'pg <pgid> query'
* mon: new 'osd crush create-or-move ...' command
* mon: new 'osd crush move ...' command lets you rearrange your CRUSH hierarchy
* mon: optionally dump 'osd tree' in json
* mon: configurable cap on maximum osd number (mon max osd)
* mon: many bug fixes (various races causing ceph-mon crashes)
* mon: new on-disk metadata to facilitate future mon changes (post-bobtail)
* mon: election bug fixes
* mon: throttle client messages (limit memory consumption)
* mon: throttle osd flapping based on osd history (limits osdmap 'thrashing' on overloaded or unhappy clusters)
* mon: 'report' command for dumping detailed cluster status (e.g., for use when reporting bugs)
* mon: osdmap flags like noup, noin now cause a health warning
* msgr: improved failure handling code
* msgr: many bug fixes
* osd, mon: honor new 'nobackfill' and 'norecover' osdmap flags
* osd, mon: use feature bits to lock out clients lacking CRUSH tunables when they are in use
* osd: backfill reservation framework (to avoid flooding new osds with backfill data)
* osd: backfill target reservations (improve performance during recovery)
* osd: better tracking of recent slow operations
* osd: capability grammar improvements, bug fixes
* osd: client vs recovery io prioritization
* osd: crush performance improvements
* osd: default journal size to 5 GB
* osd: experimental support for PG "splitting" (pg_num adjustment for existing pools)
* osd: fix memory leak on certain error paths
* osd: fixed detection of EIO errors from fs on read
* osd: major refactor of PG peering and threading
* osd: many bug fixes
* osd: more/better dump info about in-progress operations
* osd: new caps structure (see compatibility notes)
* osd: new 'deep scrub' will compare object content across replicas (once per week by default)
* osd: new 'lock' rados class for generic object locking
* osd: optional 'min' pg size
* osd: recovery reservations
* osd: scrub efficiency improvement
* osd: several out of order reply bug fixes
* osd: several rare peering cases fixed
* osd: some performance improvements related to request queuing
* osd: use entire device if journal is a block device
* osd: use syncfs(2) when kernel supports it, even if glibc does not
* osd: various fixes for out-of-order op replies
* rados: ability to copy, rename pools
* rados: bench command now cleans up after itself
* rados: 'cppool' command to copy rados pools
* rados: 'rm' now accepts a list of objects to be removed
* radosgw: POST support
* radosgw: REST API for managing usage stats
* radosgw: fix bug in bucket stat updates
* radosgw: fix copy-object vs attributes
* radosgw: fix range header for large objects, ETag quoting, GMT dates, other compatibility fixes
* radosgw: improved garbage collection framework
* radosgw: many small fixes, cleanups
* radosgw: openstack keystone integration
* radosgw: stripe large (non-multipart) objects
* radosgw: support for multi-object deletes
* radosgw: support for swift manifest objects
* radosgw: vanity bucket dns names
* radosgw: various API compatibility fixes
* rbd: import from stdin, export to stdout
* rbd: new 'ls -l' option to view images with metadata
* rbd: use generic id and keyring options for 'rbd map'
* rbd: don't issue usage on errors
* udev: fix symlink creation for rbd images containing partitions
* upstart: job files for all daemon types (not enabled by default)
* wireshark: ceph protocol dissector patch updated


v0.54
=====

Upgrading
---------

* The osd capabilities associated with a rados user have changed
  syntax since 0.48 argonaut. The new format is mostly backwards
  compatible, but there are two backwards-incompatible changes:

  * specifying a list of pools in one grant, i.e.
    'allow r pool=foo,bar' is now done in separate grants, i.e.
    'allow r pool=foo, allow r pool=bar'.

  * restricting pool access by pool owner ('allow r uid=foo') is
    removed. This feature was not very useful and unused in practice.

  The new format is documented in the ceph-authtool man page.

* Bug fixes to the new osd capability format parsing properly validate
  the allowed operations. If an existing rados user gets permissions
  errors after upgrading, its capabilities were probably
  misconfigured. See the ceph-authtool man page for details on osd
  capabilities.

* 'rbd lock list' and 'rbd showmapped' no longer use tabs as
  separators in their output.