Handling a full Ceph file system
================================

When a RADOS cluster reaches its ``mon_osd_full_ratio`` (default
95%) capacity, it is marked with the OSD full flag. This flag causes
most normal RADOS clients to pause all operations until it is resolved
(for example, by adding more capacity to the cluster).

The file system has some special handling of the full flag, explained below.

Hammer and later
----------------

Since the hammer release, a full file system will lead to ENOSPC
results from:

 * Data writes on the client
 * Metadata operations other than deletes and truncates

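Deletes and truncates are exempt precisely so that space can still be reclaimed
from a full file system. A minimal C sketch of that recovery path (the helper
name and the expendable-file idea are illustrative, not part of Ceph):

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Illustrative helper: if an operation failed with ENOSPC on a full
 * file system, fall back to removing an expendable file -- unlink()
 * is one of the metadata operations the MDS still permits, so it
 * succeeds even when writes and creates do not. */
static int reclaim_if_full(int saved_errno, const char *expendable_path)
{
    if (saved_errno != ENOSPC)
        return -1;  /* not a full-file-system condition */
    return unlink(expendable_path);
}
```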
Because the full condition may not be encountered until
data is flushed to disk (sometime after a ``write`` call has already
returned successfully), the ENOSPC error may not be seen until the
application calls ``fsync`` or ``fclose`` (or equivalent) on the file handle.

Calling ``fsync`` is guaranteed to reliably indicate whether the data
made it to disk, and will return an error if it doesn't. ``fclose`` will
only return an error if buffered data happened to be flushed since
the last write -- a successful ``fclose`` does not guarantee that the
data made it to disk, and in a full-space situation, buffered data
may be discarded after an ``fclose`` if no space is available to persist it.

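The ``fsync``-versus-``fclose`` distinction can be made concrete in C. A
minimal sketch, assuming an ordinary POSIX client (the helper name is
hypothetical; any file on the CephFS mount would behave the same):

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a buffer and then fsync() it, so that a full file system
 * surfaces as ENOSPC here rather than the data being silently
 * discarded later. Returns 0 only once the data is durable. */
static int write_durably(int fd, const void *buf, size_t len)
{
    /* write() may succeed even when the cluster is full: at this
     * point the data is only buffered on the client. */
    if (write(fd, buf, len) != (ssize_t)len)
        return -1;

    /* fsync() forces the flush, so this is where ENOSPC appears. */
    if (fsync(fd) != 0) {
        if (errno == ENOSPC)
            fprintf(stderr, "full file system: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

On a healthy cluster this returns 0; on a full one the ``fsync()`` step fails
with ENOSPC instead of the error being deferred to (or lost at) ``fclose``.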
.. warning::
    If an application appears to be misbehaving on a full file system,
    check that it is performing ``fsync()`` calls as necessary to ensure
    data is on disk before proceeding.

Data writes may be cancelled by the client if they are in flight at the
time the OSD full flag is sent. Clients update the ``osd_epoch_barrier``
when releasing capabilities on files affected by cancelled operations, in
order to ensure that these cancelled operations do not interfere with
subsequent access to the data objects by the MDS or other clients. For
more on the epoch barrier mechanism, see :ref:`background_blocklisting_and_osd_epoch_barrier`.

Legacy (pre-hammer) behavior
----------------------------

In versions of Ceph earlier than hammer, the MDS would ignore
the full status of the RADOS cluster, and any data writes from
clients would stall until the cluster ceased to be full.

There are two dangerous conditions to watch for with this behaviour:

* If a client had pending writes to a file, then it was not possible
  for the client to release the file to the MDS for deletion: this could
  lead to difficulty clearing space on a full file system.
* If clients continued to create a large number of empty files, the
  resulting metadata writes from the MDS could lead to total exhaustion
  of space on the OSDs, such that no further deletions could be performed.