Handling a full Ceph file system
================================

When a RADOS cluster reaches its ``mon_osd_full_ratio`` (default
95%) capacity, it is marked with the OSD full flag.  This flag causes
most normal RADOS clients to pause all operations until it is resolved
(for example by adding more capacity to the cluster).
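
As a rough illustration, the sketch below -- assuming the librados C API, a
reachable cluster, and a default ``ceph.conf`` -- reports overall cluster usage
against the default 95% ratio.  It is only an approximation: the flag is set by
the cluster itself, and an individual OSD can reach the full ratio before the
cluster-wide average does.

.. code-block:: c

    /* Sketch: report overall cluster usage via the librados C API.
     * Assumes client.admin credentials and the default ceph.conf search path. */
    #include <stdio.h>
    #include <rados/librados.h>

    int main(void)
    {
        rados_t cluster;
        struct rados_cluster_stat_t st;

        if (rados_create(&cluster, NULL) < 0)
            return 1;
        if (rados_conf_read_file(cluster, NULL) < 0 || rados_connect(cluster) < 0)
            return 1;

        if (rados_cluster_stat(cluster, &st) == 0) {
            double used = (double)st.kb_used / (double)st.kb;
            /* 0.95 mirrors the default mon_osd_full_ratio; the monitors
             * apply whatever ratio they are actually configured with. */
            printf("cluster is %.1f%% full%s\n", used * 100.0,
                   used >= 0.95 ? " (at or above the default full ratio)" : "");
        }

        rados_shutdown(cluster);
        return 0;
    }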

The file system has some special handling of the full flag, explained below.

Hammer and later
----------------

Since the hammer release, a full file system will lead to ENOSPC
results from:

* Data writes on the client
* Metadata operations other than deletes and truncates (see the sketch below)
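
This distinction matters in practice: while the flag is set, a client can
still reclaim space with deletes even though other metadata operations fail.
A minimal sketch of that behavior, using hypothetical paths on a CephFS mount:

.. code-block:: c

    /* Sketch: on a full file system, non-delete metadata operations are
     * expected to fail with ENOSPC, while deletes still succeed so that
     * space can be reclaimed.  Paths are hypothetical CephFS locations. */
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        /* Creating a directory is a metadata write, so it fails while
         * the cluster is full. */
        if (mkdir("/mnt/cephfs/newdir", 0755) < 0 && errno == ENOSPC)
            fprintf(stderr, "mkdir: %s\n", strerror(errno));

        /* Deletes remain allowed so that space can still be freed. */
        if (unlink("/mnt/cephfs/old-large-file") == 0)
            printf("unlink succeeded; space can be reclaimed\n");

        return 0;
    }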

Because the full condition may not be encountered until
data is flushed to disk (sometime after a ``write`` call has already
returned successfully), the ENOSPC error may not be seen until the
application calls ``fsync`` or ``fclose`` (or equivalent) on the file handle.

Calling ``fsync`` is guaranteed to reliably indicate whether the data
made it to disk, and will return an error if it did not.  ``fclose`` will
only return an error if buffered data happened to be flushed since
the last write -- a successful ``fclose`` does not guarantee that the
data made it to disk, and in a full-space situation, buffered data
may be discarded after an ``fclose`` if no space is available to persist it.

.. warning::
    If an application appears to be misbehaving on a full file system,
    check that it is performing ``fsync()`` calls as necessary to ensure
    data is on disk before proceeding.
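
For example, an application can surface the full condition explicitly by
checking the result of ``fsync`` before closing the file.  A minimal sketch,
assuming a POSIX environment and a hypothetical path on a CephFS mount:

.. code-block:: c

    /* Sketch: force buffered data out with fsync() so that ENOSPC is
     * reported reliably.  The path is a hypothetical CephFS location. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];
        memset(buf, 'x', sizeof(buf));

        int fd = open("/mnt/cephfs/out.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* write() may succeed even when the cluster is full, because the
         * data can still be sitting in the client's buffers at this point. */
        if (write(fd, buf, sizeof(buf)) < 0)
            perror("write");

        /* fsync() flushes the data to RADOS, so this is where a full
         * file system is reliably reported as ENOSPC. */
        if (fsync(fd) < 0 && errno == ENOSPC)
            fprintf(stderr, "file system is full: %s\n", strerror(errno));

        close(fd);
        return 0;
    }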

Data writes may be cancelled by the client if they are in flight at the
time the OSD full flag is sent.  Clients update the ``osd_epoch_barrier``
when releasing capabilities on files affected by cancelled operations, in
order to ensure that these cancelled operations do not interfere with
subsequent access to the data objects by the MDS or other clients.  For
more on the epoch barrier mechanism, see
:ref:`background_blocklisting_and_osd_epoch_barrier`.

Legacy (pre-hammer) behavior
----------------------------

In versions of Ceph earlier than hammer, the MDS would ignore
the full status of the RADOS cluster, and any data writes from
clients would stall until the cluster ceased to be full.

There are two dangerous conditions to watch for with this behavior:

* If a client had pending writes to a file, then it was not possible
  for the client to release the file to the MDS for deletion: this could
  lead to difficulty clearing space on a full file system.
* If clients continued to create a large number of empty files, the
  resulting metadata writes from the MDS could lead to total exhaustion
  of space on the OSDs such that no further deletions could be performed.