Handling a full Ceph filesystem
===============================

When a RADOS cluster reaches its ``mon_osd_full_ratio`` (default
95%) capacity, it is marked with the OSD full flag. This flag causes
most normal RADOS clients to pause all operations until it is resolved
(for example by adding more capacity to the cluster).

The filesystem has some special handling of the full flag, explained below.

Hammer and later
----------------

Since the hammer release, a full filesystem will lead to ENOSPC
results from:

* Data writes on the client
* Metadata operations other than deletes and truncates
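Because deletes and truncates remain permitted when the cluster is full, they are the tools available for reclaiming space. The following Python sketch illustrates this (the function name and fallback logic are illustrative, not part of any Ceph API):

```python
import os

def reclaim_space(paths):
    """Attempt to free space on a full CephFS.

    Deletes and truncates are the metadata operations that CephFS
    still permits when the cluster is full, so both are usable here.
    """
    for p in paths:
        try:
            os.unlink(p)       # delete: allowed even when full
        except OSError:
            os.truncate(p, 0)  # truncate: also allowed when full
```

In contrast, attempting to *write* to these paths in the same situation would fail with ENOSPC.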

Because the full condition may not be encountered until data is
flushed to disk (some time after a ``write`` call has already returned
success), the ENOSPC error may not be seen until the application calls
``fsync`` or ``fclose`` (or equivalent) on the file handle.

Calling ``fsync`` reliably indicates whether the data made it to disk:
it will return an error if it did not. ``fclose`` will only return an
error if buffered data happened to be flushed since the last write --
a successful ``fclose`` does not guarantee that the data made it to
disk, and in a full-space situation, buffered data may be discarded
after an ``fclose`` if no space is available to persist it.

.. warning::
    If an application appears to be misbehaving on a full filesystem,
    check that it is performing ``fsync()`` calls as necessary to ensure
    data is on disk before proceeding.

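A minimal sketch of the pattern the warning describes, in Python (the function name is illustrative, not part of any Ceph API):

```python
import os

def write_durably(path: str, data: bytes) -> None:
    """Write data and confirm it reached disk, surfacing ENOSPC early.

    On a full CephFS, os.write() may appear to succeed because the data
    is only buffered; the error is reported when the data is flushed,
    so we call os.fsync() before trusting that the write persisted.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        os.write(fd, data)
        # fsync reliably reports whether the data made it to disk; on a
        # full filesystem this raises OSError with errno set to ENOSPC.
        os.fsync(fd)
    finally:
        # A successful close does NOT by itself guarantee durability.
        os.close(fd)
```

If ``os.fsync`` raises ``OSError`` with ``errno.ENOSPC``, the data was not persisted, and the application should free space before retrying.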
Data writes may be cancelled by the client if they are in flight at the
time the OSD full flag is sent. Clients update the ``osd_epoch_barrier``
when releasing capabilities on files affected by cancelled operations, in
order to ensure that these cancelled operations do not interfere with
subsequent access to the data objects by the MDS or other clients. For
more on the epoch barrier mechanism, see :doc:`eviction`.

Legacy (pre-hammer) behavior
----------------------------

In versions of Ceph earlier than hammer, the MDS would ignore
the full status of the RADOS cluster, and any data writes from
clients would stall until the cluster ceased to be full.

There are two dangerous conditions to watch for with this behaviour:

* If a client had pending writes to a file, then it was not possible
  for the client to release the file to the MDS for deletion: this could
  lead to difficulty clearing space on a full filesystem.
* If clients continued to create a large number of empty files, the
  resulting metadata writes from the MDS could lead to total exhaustion
  of space on the OSDs, such that no further deletions could be performed.