.. _cephfs-disaster-recovery:

Disaster recovery
=================

Metadata damage and repair
--------------------------

If a file system has inconsistent or missing metadata, it is considered
*damaged*. You may find out about damage from a health message, or in some
unfortunate cases from an assertion in a running MDS daemon.

Metadata damage can result either from data loss in the underlying RADOS
layer (e.g. multiple disk failures that lose all copies of a PG), or from
software bugs.

CephFS includes some tools that may be able to recover a damaged file system,
but using them safely requires a solid understanding of CephFS internals.
The documentation for these potentially dangerous operations is on a
separate page: :ref:`disaster-recovery-experts`.

Data pool damage (files affected by lost data PGs)
--------------------------------------------------

If a PG is lost in a *data* pool, then the file system will continue
to operate normally, but some parts of some files will simply
be missing (reads will return zeros).

Losing a data PG may affect many files. Files are split into many objects,
so identifying which files are affected by loss of particular PGs requires
a full scan over all object IDs that may exist within the size of a file.
This type of scan may be useful for identifying which files require
restoring from a backup.

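To make the idea of scanning "all object IDs that may exist within the size
of a file" more concrete, here is a minimal Python sketch. It is not the code
used by any Ceph tool. It assumes the usual CephFS object naming (inode number
in hexadecimal, a dot, then the object index as eight hexadecimal digits) and a
simple layout with a stripe count of 1 and 4 MiB objects. The function name
``candidate_object_names`` and the constant are hypothetical, introduced only
for illustration.

```python
from typing import List

# Assumption: 4 MiB objects and stripe_count == 1 (a simple, unstriped layout).
ASSUMED_OBJECT_SIZE = 4 * 1024 * 1024


def candidate_object_names(inode_number: int, file_size: int,
                           object_size: int = ASSUMED_OBJECT_SIZE) -> List[str]:
    """List the object names that may hold data for one file.

    With a simple layout, object i covers the byte range
    [i * object_size, (i + 1) * object_size), so a file of file_size
    bytes can touch at most ceil(file_size / object_size) objects.
    Sparse files may not actually have every one of these objects.
    """
    if file_size <= 0:
        return []
    # Integer ceiling division, avoiding floating point.
    object_count = (file_size + object_size - 1) // object_size
    # Assumed naming scheme: "<inode in hex>.<object index as 8 hex digits>".
    return [f"{inode_number:x}.{index:08x}" for index in range(object_count)]
```

Under these assumptions, a 10 MiB file could span up to three objects, and
every candidate name has to be checked against the lost PGs, which is why the
scan is expensive on large file systems.
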
.. danger::

    The ``pg_files`` scan described below does not repair any metadata, so
    when restoring files in this case you must *remove* the damaged file and
    replace it in order to have a fresh inode. Do not overwrite damaged files
    in place.

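The remove-then-replace rule can be sketched in Python as follows. This is an
illustration only, not a Ceph-provided tool: the function name
``restore_from_backup`` and both path arguments are hypothetical, and the
sketch assumes the backup copy is reachable as an ordinary file.

```python
import os
import shutil


def restore_from_backup(damaged_path: str, backup_path: str) -> None:
    """Restore one file by deleting the damaged copy, then recreating it."""
    # Verify the backup exists before deleting anything.
    if not os.path.isfile(backup_path):
        raise FileNotFoundError(backup_path)
    # Remove the damaged file first, so its old inode is not reused by
    # an in-place overwrite.
    os.unlink(damaged_path)
    # Create a new file at the same path from the backup copy.
    shutil.copy2(backup_path, damaged_path)
```

The important design choice is the order of operations: opening the damaged
file for writing would keep the existing inode, which is exactly what the
warning above says to avoid.
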
If you know that objects have been lost from PGs, use the ``pg_files``
subcommand to scan for files that may have been damaged as a result:

::

    cephfs-data-scan pg_files <path> <pg id> [<pg id>...]

For example, if you have lost data from PGs 1.4 and 4.5, and you would like
to know which files under /home/bob might have been damaged:

::

    cephfs-data-scan pg_files /home/bob 1.4 4.5

The output will be a list of paths to potentially damaged files, one
per line.

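Because the output is one path per line, it is easy to post-process. The
following Python sketch groups the reported paths by parent directory, which
may help when planning a restore from backup. The function name
``group_by_directory`` is hypothetical, and the sketch assumes the command
output has already been captured as a string and that no path contains a
newline.

```python
import posixpath
from collections import defaultdict
from typing import Dict, List


def group_by_directory(pg_files_output: str) -> Dict[str, List[str]]:
    """Group potentially damaged file paths by their parent directory."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in pg_files_output.splitlines():
        if not path:
            continue  # skip empty lines
        groups[posixpath.dirname(path)].append(path)
    return dict(groups)
```

For instance, feeding it made-up output containing ``/home/bob/a.txt`` and
``/home/bob/docs/b.txt`` yields one entry for ``/home/bob`` and one for
``/home/bob/docs``.
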
Note that this command acts as a normal CephFS client to find all the
files in the file system and read their layouts, so the MDS must be
up and running.