.. _mds-scrub:

======================
Ceph File System Scrub
======================

CephFS provides a set of scrub commands that let the cluster administrator
(operator) check the consistency of a file system. Scrub can be classified into
two parts:

#. Forward Scrub: The scrub operation starts at the root of the file system
   (or a subdirectory) and looks at everything that can be touched in the hierarchy
   to ensure consistency.

#. Backward Scrub: The scrub operation looks at every RADOS object in the
   file system pools and maps it back to the file system hierarchy.

This document details the commands to initiate and control forward scrub
(referred to as scrub hereafter).

.. warning::

   CephFS forward scrubs are started and manipulated on rank 0. All scrub
   commands must be directed at rank 0.

Initiate File System Scrub
==========================

To start a scrub operation for a directory tree, use the following command::

   ceph tell mds.<fsname>:0 scrub start <path> [scrubopts] [tag]

where ``scrubopts`` is a comma-delimited list of ``recursive``, ``force``, or
``repair``, and ``tag`` is an optional custom string tag (the default is a generated
UUID). An example command is::

   ceph tell mds.cephfs:0 scrub start / recursive
   {
       "return_code": 0,
       "scrub_tag": "6f0d204c-6cfd-4300-9e02-73f382fd23c1",
       "mode": "asynchronous"
   }

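The command syntax above can be assembled and its reply parsed from a script.
The following is a minimal Python sketch, not part of Ceph: the helper names
``scrub_start_argv`` and ``parse_scrub_start_reply`` are hypothetical, and the
reply fields are assumed to match the example output shown above. Rejecting a
``tag`` given without ``scrubopts`` is a conservative choice of this sketch
(both are positional), not documented behaviour.

```python
import json
from typing import List, Optional, Sequence, Tuple

# Options named in this document for ``scrubopts``.
VALID_SCRUBOPTS = {"recursive", "force", "repair"}


def scrub_start_argv(fsname: str, path: str,
                     scrubopts: Sequence[str] = (),
                     tag: Optional[str] = None) -> List[str]:
    """Build the argv for ``ceph tell mds.<fsname>:0 scrub start ...``."""
    unknown = set(scrubopts) - VALID_SCRUBOPTS
    if unknown:
        raise ValueError(f"unknown scrub options: {sorted(unknown)}")
    if tag is not None and not scrubopts:
        # Assumption of this sketch: avoid the positional ambiguity entirely.
        raise ValueError("this sketch requires scrubopts when a tag is given")
    # Scrub commands are always directed at rank 0.
    argv = ["ceph", "tell", f"mds.{fsname}:0", "scrub", "start", path]
    if scrubopts:
        argv.append(",".join(scrubopts))
    if tag is not None:
        argv.append(tag)
    return argv


def parse_scrub_start_reply(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (scrub_tag, mode) from the JSON reply, raising on failure."""
    reply = json.loads(text)
    if reply.get("return_code") != 0:
        raise RuntimeError(f"scrub start failed: {reply}")
    return reply.get("scrub_tag"), reply.get("mode")
```

The returned list can be passed to ``subprocess.run(argv, capture_output=True,
text=True)`` on a host with a working ``ceph`` CLI, and the captured standard
output handed to ``parse_scrub_start_reply``.
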
Recursive scrub is asynchronous (as hinted by ``mode`` in the output above).
Asynchronous scrubs must be polled using ``scrub status`` to determine the
status.

The scrub tag is used to differentiate scrubs and also to mark each inode's
first data object in the default data pool (where the backtrace information is
stored) with a ``scrub_tag`` extended attribute with the value of the tag. You
can verify an inode was scrubbed by looking at the extended attribute using the
RADOS utilities.

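As an illustration of that check, the sketch below builds a ``rados getxattr``
invocation for an inode's first data object. It rests on assumptions not stated
in this document: that CephFS names data objects ``<inode number in
hex>.<8-digit hex object index>``, so the first object ends in ``.00000000``,
and that the default data pool is called ``cephfs_data`` (a placeholder name).
The helper names are hypothetical.

```python
from typing import List


def first_data_object(ino: int) -> str:
    """Name of an inode's first data object (assumed naming scheme)."""
    # Assumption: "<inode in hex>.<object index as 8 hex digits>".
    return f"{ino:x}.00000000"


def scrub_tag_getxattr_argv(pool: str, ino: int) -> List[str]:
    """Build the argv that reads the ``scrub_tag`` xattr with ``rados``."""
    return ["rados", "-p", pool, "getxattr",
            first_data_object(ino), "scrub_tag"]


# Example with a placeholder pool name and inode number.
argv = scrub_tag_getxattr_argv("cephfs_data", 0x10000000000)
```

If the assumptions hold, running the resulting command prints the tag of the
scrub that last marked the inode, which can be compared with the ``scrub_tag``
returned by ``scrub start``.
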
Scrubs work for multiple active MDS (multiple ranks). The scrub is managed by
rank 0 and distributed across MDS as appropriate.


Monitor (ongoing) File System Scrubs
====================================

The status of ongoing scrubs can be monitored and polled using the ``scrub status``
command. This command lists the ongoing scrubs (identified by their tags) along
with the path and options used to initiate each scrub::

   ceph tell mds.cephfs:0 scrub status
   {
       "status": "scrub active (85 inodes in the stack)",
       "scrubs": {
           "6f0d204c-6cfd-4300-9e02-73f382fd23c1": {
               "path": "/",
               "options": "recursive"
           }
       }
   }

``status`` shows the number of inodes that are scheduled to be scrubbed at any point
in time and can therefore change on subsequent ``scrub status`` invocations. A
high-level summary of the scrub operation (which includes the operation state and the
paths on which scrub is triggered) is also displayed in ``ceph status``::

   ceph status
   [...]

   task status:
     scrub status:
         mds.0: active [paths:/]

   [...]

A scrub is complete when it no longer shows up in this list (although that may
change in future releases). Any damage will be reported via cluster health warnings.

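The completion rule above lends itself to a simple polling loop. The following
Python sketch is illustrative only: ``wait_for_scrub`` is a hypothetical helper,
and ``get_status`` is assumed to be a caller-supplied function that runs
``scrub status`` and returns the decoded JSON as a dictionary shaped like the
example output above.

```python
import time
from typing import Callable, Dict


def wait_for_scrub(tag: str,
                   get_status: Callable[[], Dict],
                   interval: float = 5.0,
                   timeout: float = 3600.0,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll until ``tag`` is absent from the ``scrubs`` map.

    Returns True once the tag is no longer listed, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        # A scrub is treated as complete when its tag is no longer listed.
        if tag not in status.get("scrubs", {}):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A tag that was never started is also reported as complete, so callers should
only wait on a tag returned by a successful ``scrub start``. Because the rule
may change in future releases, the loop should be revisited when upgrading.
Damage is reported separately through cluster health warnings and is not
detected by this loop.
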
Control (ongoing) File System Scrubs
====================================

- Pause: Pausing ongoing scrub operations stops any new or pending inodes from being
  scrubbed once the in-flight RADOS ops (for the inodes that are currently being
  scrubbed) finish::

     ceph tell mds.cephfs:0 scrub pause
     {
         "return_code": 0
     }

  The ``scrub status`` after pausing reflects the paused state. At this point,
  initiating new scrub operations (via ``scrub start``) would just queue the
  inode for scrub::

     ceph tell mds.cephfs:0 scrub status
     {
         "status": "PAUSED (66 inodes in the stack)",
         "scrubs": {
             "6f0d204c-6cfd-4300-9e02-73f382fd23c1": {
                 "path": "/",
                 "options": "recursive"
             }
         }
     }

- Resume: Resuming restarts a paused scrub operation::

     ceph tell mds.cephfs:0 scrub resume
     {
         "return_code": 0
     }

- Abort: Aborting ongoing scrub operations removes pending inodes from the scrub
  queue (thereby aborting the scrub) after in-flight RADOS ops (for the inodes that
  are currently being scrubbed) finish::

     ceph tell mds.cephfs:0 scrub abort
     {
         "return_code": 0
     }

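Pause and resume pair naturally when scrubbing has to be held back for the
duration of some other task. The sketch below wraps the two commands in a
Python context manager so that the scrub is resumed even if the wrapped work
fails. It is an illustration, not part of Ceph: ``scrub_paused`` is a
hypothetical helper, and ``run`` is assumed to be a caller-supplied function
that executes an argv list.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def scrub_paused(fsname: str,
                 run: Callable[[List[str]], None]) -> Iterator[None]:
    """Pause scrubbing on entry and resume it on exit."""
    # Scrub commands must be directed at rank 0.
    target = f"mds.{fsname}:0"
    run(["ceph", "tell", target, "scrub", "pause"])
    try:
        yield
    finally:
        # Resume even if the body raised, so the scrub is not left paused.
        run(["ceph", "tell", target, "scrub", "resume"])
```

For example, ``with scrub_paused("cephfs", lambda argv: subprocess.run(argv,
check=True)): ...`` pauses scrubbing while the body runs. In-flight RADOS ops
still finish after the pause command returns, so the body should not assume
that scrub I/O stops immediately.
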
Damages
=======

The types of damage that can be reported and repaired by File System Scrub are:

* DENTRY : Inode's dentry is missing.

* DIR_FRAG : Inode's directory fragment(s) are missing.

* BACKTRACE : Inode's backtrace in the data pool is corrupted.