.. _fs-volumes-and-subvolumes:

FS volumes and subvolumes
=========================

A single source of truth for CephFS exports is implemented in the volumes
module of the :term:`Ceph Manager` daemon (ceph-mgr). The OpenStack shared
file system service (manila_), the Ceph Container Storage Interface (CSI_),
and storage administrators, among others, can use the common CLI provided by
the ceph-mgr volumes module to manage CephFS exports.

The ceph-mgr volumes module implements the following file system export
abstractions:

* FS volumes, an abstraction for CephFS file systems

* FS subvolumes, an abstraction for independent CephFS directory trees

* FS subvolume groups, an abstraction for a directory level higher than FS
  subvolumes, used to apply policies (e.g., :doc:`/cephfs/file-layouts`)
  across a set of subvolumes

Some possible use cases for the export abstractions:

* FS subvolumes used as manila shares or CSI volumes

* FS subvolume groups used as manila share groups

Requirements
------------

* Nautilus (14.2.x) or a later version of Ceph

* A cephx client user (see :doc:`/rados/operations/user-management`) with
  the following minimum capabilities::

    mon 'allow r'
    mgr 'allow rw'

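A client user with these capabilities can be created with `ceph auth
get-or-create`; the client name `client.fs_admin` below is an arbitrary
example::

    $ ceph auth get-or-create client.fs_admin mon 'allow r' mgr 'allow rw'
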
FS Volumes
----------

Create a volume using::

    $ ceph fs volume create <vol_name> [<placement>]

This creates a CephFS file system and its data and metadata pools. It can also
try to create MDSes for the file system using the enabled ceph-mgr
orchestrator module (see :doc:`/mgr/orchestrator`), e.g., rook.

<vol_name> is the volume name (an arbitrary string).

<placement> is an optional string that designates the hosts that should have
an NFS Ganesha daemon container running on them and, optionally, the total
number of NFS Ganesha daemons in the cluster (should you want to have more
than one NFS Ganesha daemon running per node). For example, the following
placement string means "deploy NFS Ganesha daemons on nodes host1 and host2
(one daemon per host)"::

    "host1,host2"

and this placement specification says to deploy two NFS Ganesha daemons on
each of the nodes host1 and host2 (for a total of four NFS Ganesha daemons in
the cluster)::

    "4 host1,host2"

For more details on placement specifications refer to the `orchestrator doc
<https://docs.ceph.com/docs/master/mgr/orchestrator/#placement-specification>`_,
but keep in mind that specifying the placement via a YAML file is not supported.
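For example, assuming two hosts named `host1` and `host2`, a volume with two
NFS Ganesha daemons on each host could be created like so (the volume name
`vol_a` is an arbitrary example)::

    $ ceph fs volume create vol_a "4 host1,host2"
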

Remove a volume using::

    $ ceph fs volume rm <vol_name> [--yes-i-really-mean-it]

This removes a file system and its data and metadata pools. It also tries to
remove MDSes using the enabled ceph-mgr orchestrator module.

List volumes using::

    $ ceph fs volume ls

FS Subvolume groups
-------------------

Create a subvolume group using::

    $ ceph fs subvolumegroup create <vol_name> <group_name> [--pool_layout <data_pool_name>] [--uid <uid>] [--gid <gid>] [--mode <octal_mode>]

The command succeeds even if the subvolume group already exists.

When creating a subvolume group you can specify its data pool layout (see
:doc:`/cephfs/file-layouts`), uid, gid, and file mode in octal numerals. By
default, the subvolume group is created with an octal file mode of '755', a
uid of '0', a gid of '0', and the data pool layout of its parent directory.
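For example, a subvolume group with a group-writable mode and a non-default
data pool could be created like so (the names `vol_a`, `group_a` and
`cephfs_data_ssd` are arbitrary examples; the named pool must already be
attached to the file system)::

    $ ceph fs subvolumegroup create vol_a group_a --pool_layout cephfs_data_ssd --mode 775
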

Remove a subvolume group using::

    $ ceph fs subvolumegroup rm <vol_name> <group_name> [--force]

Removing a subvolume group fails if the group is not empty or does not exist.
The '--force' flag allows the remove command to succeed even if the subvolume
group does not exist.

Fetch the absolute path of a subvolume group using::

    $ ceph fs subvolumegroup getpath <vol_name> <group_name>

List subvolume groups using::

    $ ceph fs subvolumegroup ls <vol_name>

.. note:: The subvolume group snapshot feature is no longer supported in
   mainline CephFS (existing group snapshots can still be listed and deleted).

Remove a snapshot of a subvolume group using::

    $ ceph fs subvolumegroup snapshot rm <vol_name> <group_name> <snap_name> [--force]

Supplying the '--force' flag allows the command to succeed even if the
snapshot does not exist.

List snapshots of a subvolume group using::

    $ ceph fs subvolumegroup snapshot ls <vol_name> <group_name>


FS Subvolumes
-------------

Create a subvolume using::

    $ ceph fs subvolume create <vol_name> <subvol_name> [--size <size_in_bytes>] [--group_name <subvol_group_name>] [--pool_layout <data_pool_name>] [--uid <uid>] [--gid <gid>] [--mode <octal_mode>] [--namespace-isolated]

The command succeeds even if the subvolume already exists.

When creating a subvolume you can specify its subvolume group, data pool
layout, uid, gid, file mode in octal numerals, and size in bytes. The size of
the subvolume is specified by setting a quota on it (see
:doc:`/cephfs/quota`). The subvolume can be created in a separate RADOS
namespace by specifying the '--namespace-isolated' option. By default a
subvolume is created within the default subvolume group, with an octal file
mode of '755', the uid and gid of its subvolume group, the data pool layout of
its parent directory, and no size limit.
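For example, a subvolume with a 1 GiB quota and an isolated RADOS namespace,
placed in a non-default group, could be created like so (`vol_a`, `group_a`
and `subvol_a` are arbitrary example names; the group must already exist)::

    $ ceph fs subvolume create vol_a subvol_a --group_name group_a --size 1073741824 --namespace-isolated
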

Remove a subvolume using::

    $ ceph fs subvolume rm <vol_name> <subvol_name> [--group_name <subvol_group_name>] [--force] [--retain-snapshots]

The command removes the subvolume and its contents. It does this in two steps.
First, it moves the subvolume to a trash folder, and then asynchronously
purges its contents.

The removal of a subvolume fails if it has snapshots or does not exist. The
'--force' flag allows the remove command to succeed even if the subvolume does
not exist.

A subvolume can be removed while retaining its existing snapshots by using the
'--retain-snapshots' option. If snapshots are retained, the subvolume is
considered empty for all operations not involving the retained snapshots.

.. note:: Snapshot-retained subvolumes can be recreated using 'ceph fs subvolume create'.

.. note:: Retained snapshots can be used as a clone source to recreate the subvolume, or to clone to a new subvolume.

Resize a subvolume using::

    $ ceph fs subvolume resize <vol_name> <subvol_name> <new_size> [--group_name <subvol_group_name>] [--no_shrink]

The command resizes the subvolume quota using the size specified by
'new_size'. The '--no_shrink' flag prevents the subvolume from shrinking below
the currently used size of the subvolume.

The subvolume can be resized to an infinite size by passing 'inf' or
'infinite' as the new_size.
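For example, the hypothetical subvolume `subvol_a` could be grown to 2 GiB
while being prevented from shrinking, and later have its size limit removed,
like so::

    $ ceph fs subvolume resize vol_a subvol_a 2147483648 --no_shrink
    $ ceph fs subvolume resize vol_a subvol_a inf
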

Authorize cephx auth IDs with read or read-write access to FS subvolumes::

    $ ceph fs subvolume authorize <vol_name> <sub_name> <auth_id> [--group_name=<group_name>] [--access_level=<access_level>]

The 'access_level' option takes 'r' or 'rw' as a value.
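For example, read-only access to the hypothetical subvolume `subvol_a` could
be granted to the auth ID `guest` like so::

    $ ceph fs subvolume authorize vol_a subvol_a guest --access_level=r
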

Deauthorize cephx auth IDs, revoking their read or read-write access to FS subvolumes::

    $ ceph fs subvolume deauthorize <vol_name> <sub_name> <auth_id> [--group_name=<group_name>]

List the cephx auth IDs authorized to access an FS subvolume::

    $ ceph fs subvolume authorized_list <vol_name> <sub_name> [--group_name=<group_name>]

Evict FS clients based on the auth ID and the subvolume mounted::

    $ ceph fs subvolume evict <vol_name> <sub_name> <auth_id> [--group_name=<group_name>]

Fetch the absolute path of a subvolume using::

    $ ceph fs subvolume getpath <vol_name> <subvol_name> [--group_name <subvol_group_name>]

Fetch the metadata of a subvolume using::

    $ ceph fs subvolume info <vol_name> <subvol_name> [--group_name <subvol_group_name>]

The output format is JSON and contains the following fields.

* atime: access time of the subvolume path in the format "YYYY-MM-DD HH:MM:SS"
* mtime: modification time of the subvolume path in the format "YYYY-MM-DD HH:MM:SS"
* ctime: change time of the subvolume path in the format "YYYY-MM-DD HH:MM:SS"
* uid: uid of the subvolume path
* gid: gid of the subvolume path
* mode: mode of the subvolume path
* mon_addrs: list of monitor addresses
* bytes_pcent: quota used in percentage if quota is set, else displays "undefined"
* bytes_quota: quota size in bytes if quota is set, else displays "infinite"
* bytes_used: current used size of the subvolume in bytes
* created_at: creation time of the subvolume in the format "YYYY-MM-DD HH:MM:SS"
* data_pool: data pool to which the subvolume belongs
* path: absolute path of the subvolume
* type: subvolume type, indicating whether it is a clone or a subvolume
* pool_namespace: RADOS namespace of the subvolume
* features: features supported by the subvolume
* state: current state of the subvolume

If a subvolume has been removed with its snapshots retained, the output
contains only the following fields.

* type: subvolume type, indicating whether it is a clone or a subvolume
* features: features supported by the subvolume
* state: current state of the subvolume

The subvolume "features" are based on the internal version of the subvolume,
and are a list containing a subset of the following:

* "snapshot-clone": supports cloning using a subvolume's snapshot as the source
* "snapshot-autoprotect": supports automatically protecting snapshots that are active clone sources from deletion
* "snapshot-retention": supports removing subvolume contents while retaining any existing snapshots

The subvolume "state" is based on the current state of the subvolume and
contains one of the following values.

* "complete": subvolume is ready for all operations
* "snapshot-retained": subvolume is removed but its snapshots are retained

List subvolumes using::

    $ ceph fs subvolume ls <vol_name> [--group_name <subvol_group_name>]

.. note:: Subvolumes that have been removed but have retained snapshots are also listed.

Create a snapshot of a subvolume using::

    $ ceph fs subvolume snapshot create <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>]

Remove a snapshot of a subvolume using::

    $ ceph fs subvolume snapshot rm <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>] [--force]

Supplying the '--force' flag allows the command to succeed even if the
snapshot does not exist.

.. note:: If the last snapshot within a snapshot-retained subvolume is removed, the subvolume is also removed.

List snapshots of a subvolume using::

    $ ceph fs subvolume snapshot ls <vol_name> <subvol_name> [--group_name <subvol_group_name>]

Fetch the metadata of a snapshot using::

    $ ceph fs subvolume snapshot info <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>]

The output format is JSON and contains the following fields.

* created_at: creation time of the snapshot in the format "YYYY-MM-DD HH:MM:SS:ffffff"
* data_pool: data pool to which the snapshot belongs
* has_pending_clones: "yes" if a snapshot clone is in progress, otherwise "no"
* size: snapshot size in bytes

Cloning Snapshots
-----------------

Subvolumes can be created by cloning subvolume snapshots. Cloning is an
asynchronous operation that copies data from a snapshot to a subvolume.
Because of this bulk copying, cloning is currently inefficient for very large
data sets.

.. note:: Removing a snapshot fails if there are pending or in-progress clone operations that use it as a source.

Protecting snapshots prior to cloning was a prerequisite in the Nautilus
release, and the commands to protect/unprotect snapshots were introduced for
this purpose. This prerequisite, and hence the commands to protect/unprotect,
is being deprecated in mainline CephFS and may be removed from a future
release.

The commands being deprecated are::

    $ ceph fs subvolume snapshot protect <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>]
    $ ceph fs subvolume snapshot unprotect <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>]

.. note:: Using the above commands does not result in an error, but they serve no useful purpose.

.. note:: Use the 'subvolume info' command to fetch subvolume metadata regarding supported "features", to help decide whether protecting/unprotecting snapshots is required, based on the availability of the "snapshot-autoprotect" feature.

To initiate a clone operation use::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name>

If the snapshot's source subvolume is part of a non-default group, the group
name must be specified::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --group_name <subvol_group_name>

Cloned subvolumes can be part of a different group than the source snapshot
(by default, cloned subvolumes are created in the default group). To clone to
a particular group use::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --target_group_name <subvol_group_name>

Similar to specifying a pool layout when creating a subvolume, a pool layout
can be specified when creating a cloned subvolume. To create a cloned
subvolume with a specific pool layout use::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --pool_layout <pool_layout>

Configure the maximum number of concurrent clone operations (the default is 4)::

    $ ceph config set mgr mgr/volumes/max_concurrent_clones <value>

To check the status of a clone operation use::

    $ ceph fs clone status <vol_name> <clone_name> [--group_name <group_name>]

A clone can be in one of the following states:

#. `pending` : Clone operation has not started
#. `in-progress` : Clone operation is in progress
#. `complete` : Clone operation has successfully finished
#. `failed` : Clone operation has failed

Sample output from an `in-progress` clone operation::

    $ ceph fs subvolume snapshot clone cephfs subvol1 snap1 clone1
    $ ceph fs clone status cephfs clone1
    {
      "status": {
        "state": "in-progress",
        "source": {
          "volume": "cephfs",
          "subvolume": "subvol1",
          "snapshot": "snap1"
        }
      }
    }

.. note:: Since `subvol1` is in the default group, the `source` section in `clone status` does not include the group name.

.. note:: Cloned subvolumes are accessible only after the clone operation has successfully completed.

After a successful clone operation, `clone status` looks like this::

    $ ceph fs clone status cephfs clone1
    {
      "status": {
        "state": "complete"
      }
    }

If a clone operation is unsuccessful, the state is `failed`.

When a clone operation fails, the partial clone must be deleted and the clone
operation must be retriggered. To delete a partial clone use::

    $ ceph fs subvolume rm <vol_name> <clone_name> [--group_name <group_name>] --force

.. note:: Cloning synchronizes only directories, regular files and symbolic links. Inode timestamps (access and
   modification times) are synchronized only up to second granularity.

An `in-progress` or `pending` clone operation can be canceled. To cancel a clone operation use the `clone cancel` command::

    $ ceph fs clone cancel <vol_name> <clone_name> [--group_name <group_name>]

On successful cancellation, the cloned subvolume is moved to the `canceled` state::

    $ ceph fs subvolume snapshot clone cephfs subvol1 snap1 clone1
    $ ceph fs clone cancel cephfs clone1
    $ ceph fs clone status cephfs clone1
    {
      "status": {
        "state": "canceled",
        "source": {
          "volume": "cephfs",
          "subvolume": "subvol1",
          "snapshot": "snap1"
        }
      }
    }

.. note:: A canceled clone can be deleted by supplying the '--force' option to the `fs subvolume rm` command.

.. _manila: https://github.com/openstack/manila
.. _CSI: https://github.com/ceph/ceph-csi