.. _fs-volumes-and-subvolumes:

FS volumes and subvolumes
=========================

A single source of truth for CephFS exports is implemented in the volumes
module of the :term:`Ceph Manager` daemon (ceph-mgr). The OpenStack shared
file system service (manila_), the Ceph Container Storage Interface (CSI_),
storage administrators, and others can use the common CLI provided by the
ceph-mgr volumes module to manage CephFS exports.

The ceph-mgr volumes module implements the following file system export
abstractions:

* FS volumes, an abstraction for CephFS file systems

* FS subvolumes, an abstraction for independent CephFS directory trees

* FS subvolume groups, an abstraction for a directory level higher than FS
  subvolumes to effect policies (e.g., :doc:`/cephfs/file-layouts`) across a
  set of subvolumes

Some possible use-cases for the export abstractions:

* FS subvolumes used as manila shares or CSI volumes

* FS subvolume groups used as manila share groups

Requirements
------------

* Nautilus (14.2.x) or a later version of Ceph

* Cephx client user (see :doc:`/rados/operations/user-management`) with
  the following minimum capabilities::

    mon 'allow r'
    mgr 'allow rw'

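For example, such a client could be created with ``ceph auth get-or-create``
(the client name ``client.fsadmin`` here is only an illustration)::

    $ ceph auth get-or-create client.fsadmin mon 'allow r' mgr 'allow rw'
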
FS Volumes
----------

Create a volume using::

    $ ceph fs volume create <vol_name> [<placement>]

This creates a CephFS file system and its data and metadata pools. It can also
try to create MDSes for the filesystem using the enabled ceph-mgr orchestrator
module (see :doc:`/mgr/orchestrator`), e.g. rook.

<vol_name> is the volume name (an arbitrary string), and

<placement> is an optional string signifying which hosts should have MDS daemons
running on them and, optionally, the total number of MDS daemons in the cluster
(should you want to have more than one MDS daemon running per host). For example,
the following placement string means "deploy MDS daemons on nodes host1 and
host2 (one daemon per host)"::

    "host1,host2"

and this placement specification says to deploy two MDS daemons on each of the
nodes host1 and host2 (for a total of four MDS daemons in the cluster)::

    "4 host1,host2"

For more details on placement specifications refer to
:ref:`orchestrator-cli-service-spec`, but keep in mind that specifying the
placement via a YAML file is not supported.
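
As an illustration, a volume named ``mycephfs`` (the volume and host names here
are hypothetical) could be created with four MDS daemons spread across two
hosts using::

    $ ceph fs volume create mycephfs "4 host1,host2"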

Remove a volume using::

    $ ceph fs volume rm <vol_name> [--yes-i-really-mean-it]

This removes a file system and its data and metadata pools. It also tries to
remove MDSes using the enabled ceph-mgr orchestrator module.

List volumes using::

    $ ceph fs volume ls

Rename a volume using::

    $ ceph fs volume rename <vol_name> <new_vol_name> [--yes-i-really-mean-it]

Renaming a volume can be an expensive operation. It does the following:

- renames the orchestrator-managed MDS service to match the <new_vol_name>.
  This involves launching an MDS service with <new_vol_name> and bringing down
  the MDS service with <vol_name>.
- renames the file system matching <vol_name> to <new_vol_name>
- changes the application tags on the data and metadata pools of the file system
  to <new_vol_name>
- renames the metadata and data pools of the file system.

The CephX IDs authorized to <vol_name> need to be reauthorized to <new_vol_name>. Any
on-going operations of the clients using these IDs may be disrupted. Mirroring is
expected to be disabled on the volume.


FS Subvolume groups
-------------------

Create a subvolume group using::

    $ ceph fs subvolumegroup create <vol_name> <group_name> [--pool_layout <data_pool_name>] [--uid <uid>] [--gid <gid>] [--mode <octal_mode>]

The command succeeds even if the subvolume group already exists.

When creating a subvolume group you can specify its data pool layout (see
:doc:`/cephfs/file-layouts`), uid, gid, and file mode in octal numerals. By default, the
subvolume group is created with an octal file mode '755', uid '0', gid '0' and the data pool
layout of its parent directory.

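For example, a group with a specific mode and data pool (the group name ``csi``
and pool name ``cephfs_data_ssd`` below are hypothetical; the pool must already
be attached to the volume) could be created with::

    $ ceph fs subvolumegroup create mycephfs csi --pool_layout cephfs_data_ssd --mode 777
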

Remove a subvolume group using::

    $ ceph fs subvolumegroup rm <vol_name> <group_name> [--force]

The removal of a subvolume group fails if it is not empty or does not exist.
The '--force' flag allows the remove command to succeed even if the subvolume
group does not exist.

Fetch the absolute path of a subvolume group using::

    $ ceph fs subvolumegroup getpath <vol_name> <group_name>

List subvolume groups using::

    $ ceph fs subvolumegroup ls <vol_name>

.. note:: The subvolume group snapshot feature is no longer supported in mainline CephFS (existing group
          snapshots can still be listed and deleted)

Remove a snapshot of a subvolume group using::

    $ ceph fs subvolumegroup snapshot rm <vol_name> <group_name> <snap_name> [--force]

Using the '--force' flag allows the command to succeed even if the snapshot
does not exist.

List snapshots of a subvolume group using::

    $ ceph fs subvolumegroup snapshot ls <vol_name> <group_name>

FS Subvolumes
-------------

Create a subvolume using::

    $ ceph fs subvolume create <vol_name> <subvol_name> [--size <size_in_bytes>] [--group_name <subvol_group_name>] [--pool_layout <data_pool_name>] [--uid <uid>] [--gid <gid>] [--mode <octal_mode>] [--namespace-isolated]

The command succeeds even if the subvolume already exists.

When creating a subvolume you can specify its subvolume group, data pool layout,
uid, gid, file mode in octal numerals, and size in bytes. The size of the subvolume is
specified by setting a quota on it (see :doc:`/cephfs/quota`). The subvolume can be
created in a separate RADOS namespace by specifying the --namespace-isolated option. By
default a subvolume is created within the default subvolume group, with an octal file
mode '755', the uid and gid of its subvolume group, the data pool layout of
its parent directory, and no size limit.

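For example, a 1 GiB subvolume could be created in the hypothetical ``csi``
group of a volume named ``mycephfs`` as follows (all names are illustrative)::

    $ ceph fs subvolume create mycephfs sub0 --size 1073741824 --group_name csi
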
Remove a subvolume using::

    $ ceph fs subvolume rm <vol_name> <subvol_name> [--group_name <subvol_group_name>] [--force] [--retain-snapshots]

The command removes the subvolume and its contents. It does this in two steps.
First, it moves the subvolume to a trash folder, and then asynchronously purges
its contents.

The removal of a subvolume fails if it has snapshots, or does not exist.
The '--force' flag allows the remove command to succeed even if the subvolume
does not exist.

A subvolume can be removed while retaining existing snapshots of the subvolume by using the
'--retain-snapshots' option. If snapshots are retained, the subvolume is considered
empty for all operations not involving the retained snapshots.

.. note:: Snapshot retained subvolumes can be recreated using 'ceph fs subvolume create'

.. note:: Retained snapshots can be used as a clone source to recreate the subvolume, or to clone to a new subvolume.

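For instance, a subvolume (the names below are hypothetical) could be removed
while keeping its snapshots with::

    $ ceph fs subvolume rm mycephfs sub0 --group_name csi --retain-snapshots
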
Resize a subvolume using::

    $ ceph fs subvolume resize <vol_name> <subvol_name> <new_size> [--group_name <subvol_group_name>] [--no_shrink]

The command resizes the subvolume quota using the size specified by 'new_size'.
The '--no_shrink' flag prevents the subvolume from shrinking below the currently used size.

The subvolume can be resized to an infinite size by passing 'inf' or 'infinite' as the new_size.

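For example, to grow the hypothetical subvolume ``sub0`` to 2 GiB while refusing
to shrink it below its used size::

    $ ceph fs subvolume resize mycephfs sub0 2147483648 --group_name csi --no_shrink
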
Authorize cephx auth IDs with read or read-write access to fs subvolumes::

    $ ceph fs subvolume authorize <vol_name> <sub_name> <auth_id> [--group_name=<group_name>] [--access_level=<access_level>]

The 'access_level' takes 'r' or 'rw' as value.

Deauthorize cephx auth IDs, revoking their read/read-write access to fs subvolumes::

    $ ceph fs subvolume deauthorize <vol_name> <sub_name> <auth_id> [--group_name=<group_name>]

List cephx auth IDs authorized to access a fs subvolume::

    $ ceph fs subvolume authorized_list <vol_name> <sub_name> [--group_name=<group_name>]

Evict fs clients based on the auth ID and the subvolume mounted::

    $ ceph fs subvolume evict <vol_name> <sub_name> <auth_id> [--group_name=<group_name>]

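As an illustration, the hypothetical auth ID ``alice`` could be granted
read-write access to ``sub0`` and later evicted with::

    $ ceph fs subvolume authorize mycephfs sub0 alice --group_name=csi --access_level=rw
    $ ceph fs subvolume evict mycephfs sub0 alice --group_name=csi
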
Fetch the absolute path of a subvolume using::

    $ ceph fs subvolume getpath <vol_name> <subvol_name> [--group_name <subvol_group_name>]

Fetch the metadata of a subvolume using::

    $ ceph fs subvolume info <vol_name> <subvol_name> [--group_name <subvol_group_name>]

The output format is JSON and contains the following fields.

* atime: access time of the subvolume path in the format "YYYY-MM-DD HH:MM:SS"
* mtime: modification time of the subvolume path in the format "YYYY-MM-DD HH:MM:SS"
* ctime: change time of the subvolume path in the format "YYYY-MM-DD HH:MM:SS"
* uid: uid of the subvolume path
* gid: gid of the subvolume path
* mode: mode of the subvolume path
* mon_addrs: list of monitor addresses
* bytes_pcent: quota used in percentage if quota is set, else displays "undefined"
* bytes_quota: quota size in bytes if quota is set, else displays "infinite"
* bytes_used: current used size of the subvolume in bytes
* created_at: creation time of the subvolume in the format "YYYY-MM-DD HH:MM:SS"
* data_pool: data pool the subvolume belongs to
* path: absolute path of the subvolume
* type: subvolume type, indicating whether it is a clone or a subvolume
* pool_namespace: RADOS namespace of the subvolume
* features: features supported by the subvolume
* state: current state of the subvolume

If a subvolume has been removed retaining its snapshots, the output contains only the following fields.

* type: subvolume type, indicating whether it is a clone or a subvolume
* features: features supported by the subvolume
* state: current state of the subvolume

The subvolume "features" are based on the internal version of the subvolume and are a list containing
a subset of the following features:

* "snapshot-clone": supports cloning using a subvolume's snapshot as the source
* "snapshot-autoprotect": supports automatically protecting snapshots, that are active clone sources, from deletion
* "snapshot-retention": supports removing subvolume contents, retaining any existing snapshots

The subvolume "state" is based on the current state of the subvolume and contains one of the following values:

* "complete": subvolume is ready for all operations
* "snapshot-retained": subvolume is removed but its snapshots are retained
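
As an example, the metadata of a subvolume might look roughly like the
following. The field values (and the path layout) shown here are invented for
illustration; the actual output depends on your cluster::

    $ ceph fs subvolume info mycephfs sub0
    {
        "atime": "2022-05-10 12:00:00",
        "bytes_pcent": "undefined",
        "bytes_quota": "infinite",
        "bytes_used": 0,
        "created_at": "2022-05-10 12:00:00",
        "ctime": "2022-05-10 12:00:00",
        "data_pool": "cephfs.mycephfs.data",
        "features": [
            "snapshot-clone",
            "snapshot-autoprotect",
            "snapshot-retention"
        ],
        "gid": 0,
        "mode": 16877,
        "mon_addrs": [
            "192.168.1.1:6789"
        ],
        "mtime": "2022-05-10 12:00:00",
        "path": "/volumes/_nogroup/sub0/66ffc187-2b11-4ee5-8a5f-0d44d1cd4e44",
        "pool_namespace": "",
        "state": "complete",
        "type": "subvolume",
        "uid": 0
    }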

List subvolumes using::

    $ ceph fs subvolume ls <vol_name> [--group_name <subvol_group_name>]

.. note:: Subvolumes that are removed but have retained snapshots are also listed.

Create a snapshot of a subvolume using::

    $ ceph fs subvolume snapshot create <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>]

Remove a snapshot of a subvolume using::

    $ ceph fs subvolume snapshot rm <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>] [--force]

Using the '--force' flag allows the command to succeed even if the snapshot
does not exist.

.. note:: If the last snapshot within a snapshot-retained subvolume is removed, the subvolume is also removed.

List snapshots of a subvolume using::

    $ ceph fs subvolume snapshot ls <vol_name> <subvol_name> [--group_name <subvol_group_name>]

Fetch the metadata of a snapshot using::

    $ ceph fs subvolume snapshot info <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>]

The output format is JSON and contains the following fields.

* created_at: creation time of the snapshot in the format "YYYY-MM-DD HH:MM:SS:ffffff"
* data_pool: data pool the snapshot belongs to
* has_pending_clones: "yes" if snapshot clone is in progress, otherwise "no"
* size: snapshot size in bytes

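An illustrative result (again, the values are invented) might look like::

    $ ceph fs subvolume snapshot info mycephfs sub0 snap0
    {
        "created_at": "2022-05-10 12:30:00:123456",
        "data_pool": "cephfs.mycephfs.data",
        "has_pending_clones": "no",
        "size": 4096
    }
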

Cloning Snapshots
-----------------

Subvolumes can be created by cloning subvolume snapshots. Cloning is an asynchronous operation involving copying
data from a snapshot to a subvolume. Due to this bulk copy nature, cloning is currently inefficient for very large
data sets.

.. note:: Removing a snapshot (source subvolume) fails if there are pending or in-progress clone operations.

Protecting snapshots prior to cloning was a prerequisite in the Nautilus release, and the commands to protect/unprotect
snapshots were introduced for this purpose. This prerequisite, and hence the commands to protect/unprotect, is being
deprecated in mainline CephFS, and may be removed from a future release.

The commands being deprecated are::

    $ ceph fs subvolume snapshot protect <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>]
    $ ceph fs subvolume snapshot unprotect <vol_name> <subvol_name> <snap_name> [--group_name <subvol_group_name>]

.. note:: Using the above commands does not result in an error, but they serve no useful function.

.. note:: Use the 'subvolume info' command to fetch subvolume metadata regarding supported "features" to help decide if protect/unprotect of snapshots is required, based on the "snapshot-autoprotect" feature availability.

To initiate a clone operation use::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name>

If a snapshot (source subvolume) is a part of a non-default group, the group name needs to be specified::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --group_name <subvol_group_name>

Cloned subvolumes can be a part of a different group than the source snapshot (by default, cloned subvolumes are created in the default group). To clone to a particular group use::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --target_group_name <subvol_group_name>

Similar to specifying a pool layout when creating a subvolume, a pool layout can be specified when creating a cloned subvolume. To create a cloned subvolume with a specific pool layout use::

    $ ceph fs subvolume snapshot clone <vol_name> <subvol_name> <snap_name> <target_subvol_name> --pool_layout <pool_layout>

Configure the maximum number of concurrent clone operations. The default is 4::

    $ ceph config set mgr mgr/volumes/max_concurrent_clones <value>

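For example, to allow up to eight clone operations to run concurrently::

    $ ceph config set mgr mgr/volumes/max_concurrent_clones 8
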
To check the status of a clone operation use::

    $ ceph fs clone status <vol_name> <clone_name> [--group_name <group_name>]

A clone can be in one of the following states:

#. `pending` : Clone operation has not started
#. `in-progress` : Clone operation is in progress
#. `complete` : Clone operation has successfully finished
#. `failed` : Clone operation has failed

Sample output from an `in-progress` clone operation::

    $ ceph fs subvolume snapshot clone cephfs subvol1 snap1 clone1
    $ ceph fs clone status cephfs clone1
    {
      "status": {
        "state": "in-progress",
        "source": {
          "volume": "cephfs",
          "subvolume": "subvol1",
          "snapshot": "snap1"
        }
      }
    }

.. note:: Since `subvol1` is in the default group, the `source` section in the `clone status` output does not include the group name.

.. note:: Cloned subvolumes are accessible only after the clone operation has successfully completed.

For a successful clone operation, `clone status` looks like the following::

    $ ceph fs clone status cephfs clone1
    {
      "status": {
        "state": "complete"
      }
    }

or reports the `failed` state when the clone is unsuccessful.

On failure of a clone operation, the partial clone needs to be deleted and the clone operation needs to be retriggered.
To delete a partial clone use::

    $ ceph fs subvolume rm <vol_name> <clone_name> [--group_name <group_name>] --force

.. note:: Cloning only synchronizes directories, regular files and symbolic links. Also, inode timestamps (access and
          modification times) are synchronized with a granularity of one second.

An `in-progress` or a `pending` clone operation can be canceled. To cancel a clone operation use the `clone cancel` command::

    $ ceph fs clone cancel <vol_name> <clone_name> [--group_name <group_name>]

On successful cancellation, the cloned subvolume is moved to the `canceled` state::

    $ ceph fs subvolume snapshot clone cephfs subvol1 snap1 clone1
    $ ceph fs clone cancel cephfs clone1
    $ ceph fs clone status cephfs clone1
    {
      "status": {
        "state": "canceled",
        "source": {
          "volume": "cephfs",
          "subvolume": "subvol1",
          "snapshot": "snap1"
        }
      }
    }

.. note:: The canceled clone can be deleted by using the --force option with the `fs subvolume rm` command.


.. _subvol-pinning:

Pinning Subvolumes and Subvolume Groups
---------------------------------------

Subvolumes and subvolume groups can be automatically pinned to ranks according
to policies. This can help distribute load across MDS ranks in predictable and
stable ways. Review :ref:`cephfs-pinning` and :ref:`cephfs-ephemeral-pinning`
for details on how pinning works.

Pinning is configured by::

    $ ceph fs subvolumegroup pin <vol_name> <group_name> <pin_type> <pin_setting>

or for subvolumes::

    $ ceph fs subvolume pin <vol_name> <subvol_name> <pin_type> <pin_setting>

Typically you will want to set subvolume group pins. The ``pin_type`` may be
one of ``export``, ``distributed``, or ``random``. The ``pin_setting``
corresponds to the extended attribute "value" as described in the pinning
documentation referenced above.

So, for example, setting a distributed pinning strategy on a subvolume group::

    $ ceph fs subvolumegroup pin cephfilesystem-a csi distributed 1

will enable the distributed subtree partitioning policy for the "csi" subvolume
group. This will cause every subvolume within the group to be automatically
pinned to one of the available ranks on the file system.
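
Similarly, an export pin could be set on an individual subvolume, pinning its
subtree to a specific rank (the names and rank below are hypothetical)::

    $ ceph fs subvolume pin cephfilesystem-a sub0 export 0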

.. _manila: https://github.com/openstack/manila
.. _CSI: https://github.com/ceph/ceph-csi