CephFS supports asynchronous replication of snapshots to a remote CephFS file system via
the `cephfs-mirror` tool. Snapshots are synchronized by mirroring snapshot data followed by
creating a snapshot with the same name (for a given directory on the remote file system) as
the snapshot being synchronized.
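
For context, a CephFS snapshot is taken by creating a directory under the special `.snap`
directory of the directory being snapshotted (the mount point and path below are
illustrative)::

  $ mkdir /mnt/cephfs/d0/d1/d2/.snap/snap1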

Requirements
------------

The primary (local) and secondary (remote) Ceph clusters must be version Pacific or later.

Key Idea
--------

For a given snapshot pair in a directory, the `cephfs-mirror` daemon relies on readdir diff
to identify changes in a directory tree. The diffs are applied to the directory in the remote
file system, thereby synchronizing only the files that have changed between the two snapshots.

This feature is tracked here: https://tracker.ceph.com/issues/47034.

Currently, snapshot data is synchronized by bulk copying to the remote file system.

.. note:: Synchronizing hardlinks is not supported -- hardlinked files get synchronized
          as regular files.

Creating Users
--------------

Start by creating a user (on the primary/local cluster) for the mirror daemon. This user
requires write capability on the metadata pool to create RADOS objects (index objects)
for watch/notify operation and read capability on the data pool(s)::

  $ ceph auth get-or-create client.mirror mon 'profile cephfs-mirror' mds 'allow r' osd 'allow rw tag cephfs metadata=*, allow r tag cephfs data=*' mgr 'allow r'

Create a user for each file system peer (on the secondary/remote cluster). This user needs
to have full capabilities on the MDS (to take snapshots) and the OSDs::

  $ ceph fs authorize <fs_name> client.mirror_remote / rwps

This user should be used (as part of peer specification) when adding a peer.

Starting Mirror Daemon
----------------------

Mirror daemon should be spawned using `systemctl(1)` unit files::

  $ systemctl enable cephfs-mirror@mirror
  $ systemctl start cephfs-mirror@mirror
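
To verify that the daemon came up, the unit can be queried with `systemctl(1)` (the unit
name follows the `cephfs-mirror@<user>` template used above)::

  $ systemctl status cephfs-mirror@mirror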

`cephfs-mirror` daemon can be run in foreground using::

  $ cephfs-mirror --id mirror --cluster site-a -f

.. note:: The user used here is `mirror`, as created in the `Creating Users` section.

Mirroring Design
----------------

CephFS supports asynchronous replication of snapshots to a remote CephFS file system
via the `cephfs-mirror` tool. For a given directory, snapshots are synchronized by transferring
snapshot data to the remote file system and creating a snapshot with the same name as the
snapshot being synchronized.

Snapshot Synchronization Order
------------------------------

Although the order in which snapshots get chosen for synchronization does not matter,
snapshots are picked based on creation order (using snap-id).
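
For example, a snapshot created earlier is synchronized first even if its name sorts later;
the paths and snapshot names below are illustrative::

  $ mkdir /mnt/cephfs/d0/d1/d2/.snap/zeta    # created first (lower snap-id), synchronized first
  $ mkdir /mnt/cephfs/d0/d1/d2/.snap/alpha   # created later, synchronized second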

Snapshot Incarnation
--------------------

A snapshot may be deleted and recreated (with the same name) with different contents.
An "old" snapshot could have been synchronized (earlier) and the recreation of the
snapshot could have been done when mirroring was disabled. Using snapshot names to
infer the point-of-continuation would result in the "new" snapshot (incarnation)
never getting picked up for synchronization.

Snapshots on the secondary file system store the snap-id of the snapshot they were
synchronized from. This metadata is stored in the `SnapInfo` structure on the MDS.

Interfaces
----------

`Mirroring` module (manager plugin) provides interfaces for managing directory snapshot
mirroring. Manager interfaces are (mostly) wrappers around monitor commands for managing
file system mirroring and are the recommended control interface.

Mirroring Module and Interface
------------------------------

Mirroring module provides interfaces for managing directory snapshot mirroring. The module
is implemented as a Ceph Manager plugin. The mirroring module does not manage spawning (and
terminating) the mirror daemons. Right now, the preferred way is to start/stop mirror
daemons via `systemctl(1)`. Going forward, deploying mirror daemons will be managed by
`cephadm` (Tracker: http://tracker.ceph.com/issues/47261).

The manager module is responsible for assigning directories to mirror daemons for
synchronization. Multiple mirror daemons can be spawned to achieve concurrency in
directory snapshot synchronization. When mirror daemons are spawned (or terminated),
the mirroring module discovers the modified set of mirror daemons and rebalances
the directory assignment amongst the new set, thus providing high-availability.

.. note:: Running multiple mirror daemons is currently untested. Only a single mirror
          daemon is recommended.

Mirroring module is disabled by default. To enable mirroring use::

  $ ceph mgr module enable mirroring

Mirroring module provides a family of commands to control mirroring of directory
snapshots. To add or remove directories, mirroring needs to be enabled for a given
file system. To enable mirroring use::

  $ ceph fs snapshot mirror enable <fs_name>

.. note:: Mirroring module commands use the `fs snapshot mirror` prefix as compared to
          the monitor commands, which use the `fs mirror` prefix. Make sure to use the
          module commands.

To disable mirroring, use::

  $ ceph fs snapshot mirror disable <fs_name>

Once mirroring is enabled, add a peer to which directory snapshots are to be mirrored.
Peers follow `<client>@<cluster>` specification and get assigned a unique-id (UUID)
when added. See `Creating Users` section on how to create Ceph users for mirroring.

To add a peer use::

  $ ceph fs snapshot mirror peer_add <fs_name> <remote_cluster_spec> [<remote_fs_name>] [<remote_mon_host>] [<cephx_key>]

`<remote_fs_name>` is optional, and defaults to `<fs_name>` (on the remote cluster).

This requires the remote cluster Ceph configuration and user keyring to be available in
the primary cluster. See `Bootstrap Peers` section to avoid this. `peer_add` additionally
supports passing the remote cluster monitor address and the user key. However, bootstrapping
a peer is the recommended way to add a peer.
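
For example, to mirror to a `backup_fs` file system in a remote cluster named
`site-remote` (the names here are illustrative and reuse the user from the
`Creating Users` section)::

  $ ceph fs snapshot mirror peer_add cephfs client.mirror_remote@site-remote backup_fs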

.. note:: Only a single peer is supported right now.

To remove a peer use::

  $ ceph fs snapshot mirror peer_remove <fs_name> <peer_uuid>

.. note:: See the `Mirror Daemon Status` section on how to figure out the peer UUID.

To list file system mirror peers use::

  $ ceph fs snapshot mirror peer_list <fs_name>
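
The output is a JSON object keyed by peer UUID; a representative (illustrative, not
verbatim) output for a single peer::

  $ ceph fs snapshot mirror peer_list cephfs
  {"a2dc7784-e7a1-4723-b103-03ee8d8768f8": {"client_name": "client.mirror_remote", "site_name": "site-remote", "fs_name": "backup_fs"}}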

To configure a directory for mirroring, use::

  $ ceph fs snapshot mirror add <fs_name> <path>

To stop mirroring directory snapshots use::

  $ ceph fs snapshot mirror remove <fs_name> <path>

Only absolute directory paths are allowed. Also, paths are normalized by the mirroring
module, therefore, `/a/b/../b` is equivalent to `/a/b`::

  $ ceph fs snapshot mirror add cephfs /d0/d1/d2
  $ ceph fs snapshot mirror add cephfs /d0/d1/../d1/d2
  Error EEXIST: directory /d0/d1/d2 is already tracked

Once a directory is added for mirroring, its subdirectories or ancestor directories are
disallowed from being added for mirroring::

  $ ceph fs snapshot mirror add cephfs /d0/d1
  Error EINVAL: /d0/d1 is a ancestor of tracked path /d0/d1/d2
  $ ceph fs snapshot mirror add cephfs /d0/d1/d2/d3
  Error EINVAL: /d0/d1/d2/d3 is a subtree of tracked path /d0/d1/d2

Commands to check directory mapping (to mirror daemons) and directory distribution are
detailed in the `Mirror Daemon Status` section.

Bootstrap Peers
---------------

Adding a peer (via `peer_add`) requires the peer cluster configuration and user keyring
to be available in the primary cluster (manager host and hosts running the mirror daemon).
This can be avoided by bootstrapping and importing a peer token. Peer bootstrap involves
creating a bootstrap token on the peer cluster via::

  $ ceph fs snapshot mirror peer_bootstrap create <fs_name> <client_entity> <site-name>

e.g.::

  $ ceph fs snapshot mirror peer_bootstrap create backup_fs client.mirror_remote site-remote
  {"token": "eyJmc2lkIjogIjBkZjE3MjE3LWRmY2QtNDAzMC05MDc5LTM2Nzk4NTVkNDJlZiIsICJmaWxlc3lzdGVtIjogImJhY2t1cF9mcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcGVlcl9ib290c3RyYXAiLCAic2l0ZV9uYW1lIjogInNpdGUtcmVtb3RlIiwgImtleSI6ICJBUUFhcDBCZ0xtRmpOeEFBVnNyZXozai9YYUV0T2UrbUJEZlJDZz09IiwgIm1vbl9ob3N0IjogIlt2MjoxOTIuMTY4LjAuNTo0MDkxOCx2MToxOTIuMTY4LjAuNTo0MDkxOV0ifQ=="}

`site-name` refers to a user-defined string to identify the remote file system. In the
context of the `peer_add` interface, `site-name` is the passed-in `cluster` name from
`remote_cluster_spec`.

Import the bootstrap token in the primary cluster via::

  $ ceph fs snapshot mirror peer_bootstrap import <fs_name> <token>

e.g.::

  $ ceph fs snapshot mirror peer_bootstrap import cephfs eyJmc2lkIjogIjBkZjE3MjE3LWRmY2QtNDAzMC05MDc5LTM2Nzk4NTVkNDJlZiIsICJmaWxlc3lzdGVtIjogImJhY2t1cF9mcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcGVlcl9ib290c3RyYXAiLCAic2l0ZV9uYW1lIjogInNpdGUtcmVtb3RlIiwgImtleSI6ICJBUUFhcDBCZ0xtRmpOeEFBVnNyZXozai9YYUV0T2UrbUJEZlJDZz09IiwgIm1vbl9ob3N0IjogIlt2MjoxOTIuMTY4LjAuNTo0MDkxOCx2MToxOTIuMTY4LjAuNTo0MDkxOV0ifQ==

Mirror Daemon Status
--------------------

Mirror daemons get asynchronously notified about changes in file system mirroring status
and/or peer updates.

CephFS mirroring module provides `mirror daemon status` interface to check mirror daemon
status::

  $ ceph fs snapshot mirror daemon status

E.g.::

  $ ceph fs snapshot mirror daemon status | jq
  [
    {
      ....
      "filesystems": [
        {
          ....
          "directory_count": 1,
          "peers": [
            {
              "uuid": "02117353-8cd1-44db-976b-eb20609aa160",
              "remote": {
                "client_name": "client.mirror_remote",
                "cluster_name": "ceph",
                "fs_name": "backup_fs"
              },
              ....
            }
          ]
        }
      ]
    }
  ]

An entry per mirror daemon instance is displayed along with information such as configured
peers and basic stats. For more detailed stats, use the admin socket interface as detailed
below.

CephFS mirror daemons provide admin socket commands for querying mirror status. To check
available commands for mirror status use::

  $ ceph --admin-daemon /path/to/mirror/daemon/admin/socket help
  {
      ....
      ....
      "fs mirror status cephfs@360": "get filesystem mirror status",
      ....
      ....
  }

Commands with `fs mirror status` prefix provide mirror status for mirror enabled
file systems. Note that `cephfs@360` is of format `filesystem-name@filesystem-id`.
This format is required since mirror daemons get asynchronously notified regarding
file system mirror status (a file system can be deleted and recreated with the same
name).

Right now, the command provides minimal information regarding mirror status::

  $ ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@360
  {
    "rados_inst": "192.168.0.5:0/1476644347",
    "peers": {
      "a2dc7784-e7a1-4723-b103-03ee8d8768f8": {
        "remote": {
          "client_name": "client.mirror_remote",
          "cluster_name": "site-a",
          "fs_name": "backup_fs"
        }
      }
    },
    "snap_dirs": {
      "dir_count": 1
    }
  }

`Peers` section in the command output above shows the peer information such as unique
peer-id (UUID) and specification. The peer-id is required to remove an existing peer
as mentioned in the `Mirroring Module and Interface` section.

Commands with `fs mirror peer status` prefix provide peer synchronization status. This
command is of format `filesystem-name@filesystem-id peer-uuid`::

  $ ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror peer status cephfs@360 a2dc7784-e7a1-4723-b103-03ee8d8768f8
  {
    "/d0": {
      "state": "idle",
      "last_synced_snap": {
        "id": 120,
        "name": "snap1",
        "sync_duration": 0.079997898999999997,
        "sync_time_stamp": "274900.558797s"
      },
      "snaps_synced": 2,
      "snaps_deleted": 0,
      "snaps_renamed": 0
    }
  }

Synchronization stats such as `snaps_synced`, `snaps_deleted` and `snaps_renamed` are reset
on daemon restart and/or when a directory is reassigned to another mirror daemon (when
multiple mirror daemons are deployed).

A directory can be in one of the following states::

  - `idle`: The directory is currently not being synchronized
  - `syncing`: The directory is currently being synchronized
  - `failed`: The directory has hit the upper limit of consecutive failures

When a directory hits a configured number of consecutive synchronization failures, the
mirror daemon marks it as `failed`. Synchronization for these directories is retried.
By default, the number of consecutive failures before a directory is marked as failed
is controlled by the `cephfs_mirror_max_consecutive_failures_per_directory` configuration
option (default: 10), and the retry interval for failed directories is controlled via the
`cephfs_mirror_retry_failed_directories_interval` configuration option (default: 60s).
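
These options can be tuned like any other Ceph configuration option; a minimal sketch,
assuming the mirror daemon runs as `client.mirror` (see the `Creating Users` section)::

  $ ceph config set client.mirror cephfs_mirror_max_consecutive_failures_per_directory 5
  $ ceph config set client.mirror cephfs_mirror_retry_failed_directories_interval 30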

E.g., adding a regular file for synchronization would result in failed status::

  $ ceph fs snapshot mirror add cephfs /f0
  $ ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror peer status cephfs@360 a2dc7784-e7a1-4723-b103-03ee8d8768f8
  {
    "/d0": {
      "state": "idle",
      "last_synced_snap": {
        "id": 120,
        "name": "snap1",
        "sync_duration": 0.079997898999999997,
        "sync_time_stamp": "274900.558797s"
      },
      "snaps_synced": 2,
      "snaps_deleted": 0,
      "snaps_renamed": 0
    },
    "/f0": {
      "state": "failed",
      "snaps_synced": 0,
      "snaps_deleted": 0,
      "snaps_renamed": 0
    }
  }

This allows a user to add a non-existent directory for synchronization. The mirror daemon
will mark such a directory as failed and retry (less frequently). When the directory comes
into existence, the mirror daemon will clear the failed state upon successful snapshot
synchronization.

When mirroring is disabled, the respective `fs mirror status` command for the file system
will not show up in command help.

Mirroring module provides a couple of commands to display directory mapping and distribution
information. To check which mirror daemon a directory has been mapped to, use::

  $ ceph fs snapshot mirror dirmap cephfs /d0/d1/d2
  {
    "instance_id": "404148",
    "last_shuffled": 1601284516.10986,
    "state": "mapped"
  }

.. note:: `instance_id` is the RADOS instance-id associated with a mirror daemon.

Other information such as `state` and `last_shuffled` is interesting when running
multiple mirror daemons.

When no mirror daemons are running, the above command shows::

  $ ceph fs snapshot mirror dirmap cephfs /d0/d1/d2
  {
    "reason": "no mirror daemons running",
    "state": "stalled"
  }

This signifies that no mirror daemons are running and mirroring is stalled.

Re-adding Peers
---------------

When re-adding (reassigning) a peer to a file system in another cluster, ensure that
all mirror daemons have stopped synchronization to the peer. This can be checked
via the `fs mirror status` admin socket command (the peer UUID should not show up
in the command output). Also, it is recommended to purge synchronized directories
from the peer before re-adding it to another file system (especially those directories
which might exist in the new primary file system). This is not required if re-adding
a peer to the same primary file system it was earlier synchronized from.

Feature Status
--------------

`cephfs-mirror` daemon is built by default (follows the `WITH_CEPHFS` CMake rule).
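
For a development build, the usual Ceph workflow applies; shown here only as a sketch
(assuming the standard `do_cmake.sh`/`ninja` build from a source checkout)::

  $ ./do_cmake.sh -DWITH_CEPHFS=ON
  $ cd build && ninja cephfs-mirror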