================
CephFS Mirroring
================

CephFS supports asynchronous replication of snapshots to a remote CephFS file
system via the `cephfs-mirror` tool. Snapshots are synchronized by mirroring
snapshot data and then creating a snapshot with the same name (for a given
directory on the remote file system) as the snapshot being synchronized.

Requirements
------------

The primary (local) and secondary (remote) Ceph clusters must be running
version Pacific or later.

Key Idea
--------

For a given snapshot pair in a directory, the `cephfs-mirror` daemon relies on
`readdir diff` to identify changes in a directory tree. The diffs are applied to
the corresponding directory in the remote file system, thereby synchronizing only
the files that have changed between the two snapshots.

This feature is tracked here: https://tracker.ceph.com/issues/47034.

Currently, snapshot data is synchronized by bulk copying to the remote
filesystem.

.. note:: Synchronizing hardlinks is not supported -- hardlinked files get
   synchronized as separate files.

Creating Users
--------------

Start by creating a user (on the primary/local cluster) for the mirror daemon.
This user requires write capability on the metadata pool to create RADOS
objects (index objects) for watch/notify operations and read capability on the
data pool(s).

.. prompt:: bash $

   ceph auth get-or-create client.mirror mon 'profile cephfs-mirror' mds 'allow r' osd 'allow rw tag cephfs metadata=*, allow r tag cephfs data=*' mgr 'allow r'

Create a user for each file system peer (on the secondary/remote cluster). This
user needs to have full capabilities on the MDS (to take snapshots) and the
OSDs::

    $ ceph fs authorize <fs_name> client.mirror_remote / rwps

This user should be used (as part of the peer specification) when adding a peer.

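To double-check the capabilities granted by `fs authorize`, the standard
`ceph auth get` command can be run on the secondary cluster (purely a
verification step)::

    $ ceph auth get client.mirror_remote
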
Starting Mirror Daemon
----------------------

The mirror daemon should be spawned using `systemctl(1)` unit files::

    $ systemctl enable cephfs-mirror@mirror
    $ systemctl start cephfs-mirror@mirror

The `cephfs-mirror` daemon can also be run in the foreground using::

    $ cephfs-mirror --id mirror --cluster site-a -f

.. note:: The user used here is `mirror`, as created in the `Creating Users` section.

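When started via `systemctl(1)`, the usual status check confirms that the daemon
is running (shown here for the unit started above)::

    $ systemctl status cephfs-mirror@mirror
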
Mirroring Design
----------------

CephFS supports asynchronous replication of snapshots to a remote CephFS file system
via the `cephfs-mirror` tool. For a given directory, snapshots are synchronized by
transferring snapshot data to the remote file system and creating a snapshot with the
same name as the snapshot being synchronized.

Snapshot Synchronization Order
------------------------------

Although the order in which snapshots get chosen for synchronization does not matter,
snapshots are picked based on creation order (using the snap-id).

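For example, assuming the file system is mounted at `/mnt/cephfs` and the client is
permitted to create snapshots (both the mount point and directory name here are
illustrative), snapshots created through the `.snap` directory of a mirrored path are
picked up in creation order regardless of their names::

    $ mkdir /mnt/cephfs/d0/.snap/snap-b   # created first, synchronized first
    $ mkdir /mnt/cephfs/d0/.snap/snap-a   # created second, synchronized second
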
Snapshot Incarnation
--------------------

A snapshot may be deleted and recreated (with the same name) with different contents.
An "old" snapshot could have been synchronized (earlier) and the recreation of the
snapshot could have been done when mirroring was disabled. Using snapshot names to
infer the point-of-continuation would result in the "new" snapshot (incarnation)
never getting picked up for synchronization.

Snapshots on the secondary file system store the snap-id of the snapshot they were
synchronized from. This metadata is stored in the `SnapInfo` structure on the MDS.

Interfaces
----------

The `Mirroring` module (manager plugin) provides interfaces for managing directory
snapshot mirroring. The manager interfaces are (mostly) wrappers around monitor
commands for managing file system mirroring and are the recommended control
interface.

Mirroring Module and Interface
------------------------------

The mirroring module provides an interface for managing directory snapshot mirroring.
The module is implemented as a Ceph Manager plugin. The mirroring module does not
manage spawning (and terminating) the mirror daemons. Right now, the preferred way is
to start/stop mirror daemons via `systemctl(1)`. Going forward, deploying mirror
daemons will be managed by `cephadm` (Tracker: http://tracker.ceph.com/issues/47261).

The manager module is responsible for assigning directories to mirror daemons for
synchronization. Multiple mirror daemons can be spawned to achieve concurrency in
directory snapshot synchronization. When mirror daemons are spawned (or terminated),
the mirroring module discovers the modified set of mirror daemons and rebalances
the directory assignment amongst the new set, thus providing high availability.

.. note:: Running multiple mirror daemons is currently untested. Only a single
   mirror daemon is recommended.

The mirroring module is disabled by default. To enable mirroring, use::

    $ ceph mgr module enable mirroring

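To confirm that the module has been enabled, it can be looked up in the manager
module listing (the exact output format varies across releases)::

    $ ceph mgr module ls | grep mirroring
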
The mirroring module provides a family of commands to control mirroring of directory
snapshots. To add or remove directories, mirroring needs to be enabled for a given
file system. To enable mirroring, use::

    $ ceph fs snapshot mirror enable <fs_name>

.. note:: Mirroring module commands use the `fs snapshot mirror` prefix as compared
   to the monitor commands, which use the `fs mirror` prefix. Make sure to use the
   module commands.

To disable mirroring, use::

    $ ceph fs snapshot mirror disable <fs_name>

Once mirroring is enabled, add a peer to which directory snapshots are to be mirrored.
Peers follow the `<client>@<cluster>` specification and get assigned a unique-id (UUID)
when added. See the `Creating Users` section on how to create Ceph users for mirroring.

To add a peer, use::

    $ ceph fs snapshot mirror peer_add <fs_name> <remote_cluster_spec> [<remote_fs_name>] [<remote_mon_host>] [<cephx_key>]

`<remote_fs_name>` is optional, and defaults to `<fs_name>` (on the remote cluster).

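For example, reusing the names from the sections above (the user `client.mirror_remote`,
a remote cluster identified as `site-remote`, and a remote file system `backup_fs`),
adding a peer might look like::

    $ ceph fs snapshot mirror peer_add cephfs client.mirror_remote@site-remote backup_fs
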
This requires the remote cluster ceph configuration and user keyring to be available in
the primary cluster. See the `Bootstrap Peers` section to avoid this. `peer_add`
additionally supports passing the remote cluster monitor address and the user key.
However, bootstrapping a peer is the recommended way to add a peer.

.. note:: Only a single peer is supported right now.

To remove a peer, use::

    $ ceph fs snapshot mirror peer_remove <fs_name> <peer_uuid>

.. note:: See the `Mirror Daemon Status` section on how to figure out the peer UUID.

To list file system mirror peers, use::

    $ ceph fs snapshot mirror peer_list <fs_name>

To configure a directory for mirroring, use::

    $ ceph fs snapshot mirror add <fs_name> <path>

To stop mirroring directory snapshots, use::

    $ ceph fs snapshot mirror remove <fs_name> <path>

Only absolute directory paths are allowed. Also, paths are normalized by the mirroring
module, therefore, `/a/b/../b` is equivalent to `/a/b`::

    $ mkdir -p /d0/d1/d2
    $ ceph fs snapshot mirror add cephfs /d0/d1/d2
    {}
    $ ceph fs snapshot mirror add cephfs /d0/d1/../d1/d2
    Error EEXIST: directory /d0/d1/d2 is already tracked

Once a directory is added for mirroring, adding its subdirectories or ancestor
directories for mirroring is disallowed::

    $ ceph fs snapshot mirror add cephfs /d0/d1
    Error EINVAL: /d0/d1 is a ancestor of tracked path /d0/d1/d2
    $ ceph fs snapshot mirror add cephfs /d0/d1/d2/d3
    Error EINVAL: /d0/d1/d2/d3 is a subtree of tracked path /d0/d1/d2

Commands to check directory mapping (to mirror daemons) and directory distribution are
detailed in the `Mirror Daemon Status` section.

Bootstrap Peers
---------------

Adding a peer (via `peer_add`) requires the peer cluster configuration and user keyring
to be available in the primary cluster (manager host and hosts running the mirror daemon).
This can be avoided by bootstrapping and importing a peer token. Peer bootstrap involves
creating a bootstrap token on the peer cluster via::

    $ ceph fs snapshot mirror peer_bootstrap create <fs_name> <client_entity> <site-name>

e.g.::

    $ ceph fs snapshot mirror peer_bootstrap create backup_fs client.mirror_remote site-remote
    {"token": "eyJmc2lkIjogIjBkZjE3MjE3LWRmY2QtNDAzMC05MDc5LTM2Nzk4NTVkNDJlZiIsICJmaWxlc3lzdGVtIjogImJhY2t1cF9mcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcGVlcl9ib290c3RyYXAiLCAic2l0ZV9uYW1lIjogInNpdGUtcmVtb3RlIiwgImtleSI6ICJBUUFhcDBCZ0xtRmpOeEFBVnNyZXozai9YYUV0T2UrbUJEZlJDZz09IiwgIm1vbl9ob3N0IjogIlt2MjoxOTIuMTY4LjAuNTo0MDkxOCx2MToxOTIuMTY4LjAuNTo0MDkxOV0ifQ=="}

`site-name` refers to a user-defined string to identify the remote filesystem. In the
context of the `peer_add` interface, `site-name` is the `cluster` name passed in from
`remote_cluster_spec`.

Import the bootstrap token in the primary cluster via::

    $ ceph fs snapshot mirror peer_bootstrap import <fs_name> <token>

e.g.::

    $ ceph fs snapshot mirror peer_bootstrap import cephfs eyJmc2lkIjogIjBkZjE3MjE3LWRmY2QtNDAzMC05MDc5LTM2Nzk4NTVkNDJlZiIsICJmaWxlc3lzdGVtIjogImJhY2t1cF9mcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcGVlcl9ib290c3RyYXAiLCAic2l0ZV9uYW1lIjogInNpdGUtcmVtb3RlIiwgImtleSI6ICJBUUFhcDBCZ0xtRmpOeEFBVnNyZXozai9YYUV0T2UrbUJEZlJDZz09IiwgIm1vbl9ob3N0IjogIlt2MjoxOTIuMTY4LjAuNTo0MDkxOCx2MToxOTIuMTY4LjAuNTo0MDkxOV0ifQ==

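Once the import completes, the newly added peer (along with its generated UUID)
can be confirmed with the `peer_list` command shown earlier::

    $ ceph fs snapshot mirror peer_list cephfs
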
Mirror Daemon Status
--------------------

Mirror daemons get asynchronously notified about changes in file system mirroring status
and/or peer updates.

The CephFS mirroring module provides a `mirror daemon status` interface to check mirror
daemon status::

    $ ceph fs snapshot mirror daemon status

e.g.::

    $ ceph fs snapshot mirror daemon status | jq
    [
      {
        "daemon_id": 284167,
        "filesystems": [
          {
            "filesystem_id": 1,
            "name": "a",
            "directory_count": 1,
            "peers": [
              {
                "uuid": "02117353-8cd1-44db-976b-eb20609aa160",
                "remote": {
                  "client_name": "client.mirror_remote",
                  "cluster_name": "ceph",
                  "fs_name": "backup_fs"
                },
                "stats": {
                  "failure_count": 1,
                  "recovery_count": 0
                }
              }
            ]
          }
        ]
      }
    ]

An entry per mirror daemon instance is displayed along with information such as configured
peers and basic stats. For more detailed stats, use the admin socket interface as detailed
below.

CephFS mirror daemons provide admin socket commands for querying mirror status. To check
the available commands for mirror status, use::

    $ ceph --admin-daemon /path/to/mirror/daemon/admin/socket help
    {
        ....
        ....
        "fs mirror status cephfs@360": "get filesystem mirror status",
        ....
        ....
    }

Commands with the `fs mirror status` prefix provide mirror status for mirror-enabled
file systems. Note that `cephfs@360` is of the format `filesystem-name@filesystem-id`.
This format is required since mirror daemons get asynchronously notified regarding
file system mirror status (a file system can be deleted and recreated with the same
name).

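One way to look up the numeric file system id is from the `ceph fs dump` output,
which lists each file system along with its id (output trimmed; the id shown here
is illustrative)::

    $ ceph fs dump | grep Filesystem
    Filesystem 'cephfs' (360)
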
Right now, the command provides minimal information regarding mirror status::

    $ ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@360
    {
      "rados_inst": "192.168.0.5:0/1476644347",
      "peers": {
          "a2dc7784-e7a1-4723-b103-03ee8d8768f8": {
              "remote": {
                  "client_name": "client.mirror_remote",
                  "cluster_name": "site-a",
                  "fs_name": "backup_fs"
              }
          }
      },
      "snap_dirs": {
          "dir_count": 1
      }
    }

The `peers` section in the command output above shows the peer information such as the
unique peer-id (UUID) and specification. The peer-id is required to remove an existing
peer, as mentioned in the `Mirroring Module and Interface` section.

Commands with the `fs mirror peer status` prefix provide peer synchronization status.
This command is of the format `filesystem-name@filesystem-id peer-uuid`::

    $ ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror peer status cephfs@360 a2dc7784-e7a1-4723-b103-03ee8d8768f8
    {
      "/d0": {
        "state": "idle",
        "last_synced_snap": {
          "id": 120,
          "name": "snap1",
          "sync_duration": 0.079997898999999997,
          "sync_time_stamp": "274900.558797s"
        },
        "snaps_synced": 2,
        "snaps_deleted": 0,
        "snaps_renamed": 0
      }
    }

Synchronization stats such as `snaps_synced`, `snaps_deleted` and `snaps_renamed` are reset
on daemon restart and/or when a directory is reassigned to another mirror daemon (when
multiple mirror daemons are deployed).

A directory can be in one of the following states::

    - `idle`: The directory is currently not being synchronized
    - `syncing`: The directory is currently being synchronized
    - `failed`: The directory has hit the upper limit of consecutive failures

When a directory hits a configured number of consecutive synchronization failures, the
mirror daemon marks it as `failed`. Synchronization for these directories is retried.
By default, the number of consecutive failures before a directory is marked as failed
is controlled by the `cephfs_mirror_max_consecutive_failures_per_directory` configuration
option (default: 10), and the retry interval for failed directories is controlled via the
`cephfs_mirror_retry_failed_directories_interval` configuration option (default: 60s).

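As a sketch, assuming the mirror daemon picks these options up from the monitor's
central configuration under its client name (`client.mirror`, matching the `Creating
Users` section; they can equally be set in the daemon's local `ceph.conf`), they could
be tuned as follows (values are illustrative)::

    $ ceph config set client.mirror cephfs_mirror_max_consecutive_failures_per_directory 5
    $ ceph config set client.mirror cephfs_mirror_retry_failed_directories_interval 120
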
E.g., adding a regular file for synchronization would result in a failed status::

    $ ceph fs snapshot mirror add cephfs /f0
    $ ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror peer status cephfs@360 a2dc7784-e7a1-4723-b103-03ee8d8768f8
    {
      "/d0": {
        "state": "idle",
        "last_synced_snap": {
          "id": 120,
          "name": "snap1",
          "sync_duration": 0.079997898999999997,
          "sync_time_stamp": "274900.558797s"
        },
        "snaps_synced": 2,
        "snaps_deleted": 0,
        "snaps_renamed": 0
      },
      "/f0": {
        "state": "failed",
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
      }
    }

This allows a user to add a non-existent directory for synchronization. The mirror daemon
will mark such a directory as failed and retry (less frequently). When the directory is
created, the mirror daemon will clear the failed state upon successful snapshot
synchronization.

When mirroring is disabled, the respective `fs mirror status` command for the file system
will not show up in the command help.

The mirroring module provides a couple of commands to display directory mapping and
distribution information. To check which mirror daemon a directory has been mapped to,
use::

    $ ceph fs snapshot mirror dirmap cephfs /d0/d1/d2
    {
      "instance_id": "404148",
      "last_shuffled": 1601284516.10986,
      "state": "mapped"
    }

.. note:: `instance_id` is the RADOS instance-id associated with a mirror daemon.

Other information such as `state` and `last_shuffled` are interesting when running
multiple mirror daemons.

When no mirror daemons are running, the above command shows::

    $ ceph fs snapshot mirror dirmap cephfs /d0/d1/d2
    {
      "reason": "no mirror daemons running",
      "state": "stalled"
    }

This signifies that no mirror daemons are running and mirroring is stalled.

Re-adding Peers
---------------

When re-adding (reassigning) a peer to a file system in another cluster, ensure that
all mirror daemons have stopped synchronization to the peer. This can be checked
via the `fs mirror status` admin socket command (the peer UUID should not show up
in the command output). Also, it is recommended to purge synchronized directories
from the peer before re-adding it to another file system (especially those directories
which might exist in the new primary file system). This is not required if re-adding
a peer to the same primary file system it was earlier synchronized from.

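As a rough sketch using the commands introduced earlier (names and UUID are
illustrative, and each command runs on the cluster indicated in the comments)::

    # on the old primary cluster: drop the peer
    $ ceph fs snapshot mirror peer_remove cephfs a2dc7784-e7a1-4723-b103-03ee8d8768f8
    # on the peer cluster: purge synchronized directories if required, then
    # create a fresh bootstrap token for the new primary
    $ ceph fs snapshot mirror peer_bootstrap create backup_fs client.mirror_remote site-remote
    # on the new primary cluster: import the token
    $ ceph fs snapshot mirror peer_bootstrap import <fs_name> <token>
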
Feature Status
--------------

The `cephfs-mirror` daemon is built by default (following the `WITH_CEPHFS` CMake rule).