1 =====================
2 Multisite Sync Policy
3 =====================
4
5 .. versionadded:: Octopus
6
7 Multisite bucket-granularity sync policy provides fine-grained control of data movement between buckets in different zones. It extends the zone sync mechanism. Previously, buckets were treated symmetrically: each (data) zone held a mirror of a bucket that was expected to be the same as in all the other zones. With the bucket-granularity sync policy, buckets can diverge, and a bucket can pull data from other buckets (ones that don't share its name or its ID) in different zones. The sync process therefore used to assume that the bucket sync source and the bucket sync destination always referred to the same bucket; that is no longer the case.
8
9 The sync policy supersedes the old coarse zonegroup configuration (sync_from*). It can be configured at the zonegroup level (in which case it replaces the old-style configuration), and it can also be configured at the bucket level.
10
11 A sync policy can define multiple groups, each of which can contain lists of data-flow configurations as well as lists of pipe configurations. A data flow defines how data moves between the different zones. It can be symmetrical, in which multiple zones sync data from each other, or directional, in which data moves one way from one zone to another.
12
13 A pipe defines the actual buckets that can use these data flows, and the properties associated with the pipe (for example, a source object prefix).
14
15 A sync policy group can be in one of three states:
16
17 +----------------------------+----------------------------------------+
18 | Value                      | Description                            |
19 +============================+========================================+
20 | ``enabled``                | sync is allowed and enabled            |
21 +----------------------------+----------------------------------------+
22 | ``allowed``                | sync is allowed                        |
23 +----------------------------+----------------------------------------+
24 | ``forbidden``              | sync (as defined by this group) is not |
25 |                            | allowed and can override other groups  |
26 +----------------------------+----------------------------------------+
27
28 A policy can be defined at the bucket level. A bucket level sync policy inherits the data flow of the zonegroup policy, and can only define a subset of what the zonegroup allows.
29
30 A wildcard zone parameter in the policy matches all relevant zones, and a wildcard bucket parameter matches all relevant buckets. In the context of a bucket policy, a wildcard bucket means the current bucket instance. A disaster recovery configuration in which entire zones are mirrored doesn't require configuring anything on the buckets. For fine-grained bucket sync, however, it is better to allow the pipes (status=allowed) at the zonegroup level (e.g., using wildcards), and to enable the specific sync only at the bucket level (status=enabled). If needed, the policy at the bucket level can limit the data movement to specific relevant zones.
31
32 .. important:: Any changes to the zonegroup policy need to be applied on the
33 zonegroup master zone, and require a period update and commit. Changes
34 to the bucket policy need to be applied on the zonegroup master
35 zone. The changes are dynamically handled by rgw.
36
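The split recommended above (``allowed`` at the zonegroup level, ``enabled`` at the bucket level) can be sketched as two small shell functions. This is only an illustration assembled from the commands documented in the reference section of this page. The group, flow, and pipe IDs (``selective``, ``mirror``, ``all-buckets``, ``<bucket>-sync``, ``pipe1``) and the ``RGW_ADMIN`` override variable are hypothetical names chosen for the example, not part of Ceph.

```shell
#!/bin/sh
# Illustrative sketch; all IDs below are hypothetical names for this example.
# RGW_ADMIN is a hypothetical override: RGW_ADMIN=echo previews the commands
# instead of running them. It defaults to the real radosgw-admin binary.
: "${RGW_ADMIN:=radosgw-admin}"

# Zonegroup level: define the flow and a wildcard pipe, but only *allow* sync.
# $1 is a comma-separated zone list. Run on the zonegroup master zone.
# Zonegroup policy changes take effect only after the period is committed.
allow_sync_in_zonegroup() {
    "$RGW_ADMIN" sync group create --group-id=selective --status=allowed &&
    "$RGW_ADMIN" sync group flow create --group-id=selective \
        --flow-id=mirror --flow-type=symmetrical --zones="$1" &&
    "$RGW_ADMIN" sync group pipe create --group-id=selective \
        --pipe-id=all-buckets \
        --source-zones='*' --source-bucket='*' \
        --dest-zones='*' --dest-bucket='*' &&
    "$RGW_ADMIN" period update --commit
}

# Bucket level: *enable* sync for one existing bucket. $1 is the bucket name,
# $2 optionally restricts the destination zones (default: '*').
# Bucket-level policy changes do not need a period commit.
enable_sync_for_bucket() {
    "$RGW_ADMIN" sync group create --bucket="$1" \
        --group-id="$1-sync" --status=enabled &&
    "$RGW_ADMIN" sync group pipe create --bucket="$1" \
        --group-id="$1-sync" --pipe-id=pipe1 \
        --source-zones='*' --dest-zones="${2:-*}"
}
```

For example, ``allow_sync_in_zonegroup us-east,us-west`` followed by ``enable_sync_for_bucket buck2`` would allow sync everywhere but enable it only for ``buck2``. Setting ``RGW_ADMIN=echo`` beforehand prints the commands rather than executing them.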
37
38 S3 Replication API
39 ~~~~~~~~~~~~~~~~~~
40
41 The S3 bucket replication API has also been implemented, and allows users to create replication rules between different buckets. Note, however, that while the AWS replication feature allows bucket replication within the same zone, rgw does not currently allow it. The rgw API also adds a new 'Zone' array that lets users select the zones to which a specific bucket will be synced.
42
43
44 Sync Policy Control Reference
45 =============================
46
47
48 Get Sync Policy
49 ~~~~~~~~~~~~~~~
50
51 To retrieve the current zonegroup sync policy, or a specific bucket policy:
52
53 ::
54
55 # radosgw-admin sync policy get [--bucket=<bucket>]
56
57
58 Create Sync Policy Group
59 ~~~~~~~~~~~~~~~~~~~~~~~~
60
61 To create a sync policy group:
62
63 ::
64
65 # radosgw-admin sync group create [--bucket=<bucket>] \
66 --group-id=<group-id> \
67 --status=<enabled | allowed | forbidden>
68
69
70 Modify Sync Policy Group
71 ~~~~~~~~~~~~~~~~~~~~~~~~
72
73 To modify a sync policy group:
74
75 ::
76
77 # radosgw-admin sync group modify [--bucket=<bucket>] \
78 --group-id=<group-id> \
79 --status=<enabled | allowed | forbidden>
80
81
82 Show Sync Policy Group
83 ~~~~~~~~~~~~~~~~~~~~~~~~
84
85 To show a sync policy group:
86
87 ::
88
89 # radosgw-admin sync group get [--bucket=<bucket>] \
90 --group-id=<group-id>
91
92
93 Remove Sync Policy Group
94 ~~~~~~~~~~~~~~~~~~~~~~~~
95
96 To remove a sync policy group:
97
98 ::
99
100 # radosgw-admin sync group remove [--bucket=<bucket>] \
101 --group-id=<group-id>
102
103
104
105 Create Sync Flow
106 ~~~~~~~~~~~~~~~~
107
108 - To create or update directional sync flow:
109
110 ::
111
112 # radosgw-admin sync group flow create [--bucket=<bucket>] \
113 --group-id=<group-id> \
114 --flow-id=<flow-id> \
115 --flow-type=directional \
116 --source-zone=<source_zone> \
117 --dest-zone=<dest_zone>
118
119
120 - To create or update symmetrical sync flow:
121
122 ::
123
124 # radosgw-admin sync group flow create [--bucket=<bucket>] \
125 --group-id=<group-id> \
126 --flow-id=<flow-id> \
127 --flow-type=symmetrical \
128 --zones=<zones>
129
130
131 Where ``<zones>`` is a comma-separated list of all the zones to add to the flow.
132
133
134 Remove Sync Flow Zones
135 ~~~~~~~~~~~~~~~~~~~~~~
136
137 - To remove directional sync flow:
138
139 ::
140
141 # radosgw-admin sync group flow remove [--bucket=<bucket>] \
142 --group-id=<group-id> \
143 --flow-id=<flow-id> \
144 --flow-type=directional \
145 --source-zone=<source_zone> \
146 --dest-zone=<dest_zone>
147
148
149 - To remove specific zones from symmetrical sync flow:
150
151 ::
152
153 # radosgw-admin sync group flow remove [--bucket=<bucket>] \
154 --group-id=<group-id> \
155 --flow-id=<flow-id> \
156 --flow-type=symmetrical \
157 --zones=<zones>
158
159
160 Where ``<zones>`` is a comma-separated list of all the zones to remove from the flow.
161
162
163 - To remove symmetrical sync flow:
164
165 ::
166
167 # radosgw-admin sync group flow remove [--bucket=<bucket>] \
168 --group-id=<group-id> \
169 --flow-id=<flow-id> \
170 --flow-type=symmetrical
171
172
173 Create Sync Pipe
174 ~~~~~~~~~~~~~~~~
175
176 To create a sync group pipe, or update its parameters:
177
178
179 ::
180
181 # radosgw-admin sync group pipe create [--bucket=<bucket>] \
182 --group-id=<group-id> \
183 --pipe-id=<pipe-id> \
184 --source-zones=<source_zones> \
185 [--source-bucket=<source_buckets>] \
186 [--source-bucket-id=<source_bucket_id>] \
187 --dest-zones=<dest_zones> \
188 [--dest-bucket=<dest_buckets>] \
189 [--dest-bucket-id=<dest_bucket_id>] \
190 [--prefix=<source_prefix>] \
191 [--prefix-rm] \
192 [--tags-add=<tags>] \
193 [--tags-rm=<tags>] \
194 [--dest-owner=<owner>] \
195 [--storage-class=<storage_class>] \
196 [--mode=<system | user>] \
197 [--uid=<user_id>]
198
199
200 Zones are either a list of zones, or '*' (wildcard). A wildcard zone means any zone that matches the sync flow rules.
201 Buckets are either a bucket name, or '*' (wildcard). A wildcard bucket means the current bucket.
202 A prefix can be defined to filter source objects.
203 Tags are passed as a comma-separated list of 'key=value' pairs.
204 A destination owner can be set to force the owner of the destination objects. If user mode is selected, only the destination bucket owner can be set.
205 A destination storage class can also be configured.
206 A user ID can be set for user mode; it is the user under which the sync operation is executed (for permissions validation).
207
208
209 Remove Sync Pipe
210 ~~~~~~~~~~~~~~~~
211
212 To remove specific sync group pipe params, or the entire pipe:
213
214
215 ::
216
217 # radosgw-admin sync group pipe remove [--bucket=<bucket>] \
218 --group-id=<group-id> \
219 --pipe-id=<pipe-id> \
220 [--source-zones=<source_zones>] \
221 [--source-bucket=<source_buckets>] \
222 [--source-bucket-id=<source_bucket_id>] \
223 [--dest-zones=<dest_zones>] \
224 [--dest-bucket=<dest_buckets>] \
225 [--dest-bucket-id=<dest_bucket_id>]
226
227
228 Sync Info
229 ~~~~~~~~~
230
231 To get information about the expected sync sources and targets (as defined by the sync policy):
232
233 ::
234
235 # radosgw-admin sync info [--bucket=<bucket>] \
236 [--effective-zone-name=<zone>]
237
238
239 Because a bucket can define a policy that moves data from it towards a different bucket in a different zone, a list of bucket dependencies is also generated when the policy is created. These dependencies are used as hints when any particular bucket is synced. The fact that a bucket references another bucket does not mean that it actually syncs to or from it, as the data flow might not permit it.
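When checking a policy it can help to look at the expected sync for one bucket from several zones in turn. The following is a minimal sketch that loops over zone names and passes each one to the ``--effective-zone-name`` option shown above. It assumes that this option evaluates the policy from the point of view of the named zone; the ``RGW_ADMIN`` override variable and the function name are hypothetical.

```shell
#!/bin/sh
# RGW_ADMIN is a hypothetical override (e.g. RGW_ADMIN=echo to preview);
# it defaults to the real radosgw-admin binary.
: "${RGW_ADMIN:=radosgw-admin}"

# sync_info_per_zone <bucket> <zone>...
# Prints the sync info for <bucket> once per zone, with a header per zone.
# Assumption: --effective-zone-name selects the zone whose view is reported.
sync_info_per_zone() {
    bucket=$1
    shift
    for zone in "$@"; do
        printf '== %s ==\n' "$zone"
        "$RGW_ADMIN" sync info --bucket="$bucket" \
            --effective-zone-name="$zone"
    done
}
```

For example, ``sync_info_per_zone buck us-east us-west us-west-2`` would print three sections, one for each zone used in the examples below.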
240
241
242 Examples
243 ========
244
245 The system in these examples includes 3 zones: ``us-east`` (the master zone), ``us-west``, ``us-west-2``.
246
247 Example 1: Two Zones, Complete Mirror
248 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
249
250 This is similar to older (pre ``Octopus``) sync capabilities, but is done via the new sync policy engine. Note that changes to the zonegroup sync policy require a period update and commit.
251
252
253 ::
254
255 [us-east] $ radosgw-admin sync group create --group-id=group1 --status=allowed
256 [us-east] $ radosgw-admin sync group flow create --group-id=group1 \
257 --flow-id=flow-mirror --flow-type=symmetrical \
258 --zones=us-east,us-west
259 [us-east] $ radosgw-admin sync group pipe create --group-id=group1 \
260 --pipe-id=pipe1 --source-zones='*' \
261 --source-bucket='*' --dest-zones='*' \
262 --dest-bucket='*'
263 [us-east] $ radosgw-admin sync group modify --group-id=group1 --status=enabled
264 [us-east] $ radosgw-admin period update --commit
265
266 [us-east] $ radosgw-admin sync info --bucket=buck
267 {
268 "sources": [
269 {
270 "id": "pipe1",
271 "source": {
272 "zone": "us-west",
273 "bucket": "buck:115b12b3-....4409.1"
274 },
275 "dest": {
276 "zone": "us-east",
277 "bucket": "buck:115b12b3-....4409.1"
278 },
279 "params": {
280 ...
281 }
282 }
283 ],
284 "dests": [
285 {
286 "id": "pipe1",
287 "source": {
288 "zone": "us-east",
289 "bucket": "buck:115b12b3-....4409.1"
290 },
291 "dest": {
292 "zone": "us-west",
293 "bucket": "buck:115b12b3-....4409.1"
294 },
295 ...
296 }
297 ],
298 ...
299 }
301
302
303 Note that the "id" field in the output above reflects the pipe rule
304 that generated that entry. A single rule can generate multiple sync
305 entries, as can be seen in the example.
306
307 ::
308
309 [us-west] $ radosgw-admin sync info --bucket=buck
310 {
311 "sources": [
312 {
313 "id": "pipe1",
314 "source": {
315 "zone": "us-east",
316 "bucket": "buck:115b12b3-....4409.1"
317 },
318 "dest": {
319 "zone": "us-west",
320 "bucket": "buck:115b12b3-....4409.1"
321 },
322 ...
323 }
324 ],
325 "dests": [
326 {
327 "id": "pipe1",
328 "source": {
329 "zone": "us-west",
330 "bucket": "buck:115b12b3-....4409.1"
331 },
332 "dest": {
333 "zone": "us-east",
334 "bucket": "buck:115b12b3-....4409.1"
335 },
336 ...
337 }
338 ],
339 ...
340 }
341
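Because the ``sync info`` output is verbose, it can be convenient to reduce it to one line per entry. The sketch below does this with ``jq``, which is an assumption here (it is a separate tool, not part of Ceph). It relies only on the ``sources``, ``dests``, ``id``, ``source.zone`` and ``dest.zone`` fields visible in the output above; the function name is hypothetical.

```shell
#!/bin/sh
# Requires jq (an assumption; jq is not shipped with Ceph).
# summarize_sync_info reads `radosgw-admin sync info` JSON on stdin and
# prints one line per entry: direction, pipe id, source zone, dest zone.
summarize_sync_info() {
    jq -r '
        (.sources // [] | .[]
            | "source \(.id): \(.source.zone) -> \(.dest.zone)"),
        (.dests // [] | .[]
            | "dest \(.id): \(.source.zone) -> \(.dest.zone)")
    '
}
```

It would typically be used as ``radosgw-admin sync info --bucket=buck | summarize_sync_info``. Lines that share a pipe id come from the same pipe rule.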
342
343
344 Example 2: Directional, Entire Zone Backup
345 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
346
347 This is also similar to older sync capabilities. Here we add a third zone, ``us-west-2``, that will be a replica of ``us-west``, but data will not be replicated back from it.
348
349 ::
350
351 [us-east] $ radosgw-admin sync group flow create --group-id=group1 \
352 --flow-id=us-west-backup --flow-type=directional \
353 --source-zone=us-west --dest-zone=us-west-2
354 [us-east] $ radosgw-admin period update --commit
355
356
357 Note that ``us-west`` has two destinations:
358
359 ::
360
361 [us-west] $ radosgw-admin sync info --bucket=buck
362 {
363 "sources": [
364 {
365 "id": "pipe1",
366 "source": {
367 "zone": "us-east",
368 "bucket": "buck:115b12b3-....4409.1"
369 },
370 "dest": {
371 "zone": "us-west",
372 "bucket": "buck:115b12b3-....4409.1"
373 },
374 ...
375 }
376 ],
377 "dests": [
378 {
379 "id": "pipe1",
380 "source": {
381 "zone": "us-west",
382 "bucket": "buck:115b12b3-....4409.1"
383 },
384 "dest": {
385 "zone": "us-east",
386 "bucket": "buck:115b12b3-....4409.1"
387 },
388 ...
389 },
390 {
391 "id": "pipe1",
392 "source": {
393 "zone": "us-west",
394 "bucket": "buck:115b12b3-....4409.1"
395 },
396 "dest": {
397 "zone": "us-west-2",
398 "bucket": "buck:115b12b3-....4409.1"
399 },
400 ...
401 }
402 ],
403 ...
404 }
405
406
407 Whereas ``us-west-2`` has only a source and no destinations:
408
409 ::
410
411 [us-west-2] $ radosgw-admin sync info --bucket=buck
412 {
413 "sources": [
414 {
415 "id": "pipe1",
416 "source": {
417 "zone": "us-west",
418 "bucket": "buck:115b12b3-....4409.1"
419 },
420 "dest": {
421 "zone": "us-west-2",
422 "bucket": "buck:115b12b3-....4409.1"
423 },
424 ...
425 }
426 ],
427 "dests": [],
428 ...
429 }
430
431
432
433 Example 3: Mirror a Specific Bucket
434 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
435
436 This example uses the same group configuration, but this time switches it to the ``allowed`` state, which means that sync is allowed but not enabled.
437
438 ::
439
440 [us-east] $ radosgw-admin sync group modify --group-id=group1 --status=allowed
441 [us-east] $ radosgw-admin period update --commit
442
443
444 Next, we create a bucket-level policy rule for the existing bucket ``buck2``. Note that the bucket needs to exist before this policy can be set. Admin commands that modify bucket policies need to run on the master zone; however, they do not require a period update. There is no need to change the data flow, as it is inherited from the zonegroup policy. A bucket policy flow can only be a subset of the flow defined in the zonegroup policy. The same goes for pipes, although a bucket policy can enable pipes that are not enabled (albeit not forbidden) in the zonegroup policy.
445
446 ::
447
448 [us-east] $ radosgw-admin sync group create --bucket=buck2 \
449 --group-id=buck2-default --status=enabled
450
451 [us-east] $ radosgw-admin sync group pipe create --bucket=buck2 \
452 --group-id=buck2-default --pipe-id=pipe1 \
453 --source-zones='*' --dest-zones='*'
454
455
456
457 Example 4: Limit Bucket Sync To Specific Zones
458 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
459
460 This will only sync ``buck3`` to ``us-east`` (from any zone that the flow allows to sync into ``us-east``).
461
462 ::
463
464 [us-east] $ radosgw-admin sync group create --bucket=buck3 \
465 --group-id=buck3-default --status=enabled
466
467 [us-east] $ radosgw-admin sync group pipe create --bucket=buck3 \
468 --group-id=buck3-default --pipe-id=pipe1 \
469 --source-zones='*' --dest-zones=us-east
470
471
472
473 Example 5: Sync From a Different Bucket
474 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
475
476 Note that bucket sync only works (currently) across zones and not within the same zone.
477
478 Set ``buck4`` to pull data from ``buck5``:
479
480 ::
481
482 [us-east] $ radosgw-admin sync group create --bucket=buck4 \
483 --group-id=buck4-default --status=enabled
484
485 [us-east] $ radosgw-admin sync group pipe create --bucket=buck4 \
486 --group-id=buck4-default --pipe-id=pipe1 \
487 --source-zones='*' --source-bucket=buck5 \
488 --dest-zones='*'
489
490
491 The pipe can also be limited to specific zones. For example, the following will
492 only sync data originating in ``us-west``:
493
494 ::
495
496 [us-east] $ radosgw-admin sync group pipe modify --bucket=buck4 \
497 --group-id=buck4-default --pipe-id=pipe1 \
498 --source-zones=us-west --source-bucket=buck5 \
499 --dest-zones='*'
500
501
502 Checking the sync info for ``buck5`` on ``us-west`` is interesting:
503
504 ::
505
506 [us-west] $ radosgw-admin sync info --bucket=buck5
507 {
508 "sources": [],
509 "dests": [],
510 "hints": {
511 "sources": [],
512 "dests": [
513 "buck4:115b12b3-....14433.2"
514 ]
515 },
516 "resolved-hints-1": {
517 "sources": [],
518 "dests": [
519 {
520 "id": "pipe1",
521 "source": {
522 "zone": "us-west",
523 "bucket": "buck5"
524 },
525 "dest": {
526 "zone": "us-east",
527 "bucket": "buck4:115b12b3-....14433.2"
528 },
529 ...
530 },
531 {
532 "id": "pipe1",
533 "source": {
534 "zone": "us-west",
535 "bucket": "buck5"
536 },
537 "dest": {
538 "zone": "us-west-2",
539 "bucket": "buck4:115b12b3-....14433.2"
540 },
541 ...
542 }
543 ]
544 },
545 "resolved-hints": {
546 "sources": [],
547 "dests": []
548 }
549 }
550
551
552 Note that there are resolved hints, which means that the bucket ``buck5`` found out about ``buck4`` syncing from it indirectly, and not from its own policy (the policy for ``buck5`` itself is empty).
553
554
555 Example 6: Sync To Different Bucket
556 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
557
558 The same mechanism can be used to configure data to be synced to a bucket (as opposed to synced from, as in the previous example). Note that internally data is still pulled from the source at the destination zone.
559
560 Set ``buck6`` to "push" data to ``buck5``:
561
562 ::
563
564 [us-east] $ radosgw-admin sync group create --bucket=buck6 \
565 --group-id=buck6-default --status=enabled
566
567 [us-east] $ radosgw-admin sync group pipe create --bucket=buck6 \
568 --group-id=buck6-default --pipe-id=pipe1 \
569 --source-zones='*' --source-bucket='*' \
570 --dest-zones='*' --dest-bucket=buck5
571
572
573 A wildcard bucket name means the current bucket in the context of bucket sync policy.
574
575 Combined with the configuration in Example 5, we can now write data to ``buck6`` on ``us-east``. The data will sync to ``buck5`` on ``us-west``, and from there it will be distributed to ``buck4`` on ``us-east`` and on ``us-west-2``.
576
577 Example 7: Source Filters
578 ~~~~~~~~~~~~~~~~~~~~~~~~~
579
580 Sync from ``buck8`` to ``buck9``, but only objects that start with ``foo/``:
581
582 ::
583
584 [us-east] $ radosgw-admin sync group create --bucket=buck8 \
585 --group-id=buck8-default --status=enabled
586
587 [us-east] $ radosgw-admin sync group pipe create --bucket=buck8 \
588 --group-id=buck8-default --pipe-id=pipe-prefix \
589 --prefix=foo/ --source-zones='*' --dest-zones='*' \
590 --dest-bucket=buck9
591
592
593 Also sync from ``buck8`` to ``buck9`` any object that has the tags ``color=blue`` or ``color=red``:
594
595 ::
596
597 [us-east] $ radosgw-admin sync group pipe create --bucket=buck8 \
598 --group-id=buck8-default --pipe-id=pipe-tags \
599 --tags-add=color=blue,color=red --source-zones='*' \
600 --dest-zones='*' --dest-bucket=buck9
601
602
603 And we can check the expected sync in ``us-east`` (for example):
604
605 ::
606
607 [us-east] $ radosgw-admin sync info --bucket=buck8
608 {
609 "sources": [],
610 "dests": [
611 {
612 "id": "pipe-prefix",
613 "source": {
614 "zone": "us-east",
615 "bucket": "buck8:115b12b3-....14433.5"
616 },
617 "dest": {
618 "zone": "us-west",
619 "bucket": "buck9"
620 },
621 "params": {
622 "source": {
623 "filter": {
624 "prefix": "foo/",
625 "tags": []
626 }
627 },
628 ...
629 }
630 },
631 {
632 "id": "pipe-tags",
633 "source": {
634 "zone": "us-east",
635 "bucket": "buck8:115b12b3-....14433.5"
636 },
637 "dest": {
638 "zone": "us-west",
639 "bucket": "buck9"
640 },
641 "params": {
642 "source": {
643 "filter": {
644 "tags": [
645 {
646 "key": "color",
647 "value": "blue"
648 },
649 {
650 "key": "color",
651 "value": "red"
652 }
653 ]
654 }
655 },
656 ...
657 }
658 }
659 ],
660 ...
661 }
662
663
664 Note that there aren't any sources, only two different destinations (one for each configuration). When the sync process happens it will select the relevant rule for each object it syncs.
665
666 Prefixes and tags can be combined, in which case an object needs to match both in order to be synced. A priority parameter can also be passed; when multiple different rules match (and have the same source and destination), it determines which of the rules is used.
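As a hedged illustration of combining the two filters, the sketch below creates a single pipe that carries both ``--prefix`` and ``--tags-add``. The pipe ID ``pipe-prefix-tags``, the function name, and the ``RGW_ADMIN`` override variable are hypothetical, and the group ``<bucket>-default`` is assumed to exist already (as ``buck8-default`` does in this example). The priority parameter is left out because its exact option is not shown in this document.

```shell
#!/bin/sh
# RGW_ADMIN is a hypothetical override (e.g. RGW_ADMIN=echo to preview);
# it defaults to the real radosgw-admin binary.
: "${RGW_ADMIN:=radosgw-admin}"

# create_filtered_pipe <source-bucket> <dest-bucket> <prefix> <tags>
# <tags> is a comma-separated list of key=value pairs.
# Assumes the group "<source-bucket>-default" already exists.
# "pipe-prefix-tags" is a hypothetical pipe ID chosen for this sketch.
create_filtered_pipe() {
    "$RGW_ADMIN" sync group pipe create --bucket="$1" \
        --group-id="$1-default" --pipe-id=pipe-prefix-tags \
        --prefix="$3" --tags-add="$4" \
        --source-zones='*' --dest-zones='*' \
        --dest-bucket="$2"
}
```

For example, ``create_filtered_pipe buck8 buck9 foo/ color=blue`` would sync only objects under ``foo/`` that also carry the tag ``color=blue``.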
667
668
669 Example 8: Destination Params: Storage Class
670 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
671
672 Storage class of the destination objects can be configured:
673
674 ::
675
676 [us-east] $ radosgw-admin sync group create --bucket=buck10 \
677 --group-id=buck10-default --status=enabled
678
679 [us-east] $ radosgw-admin sync group pipe create --bucket=buck10 \
680 --group-id=buck10-default \
681 --pipe-id=pipe-storage-class \
682 --source-zones='*' --dest-zones=us-west-2 \
683 --storage-class=CHEAP_AND_SLOW
684
685
686 Example 9: Destination Params: Destination Owner Translation
687 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
688
689 Set the owner of the destination objects to the destination bucket owner.
690 This requires specifying the uid of the destination bucket owner:
691
692 ::
693
694 [us-east] $ radosgw-admin sync group create --bucket=buck11 \
695 --group-id=buck11-default --status=enabled
696
697 [us-east] $ radosgw-admin sync group pipe create --bucket=buck11 \
698 --group-id=buck11-default --pipe-id=pipe-dest-owner \
699 --source-zones='*' --dest-zones='*' \
700 --dest-bucket=buck12 --dest-owner=joe
701
702 Example 10: Destination Params: User Mode
703 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
704
705 User mode makes sure that the user has permissions both to read the objects and to write to the destination bucket. This requires that the uid of the user (in whose context the operation executes) is specified.
706
707 ::
708
709 [us-east] $ radosgw-admin sync group pipe modify --bucket=buck11 \
710 --group-id=buck11-default --pipe-id=pipe-dest-owner \
711 --mode=user --uid=jenny
712
713
714