=========================
Cloud Sync Module
=========================

.. versionadded:: Mimic

This module syncs zone data to a remote cloud service. The sync is unidirectional; data is not synced back from the
remote zone. The goal of this module is to enable syncing data to multiple cloud providers. The currently supported
cloud providers are those that are compatible with AWS (S3).

User credentials for the remote cloud object store service need to be configured. Since many cloud services impose limits
on the number of buckets that each user can create, the mapping of source objects and buckets is configurable.
It is possible to configure different targets for different buckets and bucket prefixes. Note that source ACLs will not
be preserved. It is possible to map permissions of specific source users to specific destination users.

Due to API limitations, there is no way to preserve the original object modification time and ETag. The cloud sync module
stores these as metadata attributes on the destination objects.



Cloud Sync Tier Type Configuration
-------------------------------------

Trivial Configuration:
~~~~~~~~~~~~~~~~~~~~~~

::

    {
      "connection": {
        "access_key": <access>,
        "secret": <secret>,
        "endpoint": <endpoint>,
        "host_style": <path | virtual>,
      },
      "acls": [ { "type": <id | email | uri>,
                  "source_id": <source_id>,
                  "dest_id": <dest_id> } ... ],
      "target_path": <target_path>,
    }

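As an illustration, the template above could be filled in as follows. This is a minimal Python sketch: every value (keys, endpoint, user IDs) is a hypothetical placeholder, and ``check_trivial_config`` is a made-up helper that only checks constraints stated in the field descriptions of this document. It is not the validation that RGW itself performs.

```python
import json

VALID_HOST_STYLES = {"path", "virtual"}
VALID_ACL_TYPES = {"id", "email", "uri"}

# Hypothetical values throughout; substitute real credentials and IDs.
trivial_config = {
    "connection": {
        "access_key": "EXAMPLEACCESSKEY",
        "secret": "example-secret",
        "endpoint": "https://s3.example.com",
        "host_style": "path",
    },
    "acls": [
        {"type": "id", "source_id": "alice", "dest_id": "remote-alice"},
    ],
    "target_path": "rgwx-${zone}-${sid}/${owner}/${bucket}",
}


def check_trivial_config(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    connection = config.get("connection", {})
    for field in ("access_key", "secret", "endpoint"):
        if field not in connection:
            problems.append(f"connection.{field} is missing")
    # host_style defaults to "path" when omitted.
    if connection.get("host_style", "path") not in VALID_HOST_STYLES:
        problems.append("connection.host_style must be 'path' or 'virtual'")
    for i, mapping in enumerate(config.get("acls", [])):
        # The ACL type defaults to "id" when omitted.
        if mapping.get("type", "id") not in VALID_ACL_TYPES:
            problems.append(f"acls[{i}].type must be one of id, email, uri")
    return problems


if __name__ == "__main__":
    print(check_trivial_config(trivial_config))
    print(json.dumps(trivial_config, indent=2))
```

The JSON printed at the end has the same shape as the template, which makes it easy to compare the two side by side.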
42
43Non Trivial Configuration:
44~~~~~~~~~~~~~~~~~~~~~~~~~~
45
::

    {
      "default": {
        "connection": {
          "access_key": <access>,
          "secret": <secret>,
          "endpoint": <endpoint>,
          "host_style": <path | virtual>,
        },
        "acls": [
          {
            "type": <id | email | uri>,   # optional, default is id
            "source_id": <id>,
            "dest_id": <id>
          } ... ],
        "target_path": <path>             # optional
      },
      "connections": [
        {
          "connection_id": <id>,
          "access_key": <access>,
          "secret": <secret>,
          "endpoint": <endpoint>,
          "host_style": <path | virtual>,  # optional
        } ... ],
      "acl_profiles": [
        {
          "acls_id": <id>,                # acl mappings
          "acls": [ {
            "type": <id | email | uri>,
            "source_id": <id>,
            "dest_id": <id>
          } ... ]
        }
      ],
      "profiles": [
        {
          "source_bucket": <source>,
          "connection_id": <connection_id>,
          "acls_id": <mappings_id>,
          "target_path": <dest>,          # optional
        } ... ],
    }


.. Note:: The trivial configuration can coexist with the non-trivial one.


* ``connection`` (container)

Represents a connection to the remote cloud service. Contains ``connection_id``, ``access_key``,
``secret``, ``endpoint``, and ``host_style``.

* ``access_key`` (string)

The remote cloud access key that will be used for a specific connection.

* ``secret`` (string)

The secret key for the remote cloud service.

* ``endpoint`` (string)

URL of remote cloud service endpoint.

* ``host_style`` (path | virtual)

Type of host style to be used when accessing remote cloud endpoint (default: ``path``).

* ``acls`` (array)

Contains a list of ``acl_mappings``.

* ``acl_mapping`` (container)

Each ``acl_mapping`` structure contains ``type``, ``source_id``, and ``dest_id``. These
define the ACL mutation that will be applied to each object. An ACL mutation converts a source
user ID to a destination ID.

* ``type`` (id | email | uri)

ACL type: ``id`` defines user id, ``email`` defines user by email, and ``uri`` defines user by ``uri`` (group).

* ``source_id`` (string)

ID of user in the source zone.

* ``dest_id`` (string)

ID of user in the destination.

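The ACL mutation described above can be pictured as a lookup from a source grantee to a destination grantee. The following Python sketch is illustrative only: ``map_grantee`` is a hypothetical helper, and returning ``None`` for a grantee without a mapping is an assumption made for the example, not documented RGW behavior.

```python
def map_grantee(acls, grantee_type, source_id):
    """Return the dest_id mapped to a source grantee, or None if unmapped.

    ``acls`` is a list of acl_mapping dicts with the fields ``type``,
    ``source_id`` and ``dest_id``.
    """
    for mapping in acls:
        # "type" is optional in a mapping and defaults to "id".
        mapping_type = mapping.get("type", "id")
        if mapping_type == grantee_type and mapping["source_id"] == source_id:
            return mapping["dest_id"]
    # Assumption for this sketch: unmapped grantees yield None.
    return None


# Hypothetical mappings, for illustration only.
example_acls = [
    {"source_id": "alice", "dest_id": "remote-alice"},
    {"type": "email", "source_id": "bob@example.com", "dest_id": "bob@cloud.example.com"},
]
```

With these mappings, ``map_grantee(example_acls, "id", "alice")`` yields ``"remote-alice"``, while a lookup of ``"alice"`` with type ``email`` finds nothing because the mapping type differs.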

* ``target_path`` (string)

A string that defines how the target path is created. The target path specifies a prefix to which
the source object name is appended. The target path configurable can include any of the following
variables:

- ``sid``: unique string that represents the sync instance ID
- ``zonegroup``: the zonegroup name
- ``zonegroup_id``: the zonegroup ID
- ``zone``: the zone name
- ``zone_id``: the zone ID
- ``bucket``: source bucket name
- ``owner``: source bucket owner ID

For example: ``target_path = rgwx-${zone}-${sid}/${owner}/${bucket}``

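To make the variable substitution concrete, the sketch below expands the example ``target_path`` with Python's ``string.Template``, which happens to use the same ``${name}`` syntax. The variable values are hypothetical, and joining the prefix and the object name with ``/`` is an assumption of this sketch. It models the documented behavior and is not RGW's own code.

```python
from string import Template


def expand_target_path(target_path, variables, object_name):
    """Expand ${...} variables, then append the source object name."""
    prefix = Template(target_path).substitute(variables)
    # Assumption: the object name is appended after a "/" separator.
    return f"{prefix}/{object_name}"


# Hypothetical values for the variables listed above.
example_variables = {
    "sid": "1a2b",
    "zonegroup": "us",
    "zonegroup_id": "zg-0001",
    "zone": "us-east",
    "zone_id": "z-0001",
    "bucket": "photos",
    "owner": "alice",
}

expanded = expand_target_path(
    "rgwx-${zone}-${sid}/${owner}/${bucket}", example_variables, "cat.jpg"
)
```

Here ``expanded`` is ``rgwx-us-east-1a2b/alice/photos/cat.jpg``. ``Template.substitute`` raises ``KeyError`` for a variable that is not supplied, which is convenient for catching typos in a path template.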

* ``acl_profiles`` (array)

An array of ``acl_profile``.

* ``acl_profile`` (container)

Each profile contains ``acls_id`` (string) that represents the profile, and ``acls`` array that
holds a list of ``acl_mappings``.

* ``profiles`` (array)

A list of profiles. Each profile contains the following:

- ``source_bucket``: either a bucket name, or a bucket prefix (if ends with ``*``) that defines the source bucket(s) for this profile
- ``target_path``: as defined above
- ``connection_id``: ID of the connection that will be used for this profile
- ``acls_id``: ID of ACLs profile that will be used for this profile

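The ``source_bucket`` rule (exact name, or prefix when the value ends with ``*``) can be sketched in Python as follows. The helper names and the sample profiles are hypothetical. This document does not state which profile takes precedence when several match, so the sketch simply returns every matching profile.

```python
def bucket_matches(source_bucket, bucket_name):
    """Return True if bucket_name is selected by a profile's source_bucket."""
    if source_bucket.endswith("*"):
        # A trailing "*" turns the value into a bucket-name prefix.
        return bucket_name.startswith(source_bucket[:-1])
    return bucket_name == source_bucket


def matching_profiles(profiles, bucket_name):
    """Return all profiles whose source_bucket selects bucket_name."""
    return [p for p in profiles if bucket_matches(p["source_bucket"], bucket_name)]


# Hypothetical profiles, for illustration only.
example_profiles = [
    {"source_bucket": "logs-*", "connection_id": "conn-a", "acls_id": "acls-a"},
    {"source_bucket": "photos", "connection_id": "conn-b", "acls_id": "acls-b"},
]
```

For instance, ``logs-2021`` is selected by the first profile, ``photos`` by the second, and ``photos-raw`` by neither, because ``photos`` without a trailing ``*`` is an exact name.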

S3 Specific Configurables:
~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently cloud sync will only work with backends that are compatible with AWS S3. There are
a few configurables that can be used to tweak its behavior when accessing these cloud services:

::

    {
      "multipart_sync_threshold": {object_size},
      "multipart_min_part_size": {part_size}
    }


* ``multipart_sync_threshold`` (integer)

Objects this size or larger will be synced to the cloud using multipart upload.

* ``multipart_min_part_size`` (integer)

Minimum part size to use when syncing objects using multipart upload.

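The effect of the two settings can be sketched as follows. This is a simplified Python model, assuming that sizes are expressed in bytes. The function names are hypothetical, and the part-count calculation is only an upper bound that follows from the minimum part size; how RGW actually chooses part sizes is not described here.

```python
def uses_multipart(object_size, multipart_sync_threshold):
    """Objects at or above the threshold are synced with multipart upload."""
    return object_size >= multipart_sync_threshold


def max_part_count(object_size, multipart_min_part_size):
    """Upper bound on the number of parts when every part except possibly
    the last is at least multipart_min_part_size."""
    # Ceiling division using integers only.
    return -(-object_size // multipart_min_part_size)


MIB = 1024 * 1024
# Hypothetical settings, for illustration only.
threshold = 32 * MIB
min_part_size = 8 * MIB
```

With these hypothetical values, a 32 MiB object is uploaded in multiple parts (at most four of them), while an object one byte smaller is uploaded in a single request.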

How to Configure
~~~~~~~~~~~~~~~~

See :ref:`multisite` for multisite configuration instructions. The cloud sync module requires the creation of a new zone. The zone
tier type needs to be defined as ``cloud``:

::

    # radosgw-admin zone create --rgw-zonegroup={zone-group-name} \
                                --rgw-zone={zone-name} \
                                --endpoints={http://fqdn}[,{http://fqdn}] \
                                --tier-type=cloud


The tier configuration can then be done using the following command:

::

    # radosgw-admin zone modify --rgw-zonegroup={zone-group-name} \
                                 --rgw-zone={zone-name} \
                                 --tier-config={key}={val}[,{key}={val}]

The ``key`` in the configuration specifies the config variable that needs to be updated, and
the ``val`` specifies its new value. Nested values can be accessed using a period. For example:

::

    # radosgw-admin zone modify --rgw-zonegroup={zone-group-name} \
                                 --rgw-zone={zone-name} \
                                 --tier-config=connection.access_key={key},connection.secret={secret}

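When many nested values need to be set, the dotted ``key=val`` list can be generated from a nested structure. The Python sketch below uses a hypothetical helper, ``tier_config_argument``, to flatten nested dictionaries into the dotted form shown above. It assumes that values contain no commas or equals signs and it does not handle arrays.

```python
def flatten_tier_config(config, prefix=""):
    """Flatten nested dicts into a list of dotted "key=val" strings."""
    pairs = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested containers, joining names with a period.
            pairs.extend(flatten_tier_config(value, path))
        else:
            pairs.append(f"{path}={value}")
    return pairs


def tier_config_argument(config):
    """Build the --tier-config argument for radosgw-admin zone modify."""
    return "--tier-config=" + ",".join(flatten_tier_config(config))


# Hypothetical credentials, for illustration only.
argument = tier_config_argument(
    {"connection": {"access_key": "EXAMPLEKEY", "secret": "example-secret"}}
)
```

Here ``argument`` is ``--tier-config=connection.access_key=EXAMPLEKEY,connection.secret=example-secret``, which matches the shape of the command in the example above.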

Configuration array entries can be accessed by specifying the specific entry to be referenced enclosed
in square brackets, and a new array entry can be added by using ``[]``. An index value of ``-1`` references
the last entry in the array. At the moment it is not possible to create a new entry and reference it
again in the same command.
For example, creating a new profile for buckets starting with {prefix}:

::

    # radosgw-admin zone modify --rgw-zonegroup={zone-group-name} \
                                 --rgw-zone={zone-name} \
                                 --tier-config=profiles[].source_bucket={prefix}'*'

    # radosgw-admin zone modify --rgw-zonegroup={zone-group-name} \
                                 --rgw-zone={zone-name} \
                                 --tier-config=profiles[-1].connection_id={conn_id},profiles[-1].acls_id={acls_id}


An entry can be removed by using ``--tier-config-rm={key}``.

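
The bracket syntax for array entries can be modeled on a plain Python dictionary. The sketch below is a toy interpreter for the key syntax described in this section (periods for nesting, ``[]`` to append, ``[N]`` or ``[-1]`` to index). ``apply_setting`` is a hypothetical helper written for illustration; it is not the parser used by ``radosgw-admin``.

```python
import re

# One key segment: a name, optionally followed by "[]" or "[<integer>]".
_SEGMENT = re.compile(r"(\w+)(\[(-?\d+)?\])?")


def apply_setting(config, key, value):
    """Apply a single key=value pair to a nested dict and return the dict."""
    node = config
    segments = key.split(".")
    for i, segment in enumerate(segments):
        match = _SEGMENT.fullmatch(segment)
        if match is None:
            raise ValueError(f"unsupported key segment: {segment!r}")
        name, brackets, index = match.group(1), match.group(2), match.group(3)
        last = i == len(segments) - 1
        if brackets is None:
            # Plain field: set it, or descend into a nested container.
            if last:
                node[name] = value
            else:
                node = node.setdefault(name, {})
            continue
        array = node.setdefault(name, [])
        if index is None:
            # "[]" appends a new entry to the array.
            array.append(value if last else {})
            node = array[-1]
        elif last:
            array[int(index)] = value
        else:
            # "[N]" references an existing entry; -1 is the last one.
            node = array[int(index)]
    return config


# Mirrors the two commands above, with hypothetical values.
example = {}
apply_setting(example, "profiles[].source_bucket", "logs-*")
apply_setting(example, "profiles[-1].connection_id", "conn-a")
apply_setting(example, "profiles[-1].acls_id", "acls-a")
```

After these three calls, ``example`` holds a single profile with ``source_bucket``, ``connection_id``, and ``acls_id`` set, which is the outcome the two commands above aim for.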