=========================
ElasticSearch Sync Module
=========================

.. versionadded:: Kraken

.. note::
   As of 31 May 2020, only Elasticsearch 6 and lower are supported.
   ElasticSearch 7 is not supported.

This sync module writes the metadata from other zones to `ElasticSearch`_. As of
Luminous, the following JSON document shows the data fields that are stored in
ElasticSearch.

::

    {
      "_index" : "rgw-gold-ee5863d6",
      "_type" : "object",
      "_id" : "34137443-8592-48d9-8ca7-160255d52ade.34137.1:object1:null",
      "_score" : 1.0,
      "_source" : {
        "bucket" : "testbucket123",
        "name" : "object1",
        "instance" : "null",
        "versioned_epoch" : 0,
        "owner" : {
          "id" : "user1",
          "display_name" : "user1"
        },
        "permissions" : [
          "user1"
        ],
        "meta" : {
          "size" : 712354,
          "mtime" : "2017-05-04T12:54:16.462Z",
          "etag" : "7ac66c0f148de9519b8bd264312c4d64"
        }
      }
    }

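As a rough illustration of how a client might read such a document, the following Python sketch parses an abbreviated copy of the sample above and flattens the fields of interest. The helper name ``summarize_hit`` is hypothetical, and the timestamp format is assumed from the single ``mtime`` value shown in the sample.

```python
import json
from datetime import datetime

# Abbreviated copy of the sample document shown above.
SAMPLE_HIT = """
{
  "_index": "rgw-gold-ee5863d6",
  "_id": "34137443-8592-48d9-8ca7-160255d52ade.34137.1:object1:null",
  "_source": {
    "bucket": "testbucket123",
    "name": "object1",
    "instance": "null",
    "owner": {"id": "user1", "display_name": "user1"},
    "permissions": ["user1"],
    "meta": {
      "size": 712354,
      "mtime": "2017-05-04T12:54:16.462Z",
      "etag": "7ac66c0f148de9519b8bd264312c4d64"
    }
  }
}
"""


def summarize_hit(hit):
    """Return a flat summary of one indexed object document (hypothetical helper)."""
    source = hit["_source"]
    meta = source["meta"]
    # Assumption: mtime is a UTC timestamp with fractional seconds, as in the sample.
    mtime = datetime.strptime(meta["mtime"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "bucket": source["bucket"],
        "name": source["name"],
        "owner": source["owner"]["id"],
        "size": meta["size"],
        "mtime": mtime,
        "readers": list(source["permissions"]),
    }


summary = summarize_hit(json.loads(SAMPLE_HIT))
print(summary["bucket"], summary["name"], summary["size"])
```

The object metadata lives under ``_source``; the top-level ``_index``, ``_id`` and ``_score`` fields describe the search hit itself.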

ElasticSearch tier type configurables
-------------------------------------

* ``endpoint``

  Specifies the Elasticsearch server endpoint to access.

* ``num_shards`` (integer)

  The number of shards that Elasticsearch will be configured with on
  data sync initialization. Note that this cannot be changed after
  initialization. Any change here requires rebuilding the Elasticsearch
  index and reinitializing the data sync process.

* ``num_replicas`` (integer)

  The number of replicas that Elasticsearch will be configured with
  on data sync initialization.

* ``explicit_custom_meta`` (true | false)

  Specifies whether all user custom metadata will be indexed, or whether
  the user will need to configure (at the bucket level) which custom
  metadata entries should be indexed. This is false by default.

* ``index_buckets_list`` (comma separated list of strings)

  If empty, all buckets will be indexed. Otherwise, only buckets
  specified here will be indexed. It is possible to provide bucket
  prefixes (e.g., foo\*), or bucket suffixes (e.g., \*bar).

* ``approved_owners_list`` (comma separated list of strings)

  If empty, buckets of all owners will be indexed (subject to other
  restrictions). Otherwise, only buckets owned by the specified owners will
  be indexed. Suffixes and prefixes can also be provided.

* ``override_index_path`` (string)

  If not empty, this string will be used as the Elasticsearch index
  path. Otherwise, the index path will be determined and generated on
  sync initialization.

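The filtering rules for ``index_buckets_list`` and ``approved_owners_list`` can be pictured with a small Python sketch. This is only an illustration of the semantics described above (empty list means no restriction, ``foo*`` is a prefix, ``*bar`` is a suffix). It is not RGW's implementation, and the function names are hypothetical.

```python
def matches(name, pattern):
    """Match a name against an exact value, a prefix (foo*), or a suffix (*bar)."""
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return name.endswith(pattern[1:])
    return name == pattern


def parse_list(value):
    """Split a comma separated configurable into its non-empty entries."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def should_index(bucket, owner, index_buckets_list="", approved_owners_list=""):
    """Apply both filters; an empty list places no restriction."""
    buckets = parse_list(index_buckets_list)
    owners = parse_list(approved_owners_list)
    bucket_ok = not buckets or any(matches(bucket, p) for p in buckets)
    owner_ok = not owners or any(matches(owner, p) for p in owners)
    return bucket_ok and owner_ok


print(should_index("foo-logs", "alice", index_buckets_list="foo*,*bar"))
print(should_index("other", "alice", index_buckets_list="foo*,*bar"))
```

With the two patterns above, ``foo-logs`` is accepted through the ``foo*`` prefix, while ``other`` matches neither pattern and is skipped.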

End user metadata queries
-------------------------

.. versionadded:: Luminous

Since the ElasticSearch cluster now stores object metadata, it is important that
the ElasticSearch endpoint is not exposed to the public and is accessible only to
the cluster administrators. This poses a problem for exposing metadata queries to
end users: each user should be able to query only their own metadata and not
that of any other user, which would require the ElasticSearch cluster to
authenticate users in the same way that RGW does.

As of Luminous, RGW in the metadata master zone can service end user
requests. This avoids exposing the ElasticSearch endpoint publicly and
also solves the authentication and authorization problem, since RGW itself can
authenticate the end user requests. For this purpose, RGW introduces a new query
in the bucket APIs that can service ElasticSearch requests. All these requests
must be sent to the metadata master zone.

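The authorization requirement can be pictured conceptually: each indexed document carries a ``permissions`` list (see the sample document earlier on this page), and a user should only see documents that name them. The Python sketch below illustrates that idea on in-memory data. It is an assumption-laden illustration, not a description of how RGW enforces this internally, and ``visible_hits`` is a hypothetical name.

```python
def visible_hits(hits, user_id):
    """Keep only documents whose permissions list names user_id."""
    return [
        hit for hit in hits
        if user_id in hit["_source"].get("permissions", [])
    ]


# Made-up documents, reduced to the fields this illustration needs.
hits = [
    {"_source": {"name": "object1", "permissions": ["user1"]}},
    {"_source": {"name": "object2", "permissions": ["user2"]}},
    {"_source": {"name": "object3", "permissions": ["user1", "user2"]}},
]

names = [hit["_source"]["name"] for hit in visible_hits(hits, "user1")]
print(names)
```

Because RGW already authenticates the caller, it is in a position to apply this kind of restriction on the user's behalf, which a directly exposed ElasticSearch endpoint could not do.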

Syntax
~~~~~~

Get an elasticsearch query
``````````````````````````

::

   GET /{bucket}?query={query-expr}

Request parameters:

- ``max-keys``: maximum number of entries to return
- ``marker``: pagination marker

``expression := [(]<arg> <op> <value> [)][<and|or> ...]``

``op`` is one of the following: ``<``, ``<=``, ``==``, ``>=``, ``>``

For example ::

   GET /?query=name==foo

This returns all the indexed keys that are named ``foo`` and that the user has
read permission to.

The output will be a list of keys in XML that is similar to the S3
list buckets response.

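A client has to place the query expression and the optional parameters into the request URL. The following Python sketch builds the path and query string with the standard library. The function name ``build_query_path`` is hypothetical, standard percent-encoding is assumed, and authenticating the request (for example, S3 request signing) is deliberately left out.

```python
from urllib.parse import quote, urlencode


def build_query_path(query_expr, bucket="", max_keys=None, marker=None):
    """Build the path and query string for a metadata search request."""
    params = {"query": query_expr}
    if max_keys is not None:
        params["max-keys"] = str(max_keys)
    if marker is not None:
        params["marker"] = marker
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return "/" + quote(bucket) + "?" + urlencode(params, quote_via=quote)


print(build_query_path("name==foo"))
print(build_query_path("name==foo", bucket="mybucket", max_keys=10))
```

The first call corresponds to the ``GET /?query=name==foo`` example above, with the ``==`` operator percent-encoded as ``%3D%3D``.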

Configure custom metadata fields
````````````````````````````````

Define which custom metadata entries should be indexed (under the
specified bucket), and what the types of these keys are. If explicit
custom metadata indexing is configured, this is needed so that RGW
will index the specified custom metadata values. Otherwise, it is
needed only in cases where the indexed metadata keys are of a type other
than string.

::

   POST /{bucket}?mdsearch
   x-amz-meta-search: <key [; type]> [, ...]

Multiple metadata fields must be comma separated, and a type can be forced for a
field with a ``;``. The currently allowed types are string (default), integer, and
date.

For example, to index the custom object metadata ``x-amz-meta-year`` as int,
``x-amz-meta-release-date`` as date, and ``x-amz-meta-title`` as string:

::

   POST /mybooks?mdsearch
   x-amz-meta-search: x-amz-meta-year;int, x-amz-meta-release-date;date, x-amz-meta-title;string

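The header value follows a simple ``key;type`` pattern, so it can be assembled programmatically. The Python sketch below is a minimal illustration. The name ``build_mdsearch_header`` is hypothetical, and the accepted type spellings (``string``, ``int``, ``date``) are taken from the example request above and are an assumption about what the server accepts.

```python
# Assumption: type spellings as they appear in the example request.
ALLOWED_TYPES = {"string", "int", "date"}


def build_mdsearch_header(fields):
    """Build an x-amz-meta-search value from (key, type) pairs.

    A type of None omits the type, leaving the default (string).
    """
    parts = []
    for key, type_name in fields:
        if type_name is None:
            parts.append(key)
        elif type_name in ALLOWED_TYPES:
            parts.append(f"{key};{type_name}")
        else:
            raise ValueError(f"unsupported type for {key}: {type_name}")
    return ", ".join(parts)


header = build_mdsearch_header([
    ("x-amz-meta-year", "int"),
    ("x-amz-meta-release-date", "date"),
    ("x-amz-meta-title", "string"),
])
print(header)
```

The resulting string reproduces the ``x-amz-meta-search`` value from the ``POST /mybooks?mdsearch`` example.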

Delete custom metadata configuration
````````````````````````````````````

Delete the custom metadata configuration of a bucket.

::

   DELETE /{bucket}?mdsearch

Get custom metadata configuration
`````````````````````````````````

Retrieve the custom metadata configuration of a bucket.

::

   GET /{bucket}?mdsearch


.. _`Elasticsearch`: https://github.com/elastic/elasticsearch