=========================
Elasticsearch Sync Module
=========================

.. versionadded:: Kraken

.. note::
   As of 31 May 2020, only Elasticsearch 6 and lower are supported. Elasticsearch 7 is not supported.

This sync module writes the metadata from other zones to `Elasticsearch`_. As of
Luminous, the JSON document below shows the data fields that are currently
stored in Elasticsearch.

::

   {
     "_index" : "rgw-gold-ee5863d6",
     "_type" : "object",
     "_id" : "34137443-8592-48d9-8ca7-160255d52ade.34137.1:object1:null",
     "_score" : 1.0,
     "_source" : {
       "bucket" : "testbucket123",
       "name" : "object1",
       "instance" : "null",
       "versioned_epoch" : 0,
       "owner" : {
         "id" : "user1",
         "display_name" : "user1"
       },
       "permissions" : [
         "user1"
       ],
       "meta" : {
         "size" : 712354,
         "mtime" : "2017-05-04T12:54:16.462Z",
         "etag" : "7ac66c0f148de9519b8bd264312c4d64"
       }
     }
   }


Elasticsearch tier type configurables
-------------------------------------

* ``endpoint``

Specifies the Elasticsearch server endpoint to access.

* ``num_shards`` (integer)

The number of shards that Elasticsearch will be configured with on
data sync initialization. Note that this cannot be changed after
initialization. Any change here requires rebuilding the Elasticsearch
index and reinitializing the data sync process.

* ``num_replicas`` (integer)

The number of replicas that Elasticsearch will be configured with
on data sync initialization.

* ``explicit_custom_meta`` (true | false)

Specifies whether all user custom metadata will be indexed, or whether
the user will need to configure (at the bucket level) which custom
metadata entries should be indexed. This is false by default.

* ``index_buckets_list`` (comma separated list of strings)

If empty, all buckets will be indexed. Otherwise, only buckets
specified here will be indexed. It is possible to provide bucket
prefixes (e.g., ``foo*``), or bucket suffixes (e.g., ``*bar``).

* ``approved_owners_list`` (comma separated list of strings)

If empty, buckets of all owners will be indexed (subject to other
restrictions). Otherwise, only buckets owned by the specified owners
will be indexed. Suffixes and prefixes can also be provided.

* ``override_index_path`` (string)

If not empty, this string will be used as the Elasticsearch index
path. Otherwise, the index path will be determined and generated on
sync initialization.

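The prefix and suffix forms accepted by ``index_buckets_list`` can be
illustrated with a short Python sketch. It is not RGW's implementation: the
function names are hypothetical, and it assumes that a trailing ``*`` marks a
prefix, a leading ``*`` marks a suffix, and any other entry must match the
bucket name exactly.

.. code-block:: python

   def _matches(name, pattern):
       """Return True if ``name`` matches a single list entry."""
       if pattern.startswith("*"):
           # "*bar" form: match on the suffix after the asterisk.
           return name.endswith(pattern[1:])
       if pattern.endswith("*"):
           # "foo*" form: match on the prefix before the asterisk.
           return name.startswith(pattern[:-1])
       # No wildcard (assumption): the entry must equal the name.
       return name == pattern


   def bucket_is_indexed(bucket, index_buckets_list):
       """Decide whether ``bucket`` would be indexed for a given list value."""
       patterns = [p.strip() for p in index_buckets_list.split(",") if p.strip()]
       if not patterns:
           # An empty list means that all buckets are indexed.
           return True
       return any(_matches(bucket, p) for p in patterns)


   print(bucket_is_indexed("foo-logs", "foo*,*bar"))  # True
   print(bucket_is_indexed("images", "foo*,*bar"))    # False

The same matching idea would apply to ``approved_owners_list``, with owner
IDs in place of bucket names.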

End user metadata queries
-------------------------

.. versionadded:: Luminous

Since the Elasticsearch cluster now stores object metadata, it is important that
the Elasticsearch endpoint is not exposed to the public and is accessible only to
the cluster administrators. This poses a problem for exposing metadata queries
to end users: each user should be able to query only their own metadata and not
that of any other user, which would require the Elasticsearch cluster to
authenticate users in a way similar to RGW.

As of Luminous, RGW in the metadata master zone can service end user
requests. This avoids exposing the Elasticsearch endpoint publicly and
also solves the authentication and authorization problem, since RGW itself can
authenticate the end user requests. For this purpose, RGW introduces a new query
in the bucket APIs that can service Elasticsearch requests. All these requests
must be sent to the metadata master zone.

Syntax
~~~~~~

Get an Elasticsearch query
``````````````````````````

::

   GET /{bucket}?query={query-expr}

Request params:

- ``max-keys``: max number of entries to return
- ``marker``: pagination marker

``expression := [(]<arg> <op> <value> [)][<and|or> ...]``

``op`` is one of the following:
``<``, ``<=``, ``==``, ``>=``, ``>``

For example ::

  GET /?query=name==foo

This will return all the indexed keys that the user has read permission to
and that are named 'foo'.

The output will be a list of keys in XML that is similar to the S3
list buckets response.

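Because a query expression contains characters such as ``=`` (and possibly
spaces), a client will typically percent-encode it before placing it in the
URL. The following is a minimal sketch using only the Python standard
library. ``build_query_path`` is a hypothetical helper, not part of RGW or of
any SDK, and authentication of the request is left out entirely.

.. code-block:: python

   from urllib.parse import quote


   def build_query_path(expr, bucket="", max_keys=None, marker=None):
       """Build the path and query string for a metadata search request."""
       path = "/" + quote(bucket)
       # Encode the whole expression so "=", "<", ">" and spaces survive.
       params = ["query=" + quote(expr, safe="")]
       if max_keys is not None:
           params.append("max-keys=" + str(int(max_keys)))
       if marker is not None:
           params.append("marker=" + quote(marker, safe=""))
       return path + "?" + "&".join(params)


   print(build_query_path("name==foo"))
   # /?query=name%3D%3Dfoo

Passing ``bucket="mybooks"`` and ``max_keys=10`` to the same helper would
produce ``/mybooks?query=name%3D%3Dfoo&max-keys=10``.
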

Configure custom metadata fields
````````````````````````````````

Define which custom metadata entries should be indexed (under the
specified bucket), and what the types of these keys are. If explicit
custom metadata indexing is configured, this is needed so that RGW
will index the specified custom metadata values. Otherwise, it is
needed in cases where the indexed metadata keys are of a type other
than string.

::

   POST /{bucket}?mdsearch
   x-amz-meta-search: <key [; type]> [, ...]

Multiple metadata fields must be comma separated. A type can be forced for a
field with a ``;``. The currently allowed types are string (default), integer and
date.

For example, to index the custom object metadata ``x-amz-meta-year`` as ``int``,
``x-amz-meta-release-date`` as ``date`` and ``x-amz-meta-title`` as ``string``,
send:

::

   POST /mybooks?mdsearch
   x-amz-meta-search: x-amz-meta-year;int, x-amz-meta-release-date;date, x-amz-meta-title;string

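The header value follows a simple ``key;type`` pattern, so it can be assembled
from a mapping of field names to types. The sketch below is illustrative only:
``build_meta_search_header`` and ``ALLOWED_TYPES`` are hypothetical names, and
the accepted type spellings are an assumption taken from the text and the
example above (``integer`` in the prose, ``int`` in the header).

.. code-block:: python

   # Assumption: spellings taken from this page, not from the RGW source.
   ALLOWED_TYPES = {"string", "int", "integer", "date"}


   def build_meta_search_header(fields):
       """Build an x-amz-meta-search value from a {key: type or None} mapping."""
       parts = []
       for key, type_name in fields.items():
           if type_name is None:
               # No forced type: the field is indexed as a string by default.
               parts.append(key)
               continue
           if type_name not in ALLOWED_TYPES:
               raise ValueError("unsupported type: " + type_name)
           parts.append(key + ";" + type_name)
       return ", ".join(parts)


   header = build_meta_search_header({
       "x-amz-meta-year": "int",
       "x-amz-meta-release-date": "date",
       "x-amz-meta-title": "string",
   })
   print("x-amz-meta-search: " + header)

For the mapping shown, this prints the same header line as the ``mybooks``
example above.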

Delete custom metadata configuration
````````````````````````````````````

Delete custom metadata bucket configuration.

::

   DELETE /{bucket}?mdsearch

Get custom metadata configuration
`````````````````````````````````

Retrieve custom metadata bucket configuration.

::

   GET /{bucket}?mdsearch


.. _`Elasticsearch`: https://github.com/elastic/elasticsearch