==========================
RGW Data caching and CDN
==========================

.. versionadded:: Octopus

.. contents::

This feature adds to RGW the ability to securely cache objects with Nginx and so offload work from the cluster.
The first time an object is accessed, it is stored in the Nginx cache directory.
Once data is cached, it no longer needs to be fetched from RGW, but a permission check is still made against RGW to ensure that the requesting user has access.
The feature is built on the Nginx module ``ngx_http_auth_request_module``, on https://github.com/kaltura/nginx-aws-auth-module, and on OpenResty for its Lua capabilities.

Currently, this feature caches only AWSv4 requests (S3 requests only): the output of the first GET request is cached,
subsequent GET requests are served from the cache, and PUT, POST, HEAD, DELETE and COPY requests are passed through transparently.
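
As a rough illustration of the rule above, the following shell sketch classifies a request by its method and by whether its Authorization header starts with the AWSv4 scheme prefix ``AWS4-HMAC-SHA256``. The function name ``rgw_cache_action`` and its output labels are hypothetical; the real decision is made by the Nginx configuration and Lua file, not by a script like this.

```shell
# Hypothetical helper: classify a request the way the text describes.
# $1 = HTTP method, $2 = value of the Authorization header
rgw_cache_action() {
    case "$2" in
        AWS4-HMAC-SHA256*) ;;                # AWSv4-signed request, keep going
        *) echo "uncached"; return ;;        # the text only says these are not cached
    esac
    case "$1" in
        GET) echo "cache" ;;                 # cached on first GET, served from cache later
        PUT|POST|HEAD|DELETE|COPY) echo "pass-through" ;;
        *) echo "uncached" ;;
    esac
}

rgw_cache_action GET "AWS4-HMAC-SHA256 Credential=..."
rgw_cache_action PUT "AWS4-HMAC-SHA256 Credential=..."
```

The two calls show the cacheable case and one of the pass-through methods.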


The feature introduces two new APIs: Auth and Cache.

NOTE: The `D3N RGW Data Cache`_ is an alternative data caching mechanism implemented natively in the RADOS Gateway.

New APIs
-------------------------

There are two new APIs for this feature:

Auth API - The cache uses this API to validate that a user can access the cached data.

Cache API - Makes it possible to securely override the Range header, so that Nginx can use its own smart cache on top of S3:
https://www.nginx.com/blog/smart-efficient-byte-range-caching-nginx/
With this API, Nginx can read ahead in an object when a client asks for a specific range of it.
On subsequent accesses to the cached object, Nginx will satisfy requests for already-cached ranges from the cache. Uncached ranges will be read from RGW (and cached).

Auth API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This API validates a specific authenticated access being made to the cache, using RGW's knowledge of the client credentials and stored access policy.
It returns success if the encapsulated request would be granted.

Cache API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This API allows a privileged user, the cache user, to change signed Range headers.

Creating the cache user:

::

   $ radosgw-admin user create --uid=<uid for cache user> --display-name="cache user" --caps="amz-cache=read"

This user can send RGW the Cache API header ``X-Amz-Cache``, which carries the headers of the original request (as they were before the Range header was changed).
In other words, ``X-Amz-Cache`` is built from several headers.
The headers that make up ``X-Amz-Cache`` are separated from one another by the character with code 177 (0xB1), and each header name is separated from its value by the character with code 178 (0xB2).
RGW checks that the sender is an authorized user and that it is a cache user.
If so, RGW uses the headers carried in ``X-Amz-Cache`` to revalidate that the original user has the required permissions.
During this flow, RGW overrides the Range header.
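
The header layout described above can be sketched in shell. This is only an illustration, under the assumption that the separators are single bytes with values 177 and 178 used in the roles the text gives them. ``build_x_amz_cache`` is a hypothetical helper; in the real setup the header is assembled by ``nginx-lua-file.lua``, which remains the authoritative reference.

```shell
# Sketch only: join name/value pairs into an X-Amz-Cache style value.
# Separator roles follow the description in the text; check them against
# nginx-lua-file.lua before relying on this.
HDR_SEP=$(printf '\261')   # octal 261 = decimal 177, between headers
KV_SEP=$(printf '\262')    # octal 262 = decimal 178, between name and value

# Usage: build_x_amz_cache name1 value1 name2 value2 ...
build_x_amz_cache() {
    out=""
    while [ "$#" -ge 2 ]; do
        if [ -n "$out" ]; then
            out="${out}${HDR_SEP}"
        fi
        out="${out}$1${KV_SEP}$2"
        shift 2
    done
    printf '%s' "$out"
}

# Dump the result as hex so the two separator bytes (b1, b2) are visible.
build_x_amz_cache host rgw1:8000 range bytes=0-4095 | od -An -tx1
```

The hex dump is used because the separator bytes are not printable.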


Using Nginx with RGW
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Download the OpenResty source:

::

   $ wget https://openresty.org/download/openresty-1.15.8.3.tar.gz

Clone the AWS auth Nginx module:

::

   $ git clone https://github.com/kaltura/nginx-aws-auth-module

Unpack the OpenResty package:

::

   $ tar xvzf openresty-1.15.8.3.tar.gz
   $ cd openresty-1.15.8.3

Compile OpenResty. Make sure that the PCRE and OpenSSL development libraries are installed:

::

   $ sudo yum install pcre-devel openssl-devel gcc curl zlib-devel nginx
   $ ./configure --add-module=<the nginx-aws-auth-module dir> --with-http_auth_request_module --with-http_slice_module --conf-path=/etc/nginx/nginx.conf
   $ gmake -j $(nproc)
   $ sudo gmake install
   $ sudo ln -sf /usr/local/openresty/bin/openresty /usr/bin/nginx
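
After installation, it can be useful to confirm that the binary was built with the flags passed to ``./configure`` above. ``nginx -V`` prints the configure arguments on stderr. The helper ``check_build_flags`` below is a hypothetical sketch that scans such output, and the sample input line stands in for a real run.

```shell
# Reads `nginx -V` style output on stdin and reports whether the two
# --with-... flags from the build step are present.
check_build_flags() {
    args=$(cat)
    for flag in --with-http_auth_request_module --with-http_slice_module; do
        case "$args" in
            *"$flag"*) echo "ok      $flag" ;;
            *)         echo "MISSING $flag" ;;
        esac
    done
}

# Real use (nginx -V writes to stderr, hence 2>&1):
#   nginx -V 2>&1 | check_build_flags
# Stand-in input so the sketch runs without Nginx installed:
echo "configure arguments: --with-http_auth_request_module --with-http_slice_module" | check_build_flags
```

A ``MISSING`` line would suggest that the build was configured without that module.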

Put your Nginx configuration files in place and edit them according to your environment.

All Nginx conf files are under: https://github.com/ceph/ceph/tree/master/examples/rgw-cache

``nginx.conf`` should go to ``/etc/nginx/nginx.conf``

``nginx-lua-file.lua`` should go to ``/etc/nginx/nginx-lua-file.lua``

``nginx-default.conf`` should go to ``/etc/nginx/conf.d/nginx-default.conf``

The parameters most likely to need adjusting for your environment are in the file ``nginx-default.conf``.

Modify the example values of *proxy_cache_path* and *max_size* at:

::

   proxy_cache_path /data/cache levels=2:2:2 keys_zone=mycache:999m max_size=20G inactive=1d use_temp_path=off;


And modify the example *server* values to point to the RGW URIs:

::

   server rgw1:8000 max_fails=2 fail_timeout=5s;
   server rgw2:8000 max_fails=2 fail_timeout=5s;
   server rgw3:8000 max_fails=2 fail_timeout=5s;

| It is important to replace the *access key* and *secret key* in ``nginx.conf`` with those belonging to the user that has the ``amz-cache`` caps.
| For example, create the ``cache`` user as follows:

::

   radosgw-admin user create --uid=cacheuser --display-name="cache user" --caps="amz-cache=read" --access-key <access> --secret <secret>

It is also possible to use Nginx slicing, which is a better method for streaming purposes.

To use slicing, use ``nginx-slicing.conf`` instead of ``nginx-default.conf``.

Further information about Nginx slicing:

https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/#byte-range-caching
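
To give a feel for what slicing does: with the Nginx slice module, a client range is served from fixed-size, boundary-aligned sub-ranges, each cached on its own. The sketch below lists the aligned ranges that cover a client range. The 1 MiB slice size and the name ``slice_ranges`` are assumptions for illustration, not values taken from ``nginx-slicing.conf``.

```shell
# Sketch: list the slice-aligned byte ranges that cover a client range.
# SLICE is a hypothetical slice size in bytes (1 MiB).
SLICE=1048576

# Usage: slice_ranges FIRST_BYTE LAST_BYTE
slice_ranges() {
    start=$(( $1 / SLICE * SLICE ))          # round down to a slice boundary
    while [ "$start" -le "$2" ]; do
        echo "bytes=${start}-$(( start + SLICE - 1 ))"
        start=$(( start + SLICE ))
    done
}

slice_ranges 0 4095              # fits inside the first slice
slice_ranges 1048000 1049000     # straddles the first slice boundary
```

Two client ranges that fall inside the same slice map to the same sub-range, so the second one can be answered from the cache. The sketch ignores the object size; a real final slice is cut short at the end of the object.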


If you do not want to use prefetch caching, you can replace ``nginx-default.conf`` with ``nginx-noprefetch.conf``.
With ``noprefetch``, if a client sends a range request for 0-4095 and then one for 0-4096, Nginx caches those requests separately, so it has to fetch the data twice.
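
The double fetch can be seen in a toy model. The sketch assumes that, without prefetch, a cache entry is keyed on the URI plus the literal Range value. That is an assumption made for illustration, not a reading of ``nginx-noprefetch.conf``, and ``fetch`` is a hypothetical stand-in for a request going through the cache.

```shell
# Toy model of a cache keyed on "URI|Range" (assumed key layout).
CACHE_KEYS=""
UPSTREAM_FETCHES=0

# Usage: fetch URI RANGE
fetch() {
    key="$1|$2"
    case " $CACHE_KEYS " in
        *" $key "*)
            echo "HIT  $key" ;;
        *)
            CACHE_KEYS="$CACHE_KEYS $key"                  # remember the new entry
            UPSTREAM_FETCHES=$(( UPSTREAM_FETCHES + 1 ))   # had to go to RGW
            echo "MISS $key" ;;
    esac
}

fetch /bucket/obj bytes=0-4095
fetch /bucket/obj bytes=0-4096   # one byte longer, so a different key
fetch /bucket/obj bytes=0-4095   # exact repeat of the first request
echo "upstream fetches: $UPSTREAM_FETCHES"
```

Only the exact repeat is a hit; the two nearly identical ranges each cost a fetch from RGW.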


Run Nginx (OpenResty):

::

   $ sudo systemctl restart nginx

Appendix
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**A note about performance:** In certain cases, such as a development environment, disabling authentication by commenting out the following line in ``nginx-default.conf``:

::

   #auth_request /authentication;

may (depending on the hardware) increase performance significantly, because it forgoes the Auth API calls to radosgw.


.. _D3N RGW Data Cache: ../d3n_datacache/
153
154
155 .. _D3N RGW Data Cache: ../d3n_datacache/