.. _rgw_dynamic_bucket_index_resharding:

===================================
RGW Dynamic Bucket Index Resharding
===================================

.. versionadded:: Luminous

A large bucket index can lead to performance problems. In order
to address this problem we introduced bucket index sharding.
Until Luminous, changing the number of bucket index shards
(resharding) could only be done offline. Starting with Luminous we
support online bucket resharding.

Each bucket index shard can handle its entries efficiently up until
it reaches a certain threshold number of entries. If this threshold
is exceeded the system can suffer from performance issues. The
dynamic resharding feature detects this situation and automatically
increases the number of shards used by the bucket index, resulting in
a reduction of the number of entries in each bucket index shard. This
process is transparent to the user. During the resharding process,
write I/Os to the target bucket are blocked, but read I/Os are not.
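
For example, with the default threshold of 100000 entries per shard
(see ``rgw_max_objs_per_shard`` below), a bucket holding 1500000
objects across 11 shards averages roughly 136000 entries per shard
and would therefore be selected for dynamic resharding.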

By default dynamic bucket index resharding can only increase the
number of bucket index shards to 1999, although this upper bound is a
configuration parameter (see Configuration below). When possible, the
process chooses a prime number of bucket index shards in order to
spread the number of bucket index entries across the bucket index
shards more evenly.

The detection process runs in a background thread that periodically
scans all the buckets. A bucket that requires resharding is added to
the resharding queue and will be scheduled for resharding later. A
reshard thread runs in the background and executes the scheduled
resharding tasks, one at a time.

Multisite
=========

Dynamic resharding is not supported in a multisite environment.


Configuration
=============

Enable/disable dynamic bucket index resharding:

- ``rgw_dynamic_resharding``: true/false, default: true

Configuration options that control the resharding process:

- ``rgw_max_objs_per_shard``: maximum number of objects per bucket index shard before resharding is triggered, default: 100000 objects

- ``rgw_max_dynamic_shards``: maximum number of shards that dynamic bucket index resharding can increase to, default: 1999

- ``rgw_reshard_bucket_lock_duration``: duration, in seconds, of the lock on the bucket object during resharding, default: 360 seconds (i.e., 6 minutes)

- ``rgw_reshard_thread_interval``: maximum time, in seconds, between rounds of resharding queue processing, default: 600 seconds (i.e., 10 minutes)

- ``rgw_reshard_num_logs``: number of shards for the resharding queue, default: 16
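
These settings can also be changed at runtime on clusters that use
the centralized configuration database. As a minimal sketch, assuming
the RGW daemons read their configuration from the monitors (the
values shown are illustrative, not recommendations):

::

   # ceph config set client.rgw rgw_dynamic_resharding false
   # ceph config set client.rgw rgw_max_objs_per_shard 200000

The first command disables dynamic resharding entirely; the second
raises the per-shard object threshold to 200000 objects. The same
options can instead be set in the ``[client.rgw]`` section of
``ceph.conf``.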

Admin commands
==============

Add a bucket to the resharding queue
------------------------------------

::

   # radosgw-admin reshard add --bucket <bucket_name> --num-shards <new number of shards>
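
For example, to queue a hypothetical bucket named ``mybucket`` to be
resharded to 23 shards:

::

   # radosgw-admin reshard add --bucket mybucket --num-shards 23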

List resharding queue
---------------------

::

   # radosgw-admin reshard list
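
The queue is reported as a JSON array. As an illustrative sketch only
(the exact field names and values are an assumption and may differ
between releases), a queued entry might look like:

::

   [
       {
           "time": "2021-04-05 06:07:08.123456Z",
           "tenant": "",
           "bucket_name": "mybucket",
           "bucket_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
           "new_instance_id": "",
           "old_num_shards": 11,
           "new_num_shards": 23
       }
   ]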

Process tasks on the resharding queue
-------------------------------------

::

   # radosgw-admin reshard process

Bucket resharding status
------------------------

::

   # radosgw-admin reshard status --bucket <bucket_name>

The output is a JSON array with one object per shard, each containing
three fields: ``reshard_status``, ``new_bucket_instance_id``, and
``num_shards``.

For example, the output at different dynamic resharding stages is shown below:

``1. Before resharding occurred:``
::

   [
       {
           "reshard_status": "not-resharding",
           "new_bucket_instance_id": "",
           "num_shards": -1
       }
   ]

``2. During resharding:``
::

   [
       {
           "reshard_status": "in-progress",
           "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
           "num_shards": 2
       },
       {
           "reshard_status": "in-progress",
           "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
           "num_shards": 2
       }
   ]

``3. After resharding completed:``
::

   [
       {
           "reshard_status": "not-resharding",
           "new_bucket_instance_id": "",
           "num_shards": -1
       },
       {
           "reshard_status": "not-resharding",
           "new_bucket_instance_id": "",
           "num_shards": -1
       }
   ]


Cancel pending bucket resharding
--------------------------------

Note: Ongoing bucket resharding operations cannot be cancelled. ::

   # radosgw-admin reshard cancel --bucket <bucket_name>

Manual immediate bucket resharding
----------------------------------

::

   # radosgw-admin bucket reshard --bucket <bucket_name> --num-shards <new number of shards>

When choosing a number of shards, the administrator should keep a
number of items in mind. Ideally the administrator is aiming for no
more than 100000 entries per shard, both now and through some future
point in time.

Additionally, a prime number of bucket index shards tends to work
better at evenly distributing bucket index entries across the
shards. For example, 7001 bucket index shards is better than 7000
since the former is prime. A variety of web sites have lists of prime
numbers; search for "list of prime numbers" with your favorite web
search engine to locate some.
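
As a worked example, consider a bucket expected to grow to roughly
2000000 objects: 2000000 / 100000 = 20 shards at a minimum, so the
next prime, 23, is a reasonable choice. The current object count can
be checked beforehand with ``radosgw-admin bucket stats --bucket
<bucket_name>``, and the resharding then performed (the bucket name
here is hypothetical):

::

   # radosgw-admin bucket reshard --bucket mybucket --num-shards 23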

Troubleshooting
===============

Clusters prior to Luminous 12.2.11 and Mimic 13.2.5 left behind stale
bucket instance entries, which were not automatically cleaned up. The
issue also affected lifecycle policies, which were no longer applied
to resharded buckets. Both of these issues can be worked around using
a couple of ``radosgw-admin`` commands.

Stale instance management
-------------------------

List the stale instances in a cluster that are ready to be cleaned up.

::

   # radosgw-admin reshard stale-instances list

Clean up the stale instances in a cluster. Note: cleanup of these
instances should only be done on a single-site cluster.

::

   # radosgw-admin reshard stale-instances rm


Lifecycle fixes
---------------

For clusters that had resharded instances, it is highly likely that
the old lifecycle process would have flagged and deleted lifecycle
processing for a bucket when the bucket instance changed during a
reshard. While this is fixed for newer clusters (from Mimic 13.2.6
and Luminous 12.2.12), older buckets that had lifecycle policies and
that have undergone resharding must be fixed manually.

The command to do so is:

::

   # radosgw-admin lc reshard fix --bucket {bucketname}


As a convenience wrapper, if the ``--bucket`` argument is dropped then this
command will try to fix lifecycle policies for all the buckets in the cluster.
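
For example, to attempt the fix across every bucket in the cluster:

::

   # radosgw-admin lc reshard fix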

Object Expirer fixes
--------------------

Objects subject to Swift object expiration on older clusters may have
been dropped from the log pool and never deleted after the bucket was
resharded. This would happen if their expiration time was before the
cluster was upgraded, but if their expiration was after the upgrade
the objects would be correctly handled. To manage these expire-stale
objects, ``radosgw-admin`` provides two subcommands.

Listing:

::

   # radosgw-admin objects expire-stale list --bucket {bucketname}

Displays a list of object names and expiration times in JSON format.

Deleting:

::

   # radosgw-admin objects expire-stale rm --bucket {bucketname}


Initiates deletion of such objects, displaying a list of object names,
expiration times, and deletion status in JSON format.