.. _rgw_dynamic_bucket_index_resharding:

===================================
RGW Dynamic Bucket Index Resharding
===================================

.. versionadded:: Luminous

A large bucket index can lead to performance problems. In order
to address this problem we introduced bucket index sharding.
Until Luminous, changing the number of bucket shards (resharding)
needed to be done offline. Starting with Luminous we support
online bucket resharding.

Each bucket index shard can handle its entries efficiently up until
reaching a certain threshold number of entries. If this threshold is
exceeded the system can suffer from performance issues. The dynamic
resharding feature detects this situation and automatically increases
the number of shards used by the bucket index, resulting in a
reduction of the number of entries in each bucket index shard. This
process is transparent to the user. During the resharding process,
write I/Os to the target bucket are blocked, but read I/Os are not.

By default dynamic bucket index resharding can only increase the
number of bucket index shards to 1999, although this upper bound is a
configuration parameter (see Configuration below). When
possible, the process chooses a prime number of bucket index shards to
spread the number of bucket index entries across the bucket index
shards more evenly.
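
The benefit of a prime shard count can be seen with a small illustrative
sketch (this is a simplified model using plain modulo placement, not RGW's
actual index-placement code): when key hashes happen to share a common
factor with the shard count, a composite count leaves shards empty, while a
prime count uses all of them.

```python
# Illustrative only: compare how evenly entries spread across a composite
# vs. a prime number of shards when the hashes share a common factor.
from collections import Counter

def shard_distribution(hashes, num_shards):
    """Count how many entries land in each shard under modulo placement."""
    return Counter(h % num_shards for h in hashes)

# Pathological-but-possible workload: every hash is a multiple of 4.
hashes = [4 * i for i in range(10000)]

composite = shard_distribution(hashes, 8)  # 8 shares the factor 4
prime = shard_distribution(hashes, 7)      # 7 is prime

print(len(composite), "of 8 shards used")  # only 2 shards receive entries
print(len(prime), "of 7 shards used")      # all 7 shards receive entries
```

With 8 shards, every entry lands on shard 0 or 4; with 7 shards, the same
workload covers every shard.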

The detection process runs in a background process that periodically
scans all the buckets. A bucket that requires resharding is added to
the resharding queue and will be scheduled to be resharded later. The
reshard thread runs in the background and executes the scheduled
resharding tasks, one at a time.
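
The detection step amounts to a simple threshold comparison. The sketch
below is illustrative only (the function name and structure are not RGW
internals), assuming the default ``rgw_max_objs_per_shard`` of 100000:

```python
# Illustrative sketch of the resharding trigger condition; names here are
# hypothetical and do not correspond to RGW internals.
RGW_MAX_OBJS_PER_SHARD = 100000  # default threshold per bucket index shard

def needs_reshard(num_objects: int, num_shards: int) -> bool:
    """A bucket qualifies for resharding once the average number of
    entries per bucket index shard exceeds the configured threshold."""
    return num_objects > num_shards * RGW_MAX_OBJS_PER_SHARD

# A bucket with 250000 objects on a single shard exceeds the threshold:
print(needs_reshard(250_000, 1))  # True
print(needs_reshard(250_000, 3))  # False
```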

Multisite
=========

Prior to the Reef release, RGW did not support dynamic resharding in a
multisite environment. For information on dynamic resharding, see
:ref:`Resharding <feature_resharding>` in the RGW multisite documentation.

Configuration
=============

Enable/Disable dynamic bucket index resharding:

- ``rgw_dynamic_resharding``: true/false, default: true

Configuration options that control the resharding process:

- ``rgw_max_objs_per_shard``: maximum number of objects per bucket index shard before resharding is triggered, default: 100000 objects

- ``rgw_max_dynamic_shards``: maximum number of shards that dynamic bucket index resharding can increase to, default: 1999

- ``rgw_reshard_bucket_lock_duration``: duration, in seconds, of the lock on the bucket object during resharding, default: 360 seconds (i.e., 6 minutes)

- ``rgw_reshard_thread_interval``: maximum time, in seconds, between rounds of resharding queue processing, default: 600 seconds (i.e., 10 minutes)

- ``rgw_reshard_num_logs``: number of shards for the resharding queue, default: 16

Admin commands
==============

Add a bucket to the resharding queue
------------------------------------

::

  # radosgw-admin reshard add --bucket <bucket_name> --num-shards <new number of shards>

List resharding queue
---------------------

::

  # radosgw-admin reshard list

Process tasks on the resharding queue
-------------------------------------

::

  # radosgw-admin reshard process

Bucket resharding status
------------------------

::

  # radosgw-admin reshard status --bucket <bucket_name>

The output is a JSON array with one entry per shard; each entry contains
three fields (reshard_status, new_bucket_instance_id, num_shards).

For example, the output at different Dynamic Resharding stages is shown below:

``1. Before resharding occurred:``
::

  [
    {
        "reshard_status": "not-resharding",
        "new_bucket_instance_id": "",
        "num_shards": -1
    }
  ]

``2. During resharding:``
::

  [
    {
        "reshard_status": "in-progress",
        "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
        "num_shards": 2
    },
    {
        "reshard_status": "in-progress",
        "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
        "num_shards": 2
    }
  ]

``3. After resharding completed:``
::

  [
    {
        "reshard_status": "not-resharding",
        "new_bucket_instance_id": "",
        "num_shards": -1
    },
    {
        "reshard_status": "not-resharding",
        "new_bucket_instance_id": "",
        "num_shards": -1
    }
  ]

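Because the status output is plain JSON, it is easy to consume from scripts.
The snippet below is a sketch (the function name is illustrative) that checks
whether any shard of a bucket is still mid-reshard, given the captured output
of ``radosgw-admin reshard status``:

```python
import json

def reshard_in_progress(status_json: str) -> bool:
    """Return True if any shard in the status output reports an
    in-progress reshard."""
    return any(entry["reshard_status"] == "in-progress"
               for entry in json.loads(status_json))

# Sample captured from `radosgw-admin reshard status --bucket <bucket_name>`:
sample = '''[
  {"reshard_status": "in-progress",
   "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
   "num_shards": 2}
]'''
print(reshard_in_progress(sample))  # True
```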
Cancel pending bucket resharding
--------------------------------

Note: Ongoing bucket resharding operations cannot be cancelled. ::

  # radosgw-admin reshard cancel --bucket <bucket_name>

Manual immediate bucket resharding
----------------------------------

::

  # radosgw-admin bucket reshard --bucket <bucket_name> --num-shards <new number of shards>

When choosing a number of shards, the administrator should keep a
number of items in mind. Ideally the administrator should aim for no
more than 100000 entries per shard, both now and through some future
point in time.

Additionally, a prime number of bucket index shards tends to
distribute bucket index entries more evenly across the shards. For
example, 7001 bucket index shards is better than 7000 since the former
is prime. A variety of web sites have lists of prime numbers; search
for "list of prime numbers" with your favorite web search engine to
locate some.
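
The two rules above (at most 100000 entries per shard, prefer a prime
count) can be combined into a small helper. This is an administrator-side
convenience sketch, not code that RGW itself runs:

```python
import math

def next_prime(n: int) -> int:
    """Smallest prime >= n; trial division is fast enough at these sizes."""
    def is_prime(k: int) -> bool:
        if k < 2:
            return False
        return all(k % d for d in range(2, math.isqrt(k) + 1))
    while not is_prime(n):
        n += 1
    return n

def suggest_shard_count(expected_objects: int,
                        objs_per_shard: int = 100000) -> int:
    """Enough shards to stay under objs_per_shard entries each,
    rounded up to the next prime."""
    return next_prime(math.ceil(expected_objects / objs_per_shard))

print(suggest_shard_count(700_000_000))  # 7001
```

For 700 million expected objects this yields 7001 shards, the prime just
above the 7000 that the per-shard target alone would require.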

Troubleshooting
===============

Clusters prior to Luminous 12.2.11 and Mimic 13.2.5 left behind stale bucket
instance entries, which were not automatically cleaned up. The issue also
affected lifecycle policies, which were no longer applied to resharded
buckets. Both of these issues can be worked around using a couple of
radosgw-admin commands.

Stale instance management
-------------------------

List the stale instances in a cluster that are ready to be cleaned up:

::

  # radosgw-admin reshard stale-instances list

Clean up the stale instances in a cluster. Note: cleanup of these
instances should only be done on a single-site cluster.

::

  # radosgw-admin reshard stale-instances rm

192 | ||
193 | Lifecycle fixes | |
194 | --------------- | |
195 | ||
81eedcae TL |
196 | For clusters that had resharded instances, it is highly likely that the old |
197 | lifecycle processes would have flagged and deleted lifecycle processing as the | |
11fdf7f2 | 198 | bucket instance changed during a reshard. While this is fixed for newer clusters |
81eedcae TL |
199 | (from Mimic 13.2.6 and Luminous 12.2.12), older buckets that had lifecycle policies and |
200 | that have undergone resharding will have to be manually fixed. | |
201 | ||
202 | The command to do so is: | |
11fdf7f2 TL |
203 | |
204 | :: | |
205 | ||
206 | # radosgw-admin lc reshard fix --bucket {bucketname} | |
207 | ||
208 | ||
209 | As a convenience wrapper, if the ``--bucket`` argument is dropped then this | |
81eedcae TL |
210 | command will try and fix lifecycle policies for all the buckets in the cluster. |
Object Expirer fixes
--------------------

Objects subject to Swift object expiration on older clusters may have
been dropped from the log pool and never deleted after the bucket was
resharded. This affected objects whose expiration time came before the
cluster was upgraded; objects expiring after the upgrade were handled
correctly. To manage these expire-stale objects, radosgw-admin
provides two subcommands.

Listing:

::

  # radosgw-admin objects expire-stale list --bucket {bucketname}

Displays a list of object names and expiration times in JSON format.

Deleting:

::

  # radosgw-admin objects expire-stale rm --bucket {bucketname}

Initiates deletion of such objects, displaying a list of object names,
expiration times, and deletion status in JSON format.