.. _rgw_dynamic_bucket_index_resharding:

===================================
RGW Dynamic Bucket Index Resharding
===================================

.. versionadded:: Luminous

A large bucket index can lead to performance problems. In order
to address this problem we introduced bucket index sharding.
Until Luminous, changing the number of bucket shards (resharding)
needed to be done offline. Starting with Luminous we support
online bucket resharding.

Each bucket index shard can handle its entries efficiently up until
reaching a certain threshold number of entries. If this threshold is
exceeded the system can encounter performance issues. The dynamic
resharding feature detects this situation and automatically increases
the number of shards used by the bucket index, resulting in a
reduction of the number of entries in each bucket index shard. This
process is transparent to the user.

By default dynamic bucket index resharding can only increase the
number of bucket index shards to 1999, although this upper bound is a
configuration parameter (see Configuration below). Furthermore, when
possible, the process chooses a prime number of bucket index shards to
help spread the bucket index entries across the shards more evenly.

Detection runs in a background process that periodically scans all
the buckets. A bucket that requires resharding is added to the
resharding queue and will be scheduled to be resharded later. The
reshard thread runs in the background and executes the scheduled
resharding tasks, one at a time.
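
As a rough sketch (this is illustrative only, not RGW's actual code, which is not exposed as a Python API), the trigger condition amounts to comparing a bucket's object count against the per-shard threshold multiplied by the current shard count:

```python
# Illustrative sketch of the dynamic resharding trigger condition.
# The default threshold mirrors rgw_max_objs_per_shard (see Configuration).
def needs_resharding(num_objects, num_shards, max_objs_per_shard=100000):
    """True if the bucket's index exceeds the per-shard object threshold."""
    return num_objects > num_shards * max_objs_per_shard

print(needs_resharding(150000, 1))  # one shard holding 150000 objects -> True
print(needs_resharding(150000, 2))  # two shards can hold up to 200000 -> False
```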

Multisite
=========

Dynamic resharding is not supported in a multisite environment.


Configuration
=============

Enable/Disable dynamic bucket index resharding:

- ``rgw_dynamic_resharding``: true/false, default: true

Configuration options that control the resharding process:

- ``rgw_max_objs_per_shard``: maximum number of objects per bucket index shard before resharding is triggered, default: 100000 objects

- ``rgw_max_dynamic_shards``: maximum number of shards that dynamic bucket index resharding can increase to, default: 1999

- ``rgw_reshard_bucket_lock_duration``: duration, in seconds, of the lock held on the bucket object during resharding, default: 360 seconds (i.e., 6 minutes)

- ``rgw_reshard_thread_interval``: maximum time, in seconds, between rounds of resharding queue processing, default: 600 seconds (i.e., 10 minutes)

- ``rgw_reshard_num_logs``: number of shards for the resharding queue, default: 16
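
For illustration, these options can be set in ``ceph.conf``. The section name ``client.rgw.gateway`` below is a placeholder for your actual RGW instance name, and the values shown simply restate the defaults listed above:

```ini
# Hypothetical RGW instance section; substitute your gateway's name.
[client.rgw.gateway]
# Enable dynamic bucket index resharding (the default).
rgw_dynamic_resharding = true
# Trigger resharding above this many objects per shard.
rgw_max_objs_per_shard = 100000
# Upper bound on dynamically chosen shard counts.
rgw_max_dynamic_shards = 1999
```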

Admin commands
==============

Add a bucket to the resharding queue
------------------------------------

::

  # radosgw-admin reshard add --bucket <bucket_name> --num-shards <new number of shards>

List resharding queue
---------------------

::

  # radosgw-admin reshard list

Process tasks on the resharding queue
-------------------------------------

::

  # radosgw-admin reshard process

Bucket resharding status
------------------------

::

  # radosgw-admin reshard status --bucket <bucket_name>

The output is a JSON array with one entry per shard; each entry contains three fields: ``reshard_status``, ``new_bucket_instance_id``, and ``num_shards``.

For example, the output at different dynamic resharding stages is shown below:

``1. Before resharding occurred:``
::

  [
      {
          "reshard_status": "not-resharding",
          "new_bucket_instance_id": "",
          "num_shards": -1
      }
  ]

``2. During resharding:``
::

  [
      {
          "reshard_status": "in-progress",
          "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
          "num_shards": 2
      },
      {
          "reshard_status": "in-progress",
          "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
          "num_shards": 2
      }
  ]

``3. After resharding completed:``
::

  [
      {
          "reshard_status": "not-resharding",
          "new_bucket_instance_id": "",
          "num_shards": -1
      },
      {
          "reshard_status": "not-resharding",
          "new_bucket_instance_id": "",
          "num_shards": -1
      }
  ]


Cancel pending bucket resharding
--------------------------------

Note: Ongoing bucket resharding operations cannot be cancelled. ::

  # radosgw-admin reshard cancel --bucket <bucket_name>

Manual immediate bucket resharding
----------------------------------

::

  # radosgw-admin bucket reshard --bucket <bucket_name> --num-shards <new number of shards>

When choosing a number of shards, the administrator should keep a
few considerations in mind. Ideally the aim is for no more than
100000 entries per shard, both now and at some future point in time.

Additionally, a prime number of bucket index shards tends to
distribute bucket index entries across the shards more evenly. For
example, 7001 bucket index shards is better than 7000 since the
former is prime. A variety of web sites have lists of prime numbers;
search for "list of prime numbers" with your favorite search engine
to locate some.
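
The two guidelines above can be combined into a small helper. This is an illustrative sketch only (``suggested_shards`` is not part of radosgw-admin), assuming the ~100000-entries-per-shard target discussed above:

```python
import math

def is_prime(n):
    """Trial division; fast enough for shard counts in the thousands."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def suggested_shards(expected_objects, objs_per_shard=100000):
    """Shards needed for ~objs_per_shard entries each, rounded up to a prime."""
    n = math.ceil(expected_objects / objs_per_shard)
    while not is_prime(n):
        n += 1
    return n

print(suggested_shards(700_000_000))  # 7001 rather than 7000, since 7001 is prime
```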

Troubleshooting
===============

Clusters prior to Luminous 12.2.11 and Mimic 13.2.5 left behind stale bucket
instance entries, which were not automatically cleaned up. This issue also
affected lifecycle policies, which were no longer applied to resharded
buckets. Both of these issues can be worked around with a couple of
radosgw-admin commands.

Stale instance management
-------------------------

List the stale instances in a cluster that are ready to be cleaned up:

::

  # radosgw-admin reshard stale-instances list

Clean up the stale instances in a cluster. Note: cleanup of these
instances should only be done on a single-site cluster.

::

  # radosgw-admin reshard stale-instances rm


Lifecycle fixes
---------------

For clusters that had resharded instances, it is highly likely that the old
lifecycle processes would have flagged and deleted lifecycle processing, as the
bucket instance changed during a reshard. While this is fixed in newer clusters
(from Mimic 13.2.6 and Luminous 12.2.12), older buckets that had lifecycle
policies and have undergone resharding will have to be fixed manually.

The command to do so is:

::

  # radosgw-admin lc reshard fix --bucket {bucketname}


If the ``--bucket`` argument is dropped then, as a convenience, this
command will try to fix lifecycle policies for all the buckets in the cluster.

Object Expirer fixes
--------------------

Objects subject to Swift object expiration on older clusters may have
been dropped from the log pool and never deleted after the bucket was
resharded. This would happen if their expiration time was before the
cluster was upgraded; if their expiration was after the upgrade, the
objects would be correctly handled. To manage these expire-stale
objects, radosgw-admin provides two subcommands.

Listing:

::

  # radosgw-admin objects expire-stale list --bucket {bucketname}

Displays a list of object names and expiration times in JSON format.

Deleting:

::

  # radosgw-admin objects expire-stale rm --bucket {bucketname}


Initiates deletion of such objects, displaying a list of object names,
expiration times, and deletion status in JSON format.