.. _rgw_dynamic_bucket_index_resharding:

===================================
RGW Dynamic Bucket Index Resharding
===================================

.. versionadded:: Luminous

A large bucket index can lead to performance problems. In order
to address this problem we introduced bucket index sharding.
Until Luminous, changing the number of bucket shards (resharding)
needed to be done offline. Starting with Luminous we support
online bucket resharding.

Each bucket index shard can handle its entries efficiently up until
reaching a certain threshold number of entries. If this threshold is
exceeded the system can encounter performance issues. The dynamic
resharding feature detects this situation and automatically increases
the number of shards used by the bucket index, resulting in the
reduction of the number of entries in each bucket index shard. This
process is transparent to the user.

By default dynamic bucket index resharding can only increase the
number of bucket index shards to 1999, although this upper bound is a
configuration parameter (see Configuration below). Furthermore, when
possible, the process chooses a prime number of bucket index shards to
help spread the number of bucket index entries across the bucket index
shards more evenly.

The detection process runs in the background and periodically
scans all the buckets. A bucket that requires resharding is added to
the resharding queue and will be scheduled to be resharded later. The
reshard thread runs in the background and executes the scheduled
resharding tasks, one at a time.

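For example, with the default ``rgw_max_objs_per_shard`` of 100000, a
bucket with 7 index shards would be queued for resharding once it
holds roughly more than 700000 entries (7 x 100000). To get a
per-bucket view of how full the index shards are relative to this
threshold, the ``radosgw-admin bucket limit check`` subcommand can be
consulted (exact output fields may vary by release)::

  # radosgw-admin bucket limit check
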
Multisite
=========

Dynamic resharding is not supported in a multisite environment.


Configuration
=============

Enable/Disable dynamic bucket index resharding:

- ``rgw_dynamic_resharding``: true/false, default: true

Configuration options that control the resharding process:

- ``rgw_max_objs_per_shard``: maximum number of objects per bucket index shard before resharding is triggered, default: 100000 objects

- ``rgw_max_dynamic_shards``: maximum number of shards that dynamic bucket index resharding can increase to, default: 1999

- ``rgw_reshard_bucket_lock_duration``: duration, in seconds, of the lock on the bucket object during resharding, default: 360 seconds (i.e., 6 minutes)

- ``rgw_reshard_thread_interval``: maximum time, in seconds, between rounds of resharding queue processing, default: 600 seconds (i.e., 10 minutes)

- ``rgw_reshard_num_logs``: number of shards for the resharding queue, default: 16

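These options can be set in ``ceph.conf``. The following is a minimal
sketch; the section name ``client.rgw.gateway-node1`` is a
hypothetical example, and the values shown are simply the defaults
made explicit::

  [client.rgw.gateway-node1]
  rgw_dynamic_resharding = true
  rgw_max_objs_per_shard = 100000
  rgw_max_dynamic_shards = 1999
  rgw_reshard_thread_interval = 600
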
Admin commands
==============

Add a bucket to the resharding queue
------------------------------------

::

  # radosgw-admin reshard add --bucket <bucket_name> --num-shards <new number of shards>

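For example, to queue a hypothetical bucket named ``data-bucket`` to
be resharded to 23 shards::

  # radosgw-admin reshard add --bucket data-bucket --num-shards 23
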
List resharding queue
---------------------

::

  # radosgw-admin reshard list

Process tasks on the resharding queue
-------------------------------------

::

  # radosgw-admin reshard process

Bucket resharding status
------------------------

::

  # radosgw-admin reshard status --bucket <bucket_name>

The output is a JSON array with one entry per shard; each entry
contains three fields (``reshard_status``, ``new_bucket_instance_id``,
and ``num_shards``).

For example, the output at different Dynamic Resharding stages is shown below:

``1. Before resharding occurred:``
::

  [
    {
        "reshard_status": "not-resharding",
        "new_bucket_instance_id": "",
        "num_shards": -1
    }
  ]

``2. During resharding:``
::

  [
    {
        "reshard_status": "in-progress",
        "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
        "num_shards": 2
    },
    {
        "reshard_status": "in-progress",
        "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
        "num_shards": 2
    }
  ]

``3. After resharding completed:``
::

  [
    {
        "reshard_status": "not-resharding",
        "new_bucket_instance_id": "",
        "num_shards": -1
    },
    {
        "reshard_status": "not-resharding",
        "new_bucket_instance_id": "",
        "num_shards": -1
    }
  ]


Cancel pending bucket resharding
--------------------------------

Note: Ongoing bucket resharding operations cannot be cancelled. ::

  # radosgw-admin reshard cancel --bucket <bucket_name>

Manual immediate bucket resharding
----------------------------------

::

  # radosgw-admin bucket reshard --bucket <bucket_name> --num-shards <new number of shards>

When choosing a number of shards, the administrator should keep a
few considerations in mind. Ideally the administrator should aim for
no more than 100000 entries per shard, both now and for some time
into the future.

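For example, suppose a hypothetical bucket named ``data-bucket`` is
expected to grow to about 2,000,000 objects. Keeping to at most
100000 entries per shard requires at least 20 shards; rounding up to
the prime number 23 provides some headroom::

  # radosgw-admin bucket reshard --bucket data-bucket --num-shards 23
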
Additionally, a prime number of bucket index shards tends to spread
bucket index entries more evenly across the shards. For example, 7001
bucket index shards is better than 7000 since the former is prime. A
variety of web sites have lists of prime numbers; search for "list of
prime numbers" with your favorite web search engine to locate some
web sites.

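If no list is handy, the next prime at or above a target shard count
can be found with standard shell tools. The following is a minimal
sketch that assumes the GNU coreutils ``factor`` command is available
(``factor`` prints a number followed by its prime factors, so a prime
has exactly one factor)::

  # n=7000
  # while [ "$(factor "$n" | awk '{print NF-1}')" -ne 1 ]; do n=$((n+1)); done
  # echo "$n"
  7001
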
Troubleshooting
===============

Clusters prior to Luminous 12.2.11 and Mimic 13.2.5 left behind stale bucket
instance entries, which were not automatically cleaned up. The issue also affected
lifecycle policies, which were not applied to resharded buckets anymore. Both of
these issues can be worked around using a couple of radosgw-admin commands.

Stale instance management
-------------------------

List the stale instances in a cluster that are ready to be cleaned up.

::

  # radosgw-admin reshard stale-instances list

Clean up the stale instances in a cluster. Note: cleanup of these
instances should only be done on a single-site cluster.

::

  # radosgw-admin reshard stale-instances rm


Lifecycle fixes
---------------

For clusters that had resharded instances, it is highly likely that the old
lifecycle processes would have flagged and deleted lifecycle processing, as the
bucket instance changed during a reshard. While this is fixed in newer releases
(from Mimic 13.2.6 and Luminous 12.2.12), older buckets that had lifecycle policies and
that have undergone resharding must be fixed manually.

The command to do so is:

::

  # radosgw-admin lc reshard fix --bucket {bucketname}


As a convenience wrapper, if the ``--bucket`` argument is dropped then this
command will try to fix lifecycle policies for all the buckets in the cluster.

Object Expirer fixes
--------------------

Objects subject to Swift object expiration on older clusters may have
been dropped from the log pool and never deleted after the bucket was
resharded. This would happen if their expiration time was before the
cluster was upgraded, but if their expiration was after the upgrade
the objects would be correctly handled. To manage these expire-stale
objects, radosgw-admin provides two subcommands.

Listing:

::

  # radosgw-admin objects expire-stale list --bucket {bucketname}

Displays a list of object names and expiration times in JSON format.

Deleting:

::

  # radosgw-admin objects expire-stale rm --bucket {bucketname}


Initiates deletion of such objects, displaying a list of object names, expiration times, and deletion status in JSON format.