.. _rgw_dynamic_bucket_index_resharding:

===================================
RGW Dynamic Bucket Index Resharding
===================================

.. versionadded:: Luminous

A large bucket index can lead to performance problems. To address this
problem, we introduced bucket index sharding. Until Luminous, changing
the number of bucket index shards (resharding) had to be done offline.
Starting with Luminous, we support online bucket resharding.

Each bucket index shard can handle its entries efficiently up until
reaching a certain threshold number of entries. If this threshold is
exceeded, the system can suffer from performance issues. The dynamic
resharding feature detects this situation and automatically increases
the number of shards used by the bucket index, resulting in a reduction
of the number of entries in each bucket index shard. This process is
transparent to the user. Write I/Os to the target bucket are blocked
during the resharding process, but read I/Os are not.

By default, dynamic bucket index resharding can increase the number of
bucket index shards only up to 1999, although this upper bound is a
configuration parameter (see Configuration below). When possible, the
process chooses a prime number of bucket index shards in order to
spread the bucket index entries across the shards more evenly.

The detection process runs in a background process that periodically
scans all the buckets. A bucket that requires resharding is added to
the resharding queue and will be scheduled to be resharded later. The
reshard thread runs in the background and executes the scheduled
resharding tasks, one at a time.
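
To see how close each bucket is to the resharding threshold, the
``radosgw-admin bucket limit check`` command reports per-bucket object
counts per shard and a fill status (the exact output fields can vary by
release)::

  # radosgw-admin bucket limit check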

Multisite
=========

Prior to the Reef release, RGW did not support dynamic resharding in a
multisite environment. For information on resharding in a multisite
deployment, see :ref:`Resharding <feature_resharding>` in the RGW
multisite documentation.

Configuration
=============

Enable/Disable dynamic bucket index resharding:

- ``rgw_dynamic_resharding``: true/false, default: true

Configuration options that control the resharding process (see the
example following this list):

- ``rgw_max_objs_per_shard``: maximum number of objects per bucket index shard before resharding is triggered, default: 100000 objects

- ``rgw_max_dynamic_shards``: maximum number of shards that dynamic bucket index resharding can increase to, default: 1999

- ``rgw_reshard_bucket_lock_duration``: duration, in seconds, of the lock on the bucket object during resharding, default: 360 seconds (i.e., 6 minutes)

- ``rgw_reshard_thread_interval``: maximum time, in seconds, between rounds of resharding queue processing, default: 600 seconds (i.e., 10 minutes)

- ``rgw_reshard_num_logs``: number of shards for the resharding queue, default: 16
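
As an illustration (a sketch; the instance name ``client.rgw.gateway1``
is hypothetical), these options can be adjusted at runtime with the
``ceph config`` command. The first command below raises the per-shard
object threshold for all RGW daemons; the second disables dynamic
resharding for a single RGW instance::

  # ceph config set client.rgw rgw_max_objs_per_shard 200000
  # ceph config set client.rgw.gateway1 rgw_dynamic_resharding false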

Admin commands
==============

Add a bucket to the resharding queue
------------------------------------

::

  # radosgw-admin reshard add --bucket <bucket_name> --num-shards <new number of shards>
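
For example, to queue a hypothetical bucket named ``mybucket`` to be
resharded to 23 shards (a prime, per the guidance on shard counts later
in this document)::

  # radosgw-admin reshard add --bucket mybucket --num-shards 23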

List resharding queue
---------------------

::

  # radosgw-admin reshard list

Process tasks on the resharding queue
-------------------------------------

::

  # radosgw-admin reshard process

Bucket resharding status
------------------------

::

  # radosgw-admin reshard status --bucket <bucket_name>

The output is a JSON array with one entry per bucket index shard. Each
entry contains three fields: ``reshard_status``,
``new_bucket_instance_id``, and ``num_shards``.

For example, the output at different dynamic resharding stages is shown
below:

``1. Before resharding occurred:``
::

 [
   {
     "reshard_status": "not-resharding",
     "new_bucket_instance_id": "",
     "num_shards": -1
   }
 ]

``2. During resharding:``
::

 [
   {
     "reshard_status": "in-progress",
     "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
     "num_shards": 2
   },
   {
     "reshard_status": "in-progress",
     "new_bucket_instance_id": "1179f470-2ebf-4630-8ec3-c9922da887fd.8652.1",
     "num_shards": 2
   }
 ]

``3. After resharding completed:``
::

 [
   {
     "reshard_status": "not-resharding",
     "new_bucket_instance_id": "",
     "num_shards": -1
   },
   {
     "reshard_status": "not-resharding",
     "new_bucket_instance_id": "",
     "num_shards": -1
   }
 ]

Cancel pending bucket resharding
--------------------------------

Note: Ongoing bucket resharding operations cannot be cancelled. ::

  # radosgw-admin reshard cancel --bucket <bucket_name>

Manual immediate bucket resharding
----------------------------------

::

  # radosgw-admin bucket reshard --bucket <bucket_name> --num-shards <new number of shards>
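
Before resharding manually, it can be useful to check the bucket's
current shard count. In recent releases, ``radosgw-admin bucket stats``
reports it (the field name may vary by release; ``mybucket`` is a
hypothetical bucket name)::

  # radosgw-admin bucket stats --bucket mybucket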

When choosing a number of shards, the administrator should keep a few
things in mind. Ideally the administrator should aim for no more than
100000 entries per shard, both now and through some future point in
time.

Additionally, a prime number of bucket index shards tends to distribute
bucket index entries more evenly across the shards. For example, 7001
bucket index shards is better than 7000, since the former is prime. A
variety of web sites have lists of prime numbers; search for "list of
prime numbers" with your favorite web search engine to locate some of
these web sites.
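
As a worked example (the bucket name and object count are
hypothetical): a bucket expected to grow to 5,000,000 objects needs at
least 5,000,000 / 100,000 = 50 shards to stay under 100000 entries per
shard; rounding up to the next prime gives 53::

  # radosgw-admin bucket reshard --bucket mybucket --num-shards 53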

Troubleshooting
===============

Clusters prior to Luminous 12.2.11 and Mimic 13.2.5 left behind stale
bucket instance entries, which were not automatically cleaned up. This
issue also affected lifecycle policies, which were no longer applied to
resharded buckets. Both of these issues can be worked around with a
couple of radosgw-admin commands.

Stale instance management
-------------------------

List the stale instances in a cluster that are ready to be cleaned up.

::

  # radosgw-admin reshard stale-instances list

Clean up the stale instances in a cluster. Note: cleanup of these
instances should be done only on a single-site cluster.

::

  # radosgw-admin reshard stale-instances rm

Lifecycle fixes
---------------

For clusters that had resharded instances, it is highly likely that the
old lifecycle processes would have flagged and deleted lifecycle
processing for the bucket, because the bucket instance changed during a
reshard. While this is fixed in newer clusters (from Mimic 13.2.6 and
Luminous 12.2.12), older buckets that had lifecycle policies and that
have undergone resharding must be fixed manually.

The command to do so is:

::

  # radosgw-admin lc reshard fix --bucket {bucketname}

As a convenience wrapper, if the ``--bucket`` argument is dropped, this
command will try to fix the lifecycle policies of all the buckets in
the cluster.
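
For example, running the command without ``--bucket`` fixes lifecycle
policies cluster-wide, as described above::

  # radosgw-admin lc reshard fix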

Object Expirer fixes
--------------------

Objects subject to Swift object expiration on older clusters may have
been dropped from the log pool and never deleted after the bucket was
resharded. This would happen if their expiration time predated the
cluster upgrade; objects whose expiration fell after the upgrade were
handled correctly. To manage these expire-stale objects, radosgw-admin
provides two subcommands.

Listing:

::

  # radosgw-admin objects expire-stale list --bucket {bucketname}

Displays a list of object names and expiration times in JSON format.

Deleting:

::

  # radosgw-admin objects expire-stale rm --bucket {bucketname}

Initiates deletion of such objects, displaying a list of object names,
expiration times, and deletion status in JSON format.