=======================
librbd Settings
=======================

See `Block Device`_ for additional details.

Cache Settings
=======================

.. sidebar:: Kernel Caching

   The kernel driver for Ceph block devices can use the Linux page cache to
   improve performance.

The user space implementation of the Ceph block device (i.e., ``librbd``) cannot
take advantage of the Linux page cache, so it includes its own in-memory
caching, called "RBD caching." RBD caching behaves just like well-behaved hard
disk caching. When the OS sends a barrier or a flush request, all dirty data is
written to the OSDs. This means that using write-back caching is just as safe as
using a well-behaved physical hard disk with a VM that properly sends flushes
(i.e., Linux kernel 2.6.32 or later). The cache uses a Least Recently Used (LRU)
algorithm, and in write-back mode it can coalesce contiguous requests for
better throughput.

.. versionadded:: 0.46

Ceph supports write-back caching for RBD. To enable it explicitly, add
``rbd cache = true`` to the ``[client]`` section of your ``ceph.conf`` file;
note that the ``rbd cache`` setting listed below defaults to ``true``. When
caching is disabled, ``librbd`` does not perform any caching: writes and reads
go directly to the storage cluster, and writes return only when the data is on
disk on all replicas. With caching enabled, writes return immediately unless
there are more than ``rbd cache max dirty`` unflushed bytes. In that case, the
write triggers writeback and blocks until enough bytes are flushed.

.. versionadded:: 0.47

Ceph supports write-through caching for RBD. You can set the size of
the cache, and you can set targets and limits to switch from
write-back caching to write-through caching. To enable write-through
mode, set ``rbd cache max dirty`` to 0. This means writes return only
when the data is on disk on all replicas, but reads may come from the
cache. The cache is in memory on the client, and each RBD image has
its own. Since the cache is local to the client, there is no cache
coherency if other clients access the same image. Running a shared-disk
file system such as GFS or OCFS on top of RBD will not work with caching
enabled.
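
The write-through mode described above could be configured as in the following
minimal ``ceph.conf`` sketch. It is an illustration rather than a recommended
configuration; only the option names and the ``0`` threshold are taken from
this page.

.. code-block:: ini

   [client]
   # Keep the client-side cache enabled so that reads can be served from memory.
   rbd cache = true
   # A dirty limit of 0 selects write-through caching: writes return only
   # once the data is on disk on all replicas.
   rbd cache max dirty = 0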

The ``ceph.conf`` file settings for RBD should be set in the ``[client]``
section of your configuration file. The settings include:


``rbd cache``

:Description: Enable caching for RADOS Block Device (RBD).
:Type: Boolean
:Required: No
:Default: ``true``


``rbd cache size``

:Description: The RBD cache size in bytes.
:Type: 64-bit Integer
:Required: No
:Default: ``32 MiB``


``rbd cache max dirty``

:Description: The ``dirty`` limit in bytes at which the cache triggers write-back. If ``0``, uses write-through caching.
:Type: 64-bit Integer
:Required: No
:Constraint: Must be less than ``rbd cache size``.
:Default: ``24 MiB``


``rbd cache target dirty``

:Description: The amount of dirty data, in bytes, at which the cache begins writing data to the storage cluster. Does not block writes to the cache.
:Type: 64-bit Integer
:Required: No
:Constraint: Must be less than ``rbd cache max dirty``.
:Default: ``16 MiB``


``rbd cache max dirty age``

:Description: The number of seconds dirty data is in the cache before writeback starts.
:Type: Float
:Required: No
:Default: ``1.0``

.. versionadded:: 0.60

``rbd cache writethrough until flush``

:Description: Start out in write-through mode, and switch to write-back after the first flush request is received. Enabling this is a conservative but safe setting in case VMs running on RBD are too old to send flushes, like the virtio driver in Linux before 2.6.32.
:Type: Boolean
:Required: No
:Default: ``true``
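
As an illustration of how these settings fit together, the following
``ceph.conf`` sketch doubles the default cache sizes while respecting the
constraints listed above (target dirty < max dirty < cache size). The specific
values are assumptions chosen for the example, not tuning advice, and are
written in bytes.

.. code-block:: ini

   [client]
   rbd cache = true
   # 64 MiB cache
   rbd cache size = 67108864
   # 48 MiB dirty limit; must be less than the cache size
   rbd cache max dirty = 50331648
   # 32 MiB dirty target; must be less than the dirty limit
   rbd cache target dirty = 33554432
   # Begin writeback of dirty data after 2 seconds
   rbd cache max dirty age = 2.0
   # Stay in write-through mode until the guest sends its first flush
   rbd cache writethrough until flush = true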

.. _Block Device: ../../rbd


Read-ahead Settings
=======================

.. versionadded:: 0.86

RBD supports read-ahead/prefetching to optimize small, sequential reads.
This should normally be handled by the guest OS in the case of a VM,
but boot loaders may not issue efficient reads.
Read-ahead is automatically disabled if caching is disabled.


``rbd readahead trigger requests``

:Description: Number of sequential read requests necessary to trigger read-ahead.
:Type: Integer
:Required: No
:Default: ``10``


``rbd readahead max bytes``

:Description: Maximum size of a read-ahead request. If zero, read-ahead is disabled.
:Type: 64-bit Integer
:Required: No
:Default: ``512 KiB``


``rbd readahead disable after bytes``

:Description: After this many bytes have been read from an RBD image, read-ahead is disabled for that image until it is closed. This allows the guest OS to take over read-ahead once it is booted. If zero, read-ahead stays enabled.
:Type: 64-bit Integer
:Required: No
:Default: ``50 MiB``
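
A hedged example of the read-ahead settings above, for a client that should
keep read-ahead active for the whole lifetime of an open image. The values are
illustrative assumptions, written in bytes.

.. code-block:: ini

   [client]
   # Read-ahead is automatically disabled if caching is disabled.
   rbd cache = true
   # Start read-ahead after 10 sequential read requests.
   rbd readahead trigger requests = 10
   # Allow read-ahead requests of up to 1 MiB.
   rbd readahead max bytes = 1048576
   # Zero keeps read-ahead enabled instead of handing over to the guest OS.
   rbd readahead disable after bytes = 0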


RBD Features
============

RBD supports advanced features. They can be specified on the command line when
creating an image, or the default features can be set in the Ceph configuration
file using either ``rbd_default_features = <sum of feature numeric values>`` or
``rbd_default_features = <comma-delimited list of CLI values>``.

``Layering``

:Description: Layering enables you to use cloning.
:Internal value: 1
:CLI value: layering
:Added in: v0.70 (Emperor)
:KRBD support: since v3.10
:Default: yes

``Striping v2``

:Description: Striping spreads data across multiple objects. Striping helps with parallelism for sequential read/write workloads.
:Internal value: 2
:CLI value: striping
:Added in: v0.70 (Emperor)
:KRBD support: since v3.10
:Default: no

``Exclusive locking``

:Description: When enabled, it requires a client to get a lock on an object before making a write. Exclusive lock should only be enabled when a single client is accessing an image at any given time.
:Internal value: 4
:CLI value: exclusive-lock
:Added in: v0.92 (Hammer)
:KRBD support: since v4.9
:Default: yes

``Object map``

:Description: Object map support depends on exclusive lock support. Block devices are thin provisioned, meaning that they only store data that actually exists. Object map support helps track which objects actually exist (have data stored on a drive). Enabling object map support speeds up I/O operations for cloning, for importing and exporting a sparsely populated image, and for deleting.
:Internal value: 8
:CLI value: object-map
:Added in: v0.93 (Hammer)
:KRBD support: no
:Default: yes


``Fast-diff``

:Description: Fast-diff support depends on object map support and exclusive lock support. It adds another property to the object map, which makes it much faster to generate diffs between snapshots of an image and to calculate the actual data usage of a snapshot.
:Internal value: 16
:CLI value: fast-diff
:Added in: v9.0.1 (Infernalis)
:KRBD support: no
:Default: yes


``Deep-flatten``

:Description: Deep-flatten makes ``rbd flatten`` work on all the snapshots of an image, in addition to the image itself. Without it, snapshots of an image will still rely on the parent, so the parent cannot be deleted until the snapshots are deleted. Deep-flatten makes a parent independent of its clones, even if they have snapshots.
:Internal value: 32
:CLI value: deep-flatten
:Added in: v9.0.2 (Infernalis)
:KRBD support: no
:Default: yes


``Journaling``

:Description: Journaling support depends on exclusive lock support. Journaling records all modifications to an image in the order they occur. RBD mirroring utilizes the journal to replicate a crash-consistent image to a remote cluster.
:Internal value: 64
:CLI value: journaling
:Added in: v10.0.1 (Jewel)
:KRBD support: no
:Default: no


``Data pool``

:Description: On erasure-coded pools, the image data block objects need to be stored on a separate pool from the image metadata.
:Internal value: 128
:Added in: v11.1.0 (Kraken)
:KRBD support: since v4.11
:Default: no


``Operations``

:Description: Used to restrict older clients from performing certain maintenance operations against an image (e.g., clone, snap create).
:Internal value: 256
:Added in: v13.0.2 (Mimic)
:KRBD support: since v4.16


``Migrating``

:Description: Used to restrict older clients from opening an image when it is in migration state.
:Internal value: 512
:Added in: v14.0.1 (Nautilus)
:KRBD support: no
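
To illustrate how the internal values combine, the sketch below sets the
default features for new images to layering plus exclusive locking. The choice
of features is an assumption made for the example; the two forms follow the
``rbd_default_features`` syntax described at the start of this section.

.. code-block:: ini

   [client]
   # Numeric form: layering (1) + exclusive-lock (4) = 5
   rbd_default_features = 5
   # Equivalent form using the CLI values (use one form or the other):
   # rbd_default_features = layering,exclusive-lock

Features that depend on others must be enabled together with their
dependencies. For example, adding object map (8) to the set above gives
``1 + 4 + 8 = 13``, because object map depends on exclusive lock.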


RBD QoS Settings
================

RBD supports per-image I/O limits, controlled by the following
settings.

``rbd qos iops limit``

:Description: The desired limit of I/O operations per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos bps limit``

:Description: The desired limit of I/O bytes per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos read iops limit``

:Description: The desired limit of read operations per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos write iops limit``

:Description: The desired limit of write operations per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos read bps limit``

:Description: The desired limit of read bytes per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos write bps limit``

:Description: The desired limit of write bytes per second.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos iops burst``

:Description: The desired burst limit of I/O operations.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos bps burst``

:Description: The desired burst limit of I/O bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos read iops burst``

:Description: The desired burst limit of read operations.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos write iops burst``

:Description: The desired burst limit of write operations.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos read bps burst``

:Description: The desired burst limit of read bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos write bps burst``

:Description: The desired burst limit of write bytes.
:Type: Unsigned Integer
:Required: No
:Default: ``0``


``rbd qos schedule tick min``

:Description: The minimum schedule tick (in milliseconds) for QoS.
:Type: Unsigned Integer
:Required: No
:Default: ``50``
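
A minimal sketch of a QoS configuration using the settings above. The numbers
are illustrative assumptions, and the sketch also assumes that options left at
their default of ``0`` impose no limit.

.. code-block:: ini

   [client]
   # Limit each image to 1000 I/O operations per second.
   rbd qos iops limit = 1000
   # Set the burst limit for I/O operations to 2000.
   rbd qos iops burst = 2000
   # Limit write throughput to 100 MiB per second (104857600 bytes).
   rbd qos write bps limit = 104857600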