=============================
 Block Devices and OpenStack
=============================

.. index:: Ceph Block Device; OpenStack

You can attach Ceph Block Device images to OpenStack instances through ``libvirt``,
which configures the QEMU interface to ``librbd``. Ceph stripes block volumes
across multiple OSDs within the cluster, which means that large volumes can
realize better performance than local drives on a standalone server!

To use Ceph Block Devices with OpenStack, you must install QEMU, ``libvirt``,
and OpenStack first. We recommend using a separate physical node for your
OpenStack installation. OpenStack recommends a minimum of 8GB of RAM and a
quad-core processor. The following diagram depicts the OpenStack/Ceph
technology stack.

.. ditaa::

 +---------------------------------------------------+
 |                     OpenStack                     |
 +---------------------------------------------------+
 |                      libvirt                      |
 +------------------------+--------------------------+
                          |
                          | configures
                          v
 +---------------------------------------------------+
 |                        QEMU                       |
 +---------------------------------------------------+
 |                       librbd                      |
 +---------------------------------------------------+
 |                      librados                     |
 +------------------------+-+------------------------+
 |          OSDs          | |        Monitors        |
 +------------------------+ +------------------------+

.. important:: To use Ceph Block Devices with OpenStack, you must have
   access to a running Ceph Storage Cluster.

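For example, you can verify from a client node that the cluster is reachable
and healthy with::

    ceph -s
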
Three parts of OpenStack integrate with Ceph's block devices:

- **Images**: OpenStack Glance manages images for VMs. Images are immutable.
  OpenStack treats images as binary blobs and downloads them accordingly.

- **Volumes**: Volumes are block devices. OpenStack uses volumes to boot VMs
  or to attach volumes to running VMs. OpenStack manages volumes using
  Cinder services.

- **Guest Disks**: Guest disks are guest operating system disks. By default,
  when you boot a virtual machine, its disk appears as a file on the file
  system of the hypervisor (usually under ``/var/lib/nova/instances/<uuid>/``).
  Prior to OpenStack Havana, the only way to boot a VM in Ceph was to use the
  boot-from-volume functionality of Cinder. Now it is possible to boot every
  virtual machine inside Ceph directly without using Cinder, which is
  advantageous because it allows you to perform maintenance operations easily
  with the live-migration process. Additionally, if your hypervisor dies, you
  can trigger ``nova evacuate`` and reinstate the virtual machine elsewhere
  almost seamlessly. In doing so, :ref:`exclusive locks <rbd-exclusive-locks>`
  prevent multiple compute nodes from concurrently accessing the guest disk.

You can use OpenStack Glance to store images as Ceph Block Devices, and you
can use Cinder to boot a VM using a copy-on-write clone of an image.

The instructions below detail the setup for Glance, Cinder, and Nova, although
they do not have to be used together. You may store images in Ceph block
devices while running VMs using a local disk, or vice versa.

.. important:: Using QCOW2 for hosting a virtual machine disk is NOT
   recommended. If you want to boot virtual machines in Ceph (ephemeral
   backend or boot from volume), please use the ``raw`` image format within
   Glance.
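
Before uploading, convert an image to ``raw`` if necessary. A minimal sketch,
assuming the ``qemu-img`` and ``openstack`` command-line tools and a
hypothetical image file name::

    qemu-img convert -f qcow2 -O raw cloudimg.qcow2 cloudimg.raw
    openstack image create --disk-format raw --container-format bare \
        --file cloudimg.raw {image-name}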

.. index:: pools; OpenStack

Create a Pool
=============

By default, Ceph block devices live within the ``rbd`` pool. You may use any
suitable pool by specifying it explicitly. We recommend creating a pool for
Cinder and a pool for Glance. Ensure your Ceph cluster is running, then create
the pools::

    ceph osd pool create volumes
    ceph osd pool create images
    ceph osd pool create backups
    ceph osd pool create vms

See `Create a Pool`_ for details on specifying the number of placement groups
for your pools and `Placement Groups`_ for guidance on how many placement
groups to use.
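
For example, to create the ``volumes`` pool with an explicit number of
placement groups (128 here is only an illustration; pick a value that suits
your cluster)::

    ceph osd pool create volumes 128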

Newly created pools must be initialized prior to use. Use the ``rbd`` tool
to initialize the pools::

    rbd pool init volumes
    rbd pool init images
    rbd pool init backups
    rbd pool init vms

.. _Create a Pool: ../../rados/operations/pools#createpool
.. _Placement Groups: ../../rados/operations/placement-groups


Configure OpenStack Ceph Clients
================================

The nodes running ``glance-api``, ``cinder-volume``, ``nova-compute`` and
``cinder-backup`` act as Ceph clients. Each requires the ``ceph.conf`` file::

    ssh {your-openstack-server} sudo tee /etc/ceph/ceph.conf </etc/ceph/ceph.conf


Install Ceph client packages
----------------------------

On the ``glance-api`` node, you will need the Python bindings for ``librbd``::

    sudo apt-get install python-rbd
    sudo yum install python-rbd

On the ``nova-compute``, ``cinder-backup``, and ``cinder-volume`` nodes, install
both the Python bindings and the client command-line tools::

    sudo apt-get install ceph-common
    sudo yum install ceph-common

Setup Ceph Client Authentication
--------------------------------

If you have `cephx authentication`_ enabled, create new users for Nova/Cinder
and Glance. Execute the following::

    ceph auth get-or-create client.glance mon 'profile rbd' osd 'profile rbd pool=images' mgr 'profile rbd pool=images'
    ceph auth get-or-create client.cinder mon 'profile rbd' osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images' mgr 'profile rbd pool=volumes, profile rbd pool=vms'
    ceph auth get-or-create client.cinder-backup mon 'profile rbd' osd 'profile rbd pool=backups' mgr 'profile rbd pool=backups'

Add the keyrings for ``client.cinder``, ``client.glance``, and
``client.cinder-backup`` to the appropriate nodes and change their ownership::

    ceph auth get-or-create client.glance | ssh {your-glance-api-server} sudo tee /etc/ceph/ceph.client.glance.keyring
    ssh {your-glance-api-server} sudo chown glance:glance /etc/ceph/ceph.client.glance.keyring
    ceph auth get-or-create client.cinder | ssh {your-cinder-volume-server} sudo tee /etc/ceph/ceph.client.cinder.keyring
    ssh {your-cinder-volume-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder.keyring
    ceph auth get-or-create client.cinder-backup | ssh {your-cinder-backup-server} sudo tee /etc/ceph/ceph.client.cinder-backup.keyring
    ssh {your-cinder-backup-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder-backup.keyring

Nodes running ``nova-compute`` need the keyring file for the ``nova-compute``
process::

    ceph auth get-or-create client.cinder | ssh {your-nova-compute-server} sudo tee /etc/ceph/ceph.client.cinder.keyring

They also need to store the secret key of the ``client.cinder`` user in
``libvirt``. The libvirt process needs it to access the cluster while attaching
a block device from Cinder.

Create a temporary copy of the secret key on the nodes running
``nova-compute``::

    ceph auth get-key client.cinder | ssh {your-compute-node} tee client.cinder.key

Then, on the compute nodes, add the secret key to ``libvirt`` and remove the
temporary copy of the key::

    uuidgen
    457eb676-33da-42ec-9a8c-9293d545c337

    cat > secret.xml <<EOF
    <secret ephemeral='no' private='no'>
      <uuid>457eb676-33da-42ec-9a8c-9293d545c337</uuid>
      <usage type='ceph'>
        <name>client.cinder secret</name>
      </usage>
    </secret>
    EOF
    sudo virsh secret-define --file secret.xml
    Secret 457eb676-33da-42ec-9a8c-9293d545c337 created
    sudo virsh secret-set-value --secret 457eb676-33da-42ec-9a8c-9293d545c337 --base64 $(cat client.cinder.key) && rm client.cinder.key secret.xml

Save the UUID of the secret for configuring ``nova-compute`` later.

.. important:: You don't necessarily need the UUID on all the compute nodes.
   However, from a platform consistency perspective, it's better to keep the
   same UUID.

.. _cephx authentication: ../../rados/configuration/auth-config-ref/#enabling-disabling-cephx


Configure OpenStack to use Ceph
===============================

Configuring Glance
------------------

Glance can use multiple back ends to store images. To use Ceph block devices
by default, configure Glance as follows.


Kilo and after
~~~~~~~~~~~~~~

Edit ``/etc/glance/glance-api.conf`` and add under the ``[glance_store]`` section::

    [glance_store]
    stores = rbd
    default_store = rbd
    rbd_store_pool = images
    rbd_store_user = glance
    rbd_store_ceph_conf = /etc/ceph/ceph.conf
    rbd_store_chunk_size = 8

For more information about the configuration options available in Glance,
please refer to the OpenStack Configuration Reference: http://docs.openstack.org/.

Enable copy-on-write cloning of images
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Note that this exposes the back end location via Glance's API, so the endpoint
with this option enabled should not be publicly accessible.

Any OpenStack version except Mitaka
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you want to enable copy-on-write cloning of images, also add under the
``[DEFAULT]`` section::

    show_image_direct_url = True

Disable cache management (any OpenStack version)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Disable the Glance cache management to avoid images getting cached under
``/var/lib/glance/image-cache/``, assuming your configuration file has
``flavor = keystone+cachemanagement``::

    [paste_deploy]
    flavor = keystone

Image properties
~~~~~~~~~~~~~~~~

We recommend using the following properties for your images (an example of
setting them follows the list):

- ``hw_scsi_model=virtio-scsi``: add the virtio-scsi controller and get better
  performance and support for discard operations
- ``hw_disk_bus=scsi``: connect every Cinder block device to that controller
- ``hw_qemu_guest_agent=yes``: enable the QEMU guest agent
- ``os_require_quiesce=yes``: send fs-freeze/thaw calls through the QEMU guest
  agent
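
For example, assuming the ``openstack`` command-line client and an existing
Glance image (the image name below is a placeholder), the properties can be
set with::

    openstack image set \
        --property hw_scsi_model=virtio-scsi \
        --property hw_disk_bus=scsi \
        --property hw_qemu_guest_agent=yes \
        --property os_require_quiesce=yes \
        {image-name-or-id}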


Configuring Cinder
------------------

OpenStack requires a driver to interact with Ceph block devices. You must also
specify the pool name for the block device. On your OpenStack node, edit
``/etc/cinder/cinder.conf`` by adding::

    [DEFAULT]
    ...
    enabled_backends = ceph
    glance_api_version = 2
    ...
    [ceph]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = ceph
    rbd_pool = volumes
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_flatten_volume_from_snapshot = false
    rbd_max_clone_depth = 5
    rbd_store_chunk_size = 4
    rados_connect_timeout = -1

If you are using `cephx authentication`_, also configure the user and the UUID
of the secret you added to ``libvirt`` as documented earlier::

    [ceph]
    ...
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

Note that if you are configuring multiple Cinder back ends,
``glance_api_version = 2`` must be in the ``[DEFAULT]`` section.
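
If you define multiple back ends, you will typically also create a volume type
that maps to the Ceph back end. A minimal sketch, assuming the ``openstack``
command-line client and ``ceph`` as an arbitrary type name::

    openstack volume type create ceph
    openstack volume type set --property volume_backend_name=ceph ceph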


Configuring Cinder Backup
-------------------------

OpenStack Cinder Backup requires a specific daemon, so don't forget to install
it. On your Cinder Backup node, edit ``/etc/cinder/cinder.conf`` and add::

    backup_driver = cinder.backup.drivers.ceph
    backup_ceph_conf = /etc/ceph/ceph.conf
    backup_ceph_user = cinder-backup
    backup_ceph_chunk_size = 134217728
    backup_ceph_pool = backups
    backup_ceph_stripe_unit = 0
    backup_ceph_stripe_count = 0
    restore_discard_excess_bytes = true


Configuring Nova to attach Ceph RBD block device
------------------------------------------------

In order to attach Cinder devices (either as a normal block device or by
booting from a volume), you must tell Nova (and libvirt) which user and UUID to
refer to when attaching the device. libvirt will refer to this user when
connecting to and authenticating with the Ceph cluster. Add the following to
the ``[libvirt]`` section of ``nova.conf``::

    [libvirt]
    ...
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

These two flags are also used by the Nova ephemeral back end.


Configuring Nova
----------------

In order to boot virtual machines directly from Ceph volumes, you must
configure the ephemeral backend for Nova.
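
A minimal sketch of the relevant ephemeral-backend options in the ``[libvirt]``
section of ``nova.conf``, assuming the ``vms`` pool created earlier (verify the
option names against your OpenStack release)::

    [libvirt]
    images_type = rbd
    images_rbd_pool = vms
    images_rbd_ceph_conf = /etc/ceph/ceph.conf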

It is recommended to enable the RBD cache in your Ceph configuration file; this
has been enabled by default since the Giant release. Moreover, enabling the
client admin socket allows the collection of metrics and can be invaluable
for troubleshooting.

This socket can be accessed on the hypervisor (Nova compute) node::

    ceph daemon /var/run/ceph/ceph-client.cinder.19195.32310016.asok help

To enable RBD cache and admin sockets, ensure that each hypervisor's
``ceph.conf`` contains::

    [client]
        rbd cache = true
        rbd cache writethrough until flush = true
        admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
        log file = /var/log/qemu/qemu-guest-$pid.log
        rbd concurrent management ops = 20

Configure permissions for these directories::

    mkdir -p /var/run/ceph/guests/ /var/log/qemu/
    chown qemu:libvirtd /var/run/ceph/guests /var/log/qemu/

Note that user ``qemu`` and group ``libvirtd`` can vary depending on your
system. The provided example works for Red Hat-based systems.

.. tip:: If your virtual machine is already running, you can simply restart it
   to enable the admin socket.


Restart OpenStack
=================

To activate the Ceph block device driver and load the block device pool name
into the configuration, you must restart the related OpenStack services.
For Debian-based systems, execute these commands on the appropriate nodes::

    sudo glance-control api restart
    sudo service nova-compute restart
    sudo service cinder-volume restart
    sudo service cinder-backup restart

For Red Hat-based systems, execute::

    sudo service openstack-glance-api restart
    sudo service openstack-nova-compute restart
    sudo service openstack-cinder-volume restart
    sudo service openstack-cinder-backup restart

Once OpenStack is up and running, you should be able to create a volume
and boot from it.


Booting from a Block Device
===========================

You can create a volume from an image using the Cinder command-line tool::

    cinder create --image-id {id of image} --display-name {name of volume} {size of volume}

You can use `qemu-img`_ to convert from one format to another. For example::

    qemu-img convert -f {source-format} -O {output-format} {source-filename} {output-filename}
    qemu-img convert -f qcow2 -O raw precise-cloudimg.img precise-cloudimg.raw

When Glance and Cinder are both using Ceph block devices, the image is a
copy-on-write clone, so new volumes are created quickly. In the OpenStack
dashboard, you can boot from that volume by performing the following steps:

#. Launch a new instance.
#. Choose the image associated with the copy-on-write clone.
#. Select 'boot from volume'.
#. Select the volume you created.
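
Alternatively, you can boot from the volume on the command line. A minimal
sketch, assuming the ``openstack`` client; the flavor, network, and names are
placeholders::

    openstack server create --volume {volume-name-or-id} \
        --flavor {flavor} --network {network} {instance-name}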

.. _qemu-img: ../qemu-rbd/#running-qemu-with-rbd