1 =============================
2 Block Devices and OpenStack
3 =============================
5 .. index:: Ceph Block Device; OpenStack
7 You can attach Ceph Block Device images to OpenStack instances through ``libvirt``,
8 which configures the QEMU interface to ``librbd``. Ceph stripes block volumes
9 across multiple OSDs within the cluster, which means that large volumes can
10 realize better performance than local drives on a standalone server!
12 To use Ceph Block Devices with OpenStack, you must install QEMU, ``libvirt``,
13 and OpenStack first. We recommend using a separate physical node for your
14 OpenStack installation. OpenStack recommends a minimum of 8GB of RAM and a
quad-core processor. The following diagram depicts the OpenStack/Ceph
technology stack::
    +---------------------------------------------------+
    |                    OpenStack                      |
    +---------------------------------------------------+
    |                     libvirt                       |
    +------------------------+--------------------------+
                             |
                             | configures
                             v
    +---------------------------------------------------+
    |                       QEMU                        |
    +---------------------------------------------------+
    |                      librbd                       |
    +---------------------------------------------------+
    |                     librados                      |
    +------------------------+-+------------------------+
    |        OSDs            | |        Monitors        |
    +------------------------+ +------------------------+
39 .. important:: To use Ceph Block Devices with OpenStack, you must have
40 access to a running Ceph Storage Cluster.
42 Three parts of OpenStack integrate with Ceph's block devices:
44 - **Images**: OpenStack Glance manages images for VMs. Images are immutable.
45 OpenStack treats images as binary blobs and downloads them accordingly.
47 - **Volumes**: Volumes are block devices. OpenStack uses volumes to boot VMs,
  or to attach volumes to running VMs. OpenStack manages volumes using
  Cinder services.
51 - **Guest Disks**: Guest disks are guest operating system disks. By default,
52 when you boot a virtual machine, its disk appears as a file on the file system
53 of the hypervisor (usually under ``/var/lib/nova/instances/<uuid>/``). Prior
54 to OpenStack Havana, the only way to boot a VM in Ceph was to use the
55 boot-from-volume functionality of Cinder. However, now it is possible to boot
56 every virtual machine inside Ceph directly without using Cinder, which is
57 advantageous because it allows you to perform maintenance operations easily
58 with the live-migration process. Additionally, if your hypervisor dies it is
59 also convenient to trigger ``nova evacuate`` and reinstate the virtual machine
60 elsewhere almost seamlessly. In doing so,
61 :ref:`exclusive locks <rbd-exclusive-locks>` prevent multiple
62 compute nodes from concurrently accessing the guest disk.
65 You can use OpenStack Glance to store images as Ceph Block Devices, and you
66 can use Cinder to boot a VM using a copy-on-write clone of an image.
68 The instructions below detail the setup for Glance, Cinder and Nova, although
69 they do not have to be used together. You may store images in Ceph block devices
70 while running VMs using a local disk, or vice versa.
72 .. important:: Using QCOW2 for hosting a virtual machine disk is NOT recommended.
73 If you want to boot virtual machines in Ceph (ephemeral backend or boot
74 from volume), please use the ``raw`` image format within Glance.
76 .. index:: pools; OpenStack

Create a Pool
=============

By default, Ceph block devices live within the ``rbd`` pool. You may use any
suitable pool by specifying it explicitly. We recommend creating a pool for
Cinder and a pool for Glance. Ensure your Ceph cluster is running, then create
the pools::
85 ceph osd pool create volumes
86 ceph osd pool create images
87 ceph osd pool create backups
88 ceph osd pool create vms
See `Create a Pool`_ for details on specifying the number of placement groups
for your pools, and `Placement Groups`_ for guidance on how many placement
groups to use.
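For example, on a small test cluster you might create the pools with explicit
placement group counts; the values below are purely illustrative and should be
sized for your own cluster::

    ceph osd pool create volumes 128
    ceph osd pool create images 32
    ceph osd pool create backups 32
    ceph osd pool create vms 128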
Newly created pools must be initialized prior to use. Use the ``rbd`` tool
to initialize the pools::

    rbd pool init volumes
    rbd pool init images
    rbd pool init backups
    rbd pool init vms
102 .. _Create a Pool: ../../rados/operations/pools#createpool
103 .. _Placement Groups: ../../rados/operations/placement-groups
106 Configure OpenStack Ceph Clients
107 ================================
109 The nodes running ``glance-api``, ``cinder-volume``, ``nova-compute`` and
110 ``cinder-backup`` act as Ceph clients. Each requires the ``ceph.conf`` file::
112 ssh {your-openstack-server} sudo tee /etc/ceph/ceph.conf </etc/ceph/ceph.conf
115 Install Ceph client packages
116 ----------------------------
118 On the ``glance-api`` node, you will need the Python bindings for ``librbd``::
120 sudo apt-get install python-rbd
121 sudo yum install python-rbd
On the ``nova-compute``, ``cinder-backup`` and ``cinder-volume`` nodes, install
both the Python bindings and the client command-line tools::
126 sudo apt-get install ceph-common
127 sudo yum install ceph-common
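As a quick sanity check (assuming the ``python`` interpreter on the node
matches the bindings you installed), you can confirm that the ``rbd`` module is
importable::

    python -c 'import rbd; print("librbd Python bindings OK")'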
130 Setup Ceph Client Authentication
131 --------------------------------
133 If you have `cephx authentication`_ enabled, create a new user for Nova/Cinder
134 and Glance. Execute the following::
136 ceph auth get-or-create client.glance mon 'profile rbd' osd 'profile rbd pool=images' mgr 'profile rbd pool=images'
137 ceph auth get-or-create client.cinder mon 'profile rbd' osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images' mgr 'profile rbd pool=volumes, profile rbd pool=vms'
138 ceph auth get-or-create client.cinder-backup mon 'profile rbd' osd 'profile rbd pool=backups' mgr 'profile rbd pool=backups'
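You can verify the users and their capabilities afterwards, for example::

    ceph auth get client.glance
    ceph auth get client.cinder
    ceph auth get client.cinder-backup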
140 Add the keyrings for ``client.cinder``, ``client.glance``, and
141 ``client.cinder-backup`` to the appropriate nodes and change their ownership::
143 ceph auth get-or-create client.glance | ssh {your-glance-api-server} sudo tee /etc/ceph/ceph.client.glance.keyring
144 ssh {your-glance-api-server} sudo chown glance:glance /etc/ceph/ceph.client.glance.keyring
145 ceph auth get-or-create client.cinder | ssh {your-volume-server} sudo tee /etc/ceph/ceph.client.cinder.keyring
146 ssh {your-cinder-volume-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder.keyring
147 ceph auth get-or-create client.cinder-backup | ssh {your-cinder-backup-server} sudo tee /etc/ceph/ceph.client.cinder-backup.keyring
148 ssh {your-cinder-backup-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder-backup.keyring
Nodes running ``nova-compute`` need the keyring file for the ``nova-compute``
process::
153 ceph auth get-or-create client.cinder | ssh {your-nova-compute-server} sudo tee /etc/ceph/ceph.client.cinder.keyring
155 They also need to store the secret key of the ``client.cinder`` user in
156 ``libvirt``. The libvirt process needs it to access the cluster while attaching
157 a block device from Cinder.
Create a temporary copy of the secret key on the nodes running
``nova-compute``::
162 ceph auth get-key client.cinder | ssh {your-compute-node} tee client.cinder.key
164 Then, on the compute nodes, add the secret key to ``libvirt`` and remove the
165 temporary copy of the key::
    uuidgen
    457eb676-33da-42ec-9a8c-9293d545c337

    cat > secret.xml <<EOF
    <secret ephemeral='no' private='no'>
      <uuid>457eb676-33da-42ec-9a8c-9293d545c337</uuid>
      <usage type='ceph'>
        <name>client.cinder secret</name>
      </usage>
    </secret>
    EOF
    sudo virsh secret-define --file secret.xml
    Secret 457eb676-33da-42ec-9a8c-9293d545c337 created
    sudo virsh secret-set-value --secret 457eb676-33da-42ec-9a8c-9293d545c337 --base64 $(cat client.cinder.key) && rm client.cinder.key secret.xml
182 Save the uuid of the secret for configuring ``nova-compute`` later.
184 .. important:: You don't necessarily need the UUID on all the compute nodes.
   However, from a platform consistency perspective, it's better to keep the
   same UUID.
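To confirm that the secret has been defined and its value stored, you can query
``libvirt`` directly, for example::

    sudo virsh secret-list
    sudo virsh secret-get-value 457eb676-33da-42ec-9a8c-9293d545c337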
188 .. _cephx authentication: ../../rados/configuration/auth-config-ref/#enabling-disabling-cephx
191 Configure OpenStack to use Ceph
192 ===============================

Configuring Glance
------------------

Glance can use multiple back ends to store images. To use Ceph block devices by
default, configure Glance as follows.
Edit ``/etc/glance/glance-api.conf`` and add under the ``[glance_store]`` section::

    [glance_store]
    stores = rbd
    default_store = rbd
    rbd_store_pool = images
    rbd_store_user = glance
    rbd_store_ceph_conf = /etc/ceph/ceph.conf
    rbd_store_chunk_size = 8
For more information about the configuration options available in Glance, please refer to the OpenStack Configuration Reference: http://docs.openstack.org/.
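Once ``glance-api`` has been restarted with this configuration (restarting
services is covered below), a simple sanity check is to upload a raw image and
confirm that it lands in the ``images`` pool. This sketch assumes the unified
``openstack`` client is installed and that ``my-image.raw`` is a raw-format
image file; run the ``rbd`` command on a node that can read the
``client.glance`` keyring::

    openstack image create --disk-format raw --container-format bare \
      --file ./my-image.raw my-image
    sudo rbd ls images --id glance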
216 Enable copy-on-write cloning of images
217 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
219 Note that this exposes the back end location via Glance's API, so the endpoint
220 with this option enabled should not be publicly accessible.
222 Any OpenStack version except Mitaka
223 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
225 If you want to enable copy-on-write cloning of images, also add under the ``[DEFAULT]`` section::
227 show_image_direct_url = True
229 Disable cache management (any OpenStack version)
230 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
232 Disable the Glance cache management to avoid images getting cached under ``/var/lib/glance/image-cache/``,
assuming your configuration file has ``flavor = keystone+cachemanagement``::

    [paste_deploy]
    flavor = keystone

Image properties
~~~~~~~~~~~~~~~~

We recommend using the following properties for your images (an example
``openstack`` command follows the list):
- ``hw_scsi_model=virtio-scsi``: add the virtio-scsi controller for better performance and support for discard operations
- ``hw_disk_bus=scsi``: connect every Cinder block device to that controller
- ``hw_qemu_guest_agent=yes``: enable the QEMU guest agent
- ``os_require_quiesce=yes``: send fs-freeze/thaw calls through the QEMU guest agent
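For example, assuming the unified ``openstack`` client is installed and your
image is named ``my-image``, these properties can be set with::

    openstack image set \
      --property hw_scsi_model=virtio-scsi \
      --property hw_disk_bus=scsi \
      --property hw_qemu_guest_agent=yes \
      --property os_require_quiesce=yes \
      my-image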

Configuring Cinder
------------------

OpenStack requires a driver to interact with Ceph block devices. You must also
specify the pool name for the block device. On your OpenStack node, edit
``/etc/cinder/cinder.conf`` by adding::
    [DEFAULT]
    ...
    enabled_backends = ceph
    glance_api_version = 2
    ...

    [ceph]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = ceph
    rbd_pool = volumes
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_flatten_volume_from_snapshot = false
    rbd_max_clone_depth = 5
    rbd_store_chunk_size = 4
    rados_connect_timeout = -1
271 If you are using `cephx authentication`_, also configure the user and uuid of
272 the secret you added to ``libvirt`` as documented earlier::
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
Note that if you are configuring multiple Cinder back ends,
``glance_api_version = 2`` must be in the ``[DEFAULT]`` section.
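Once ``cinder-volume`` has been restarted with this configuration, a quick
sanity check is to create a small test volume and confirm that a corresponding
RBD image appears in the ``volumes`` pool; run the ``rbd`` command on a node
that can read the ``client.cinder`` keyring::

    cinder create --display-name test-volume 1
    sudo rbd ls volumes --id cinder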
283 Configuring Cinder Backup
284 -------------------------
OpenStack Cinder Backup requires a specific daemon (``cinder-backup``), so don't forget to install it.
287 On your Cinder Backup node, edit ``/etc/cinder/cinder.conf`` and add::
289 backup_driver = cinder.backup.drivers.ceph
290 backup_ceph_conf = /etc/ceph/ceph.conf
291 backup_ceph_user = cinder-backup
292 backup_ceph_chunk_size = 134217728
293 backup_ceph_pool = backups
294 backup_ceph_stripe_unit = 0
295 backup_ceph_stripe_count = 0
296 restore_discard_excess_bytes = true
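After restarting ``cinder-backup``, you can exercise the driver by backing up
an existing volume and checking the ``backups`` pool; ``{volume-id}`` below is
a placeholder for a real volume ID::

    cinder backup-create --display-name first-backup {volume-id}
    sudo rbd ls backups --id cinder-backup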
299 Configuring Nova to attach Ceph RBD block device
300 ------------------------------------------------
In order to attach Cinder devices (either a normal block device or via boot
from volume), you must tell Nova (and libvirt) which user and UUID to refer to
when attaching the device. libvirt will refer to this user when connecting and
authenticating with the Ceph cluster. Add the following to the ``[libvirt]``
section of ``/etc/nova/nova.conf``::
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
312 These two flags are also used by the Nova ephemeral back end.

Configuring Nova
----------------

In order to boot virtual machines directly into Ceph (that is, with their
ephemeral disks stored as RBD images), you must configure the ephemeral backend
for Nova.
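A minimal sketch of the corresponding ``[libvirt]`` settings in
``/etc/nova/nova.conf`` follows; exact option names and recommended values vary
between OpenStack releases, so treat this as a starting point rather than a
definitive configuration::

    [libvirt]
    images_type = rbd
    images_rbd_pool = vms
    images_rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
    disk_cachemodes = "network=writeback"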
321 It is recommended to enable the RBD cache in your Ceph configuration file; this
322 has been enabled by default since the Giant release. Moreover, enabling the
client admin socket allows the collection of metrics and can be invaluable
for troubleshooting.
326 This socket can be accessed on the hypervisor (Nova compute) node::
328 ceph daemon /var/run/ceph/ceph-client.cinder.19195.32310016.asok help
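Other useful subcommands include ``config show`` and ``perf dump``, for
example::

    ceph daemon /var/run/ceph/ceph-client.cinder.19195.32310016.asok config show
    ceph daemon /var/run/ceph/ceph-client.cinder.19195.32310016.asok perf dump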
To enable RBD cache and admin sockets, ensure that each hypervisor's
``ceph.conf`` contains::

    [client]
        rbd cache = true
        rbd cache writethrough until flush = true
        admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
        log file = /var/log/qemu/qemu-guest-$pid.log
        rbd concurrent management ops = 20
340 Configure permissions for these directories::
342 mkdir -p /var/run/ceph/guests/ /var/log/qemu/
343 chown qemu:libvirtd /var/run/ceph/guests /var/log/qemu/
Note that the user ``qemu`` and group ``libvirtd`` can vary depending on your
system. The provided example works for Red Hat-based systems.
.. tip:: If your virtual machine is already running, you can simply restart it to enable the admin socket.

Restart OpenStack
=================

To activate the Ceph block device driver and load the block device pool name
into the configuration, you must restart the related OpenStack services.
For Debian-based systems, execute these commands on the appropriate nodes::
358 sudo glance-control api restart
359 sudo service nova-compute restart
360 sudo service cinder-volume restart
361 sudo service cinder-backup restart
For Red Hat-based systems, execute::
365 sudo service openstack-glance-api restart
366 sudo service openstack-nova-compute restart
367 sudo service openstack-cinder-volume restart
368 sudo service openstack-cinder-backup restart
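On distributions that use ``systemd``, the equivalent would be something like
the following; unit names vary between distributions and releases::

    sudo systemctl restart openstack-glance-api openstack-nova-compute \
      openstack-cinder-volume openstack-cinder-backup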
Once OpenStack is up and running, you should be able to create a volume
and boot from it.
374 Booting from a Block Device
375 ===========================
377 You can create a volume from an image using the Cinder command line tool::
379 cinder create --image-id {id of image} --display-name {name of volume} {size of volume}
381 You can use `qemu-img`_ to convert from one format to another. For example::
383 qemu-img convert -f {source-format} -O {output-format} {source-filename} {output-filename}
384 qemu-img convert -f qcow2 -O raw precise-cloudimg.img precise-cloudimg.raw
386 When Glance and Cinder are both using Ceph block devices, the image is a
387 copy-on-write clone, so new volumes are created quickly. In the OpenStack
388 dashboard, you can boot from that volume by performing the following steps:
390 #. Launch a new instance.
#. Choose the image associated with the copy-on-write clone.
392 #. Select 'boot from volume'.
393 #. Select the volume you created.
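If copy-on-write cloning is enabled as described above, you can confirm that
the volume is a clone of its source image by inspecting it with ``rbd``; the
volume name below is a placeholder, and the output should include a
``parent:`` line pointing at the image in the ``images`` pool::

    sudo rbd info volumes/volume-{volume-uuid} --id cinder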
395 .. _qemu-img: ../qemu-rbd/#running-qemu-with-rbd