=============================
 Block Devices and OpenStack
=============================

.. index:: Ceph Block Device; OpenStack

You may use Ceph Block Device images with OpenStack through ``libvirt``, which
configures the QEMU interface to ``librbd``. Ceph stripes block device images
as objects across the cluster, which means that large Ceph Block Device images
perform better than they would on a standalone server.

To use Ceph Block Devices with OpenStack, you must install QEMU, ``libvirt``,
and OpenStack first. We recommend using a separate physical node for your
OpenStack installation. OpenStack recommends a minimum of 8 GB of RAM and a
quad-core processor. The following diagram depicts the OpenStack/Ceph
technology stack.


.. ditaa::  +---------------------------------------------------+
            |                     OpenStack                     |
            +---------------------------------------------------+
            |                      libvirt                      |
            +------------------------+--------------------------+
                                     |
                                     | configures
                                     v
            +---------------------------------------------------+
            |                        QEMU                       |
            +---------------------------------------------------+
            |                       librbd                      |
            +---------------------------------------------------+
            |                      librados                     |
            +------------------------+-+------------------------+
            |          OSDs          | |        Monitors        |
            +------------------------+ +------------------------+

.. important:: To use Ceph Block Devices with OpenStack, you must have
   access to a running Ceph Storage Cluster.

Three parts of OpenStack integrate with Ceph's block devices:

- **Images**: OpenStack Glance manages images for VMs. Images are immutable.
  OpenStack treats images as binary blobs and downloads them accordingly.

- **Volumes**: Volumes are block devices. OpenStack uses volumes to boot VMs,
  or to attach volumes to running VMs. OpenStack manages volumes using
  Cinder services.

- **Guest Disks**: Guest disks are guest operating system disks. By default,
  when you boot a virtual machine, its disk appears as a file on the file
  system of the hypervisor (usually under ``/var/lib/nova/instances/<uuid>/``).
  Prior to OpenStack Havana, the only way to boot a VM in Ceph was to use the
  boot-from-volume functionality of Cinder. Now, however, it is possible to
  boot every virtual machine inside Ceph directly without using Cinder, which
  is advantageous because it makes maintenance operations such as live
  migration easy to perform. Additionally, if a hypervisor dies, it is
  convenient to trigger ``nova evacuate`` and run the virtual machine
  elsewhere almost seamlessly. In doing so,
  :ref:`exclusive locks <rbd-exclusive-locks>` prevent multiple
  compute nodes from concurrently accessing the guest disk.


You can use OpenStack Glance to store images in a Ceph Block Device, and you
can use Cinder to boot a VM using a copy-on-write clone of an image.

The instructions below detail the setup for Glance, Cinder and Nova, although
they do not have to be used together. You may store images in Ceph block
devices while running VMs using a local disk, or vice versa.

.. important:: Using QCOW2 for hosting a virtual machine disk is NOT recommended.
   If you want to boot virtual machines in Ceph (ephemeral backend or boot
   from volume), please use the ``raw`` image format within Glance.

.. index:: pools; OpenStack

Create a Pool
=============

By default, Ceph block devices use the ``rbd`` pool, but you may use any
available pool. We recommend creating separate pools for Cinder volumes,
Glance images, Cinder backups, and Nova instances. Ensure your Ceph cluster
is running, then create the pools::

    ceph osd pool create volumes
    ceph osd pool create images
    ceph osd pool create backups
    ceph osd pool create vms

See `Create a Pool`_ for details on specifying the number of placement groups
for your pools, and `Placement Groups`_ for guidance on how many placement
groups to use.
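
For example, on a small cluster where 128 placement groups per pool happens to
be an appropriate choice (a sketch only; check the guidance linked above for
the right numbers in your environment), the placement group count can be given
explicitly::

    ceph osd pool create volumes 128
    ceph osd pool create images 128
    ceph osd pool create backups 128
    ceph osd pool create vms 128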

Newly created pools must be initialized prior to use. Use the ``rbd`` tool
to initialize the pools::

    rbd pool init volumes
    rbd pool init images
    rbd pool init backups
    rbd pool init vms

.. _Create a Pool: ../../rados/operations/pools#createpool
.. _Placement Groups: ../../rados/operations/placement-groups


Configure OpenStack Ceph Clients
================================

The nodes running ``glance-api``, ``cinder-volume``, ``nova-compute`` and
``cinder-backup`` act as Ceph clients. Each requires the ``ceph.conf`` file::

    ssh {your-openstack-server} sudo tee /etc/ceph/ceph.conf </etc/ceph/ceph.conf
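
If you have several OpenStack nodes, a small shell loop saves some typing.
This is only a sketch; the hostnames are placeholders for your own::

    for host in {your-glance-api-server} {your-cinder-volume-server} \
                {your-nova-compute-server} {your-cinder-backup-server}; do
        ssh $host sudo tee /etc/ceph/ceph.conf < /etc/ceph/ceph.conf
    done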


Install Ceph client packages
----------------------------

On the ``glance-api`` node, you will need the Python bindings for ``librbd``::

    sudo apt-get install python-rbd
    sudo yum install python-rbd

On the ``nova-compute``, ``cinder-backup`` and ``cinder-volume`` nodes, install
both the Python bindings and the client command line tools::

    sudo apt-get install ceph-common
    sudo yum install ceph-common


Setup Ceph Client Authentication
--------------------------------

If you have `cephx authentication`_ enabled, create a new user for Nova/Cinder
and Glance. Execute the following::

    ceph auth get-or-create client.glance mon 'profile rbd' osd 'profile rbd pool=images' mgr 'profile rbd pool=images'
    ceph auth get-or-create client.cinder mon 'profile rbd' osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images' mgr 'profile rbd pool=volumes, profile rbd pool=vms'
    ceph auth get-or-create client.cinder-backup mon 'profile rbd' osd 'profile rbd pool=backups' mgr 'profile rbd pool=backups'
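
If you want to double check the capabilities that were granted, ``ceph auth
get`` will print them back, for example::

    ceph auth get client.glance
    ceph auth get client.cinder
    ceph auth get client.cinder-backup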

Add the keyrings for ``client.cinder``, ``client.glance``, and
``client.cinder-backup`` to the appropriate nodes and change their ownership::

    ceph auth get-or-create client.glance | ssh {your-glance-api-server} sudo tee /etc/ceph/ceph.client.glance.keyring
    ssh {your-glance-api-server} sudo chown glance:glance /etc/ceph/ceph.client.glance.keyring
    ceph auth get-or-create client.cinder | ssh {your-cinder-volume-server} sudo tee /etc/ceph/ceph.client.cinder.keyring
    ssh {your-cinder-volume-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder.keyring
    ceph auth get-or-create client.cinder-backup | ssh {your-cinder-backup-server} sudo tee /etc/ceph/ceph.client.cinder-backup.keyring
    ssh {your-cinder-backup-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder-backup.keyring
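
As an optional sanity check, you can verify that a node can reach the cluster
with its new identity once its keyring and ``ceph.conf`` are in place. For
example, from the Glance host::

    ssh {your-glance-api-server} sudo rbd --id glance -p images ls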

Nodes running ``nova-compute`` need the keyring file for the ``nova-compute``
process::

    ceph auth get-or-create client.cinder | ssh {your-nova-compute-server} sudo tee /etc/ceph/ceph.client.cinder.keyring

They also need to store the secret key of the ``client.cinder`` user in
``libvirt``. The libvirt process needs it to access the cluster while attaching
a block device from Cinder.

Create a temporary copy of the secret key on the nodes running
``nova-compute``::

    ceph auth get-key client.cinder | ssh {your-compute-node} tee client.cinder.key

Then, on the compute nodes, add the secret key to ``libvirt`` and remove the
temporary copy of the key::

    uuidgen
    457eb676-33da-42ec-9a8c-9293d545c337

    cat > secret.xml <<EOF
    <secret ephemeral='no' private='no'>
      <uuid>457eb676-33da-42ec-9a8c-9293d545c337</uuid>
      <usage type='ceph'>
        <name>client.cinder secret</name>
      </usage>
    </secret>
    EOF
    sudo virsh secret-define --file secret.xml
    Secret 457eb676-33da-42ec-9a8c-9293d545c337 created
    sudo virsh secret-set-value --secret 457eb676-33da-42ec-9a8c-9293d545c337 --base64 $(cat client.cinder.key) && rm client.cinder.key secret.xml
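
If you want to confirm that the secret was stored, ``virsh`` can list the
defined secrets and print the value back (the UUID shown is the example value
from above)::

    sudo virsh secret-list
    sudo virsh secret-get-value 457eb676-33da-42ec-9a8c-9293d545c337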

Save the UUID of the secret for configuring ``nova-compute`` later.

.. important:: You don't necessarily need the UUID on all the compute nodes.
   However, from a platform consistency perspective, it's better to keep the
   same UUID.

.. _cephx authentication: ../../rados/configuration/auth-config-ref/#enabling-disabling-cephx


Configure OpenStack to use Ceph
===============================

Configuring Glance
------------------

Glance can use multiple back ends to store images. To use Ceph block devices by
default, configure Glance as follows.


Kilo and after
~~~~~~~~~~~~~~

Edit ``/etc/glance/glance-api.conf`` and add under the ``[glance_store]`` section::

    [glance_store]
    stores = rbd
    default_store = rbd
    rbd_store_pool = images
    rbd_store_user = glance
    rbd_store_ceph_conf = /etc/ceph/ceph.conf
    rbd_store_chunk_size = 8

For more information about the configuration options available in Glance,
please refer to the OpenStack Configuration Reference: http://docs.openstack.org/.
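
Once the Glance API service has been restarted with this configuration, a
simple way to confirm that image data is going to Ceph is to upload an image
in ``raw`` format and then list the ``images`` pool (run the ``rbd`` command
wherever a keyring with access to the pool is available)::

    rbd --id glance -p images ls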

Enable copy-on-write cloning of images
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Note that this exposes the back end location via Glance's API, so the endpoint
with this option enabled should not be publicly accessible.

Any OpenStack version except Mitaka
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you want to enable copy-on-write cloning of images, also add under the ``[DEFAULT]`` section::

    show_image_direct_url = True

Disable cache management (any OpenStack version)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Disable the Glance cache management to avoid images getting cached under
``/var/lib/glance/image-cache/``, assuming your configuration file has
``flavor = keystone+cachemanagement``::

    [paste_deploy]
    flavor = keystone

Image properties
~~~~~~~~~~~~~~~~

We recommend using the following properties for your images (an example of
setting them follows this list):

- ``hw_scsi_model=virtio-scsi``: add the virtio-scsi controller to get better
  performance and support for discard operations
- ``hw_disk_bus=scsi``: connect every Cinder block device to that controller
- ``hw_qemu_guest_agent=yes``: enable the QEMU guest agent
- ``os_require_quiesce=yes``: send fs-freeze/thaw calls through the QEMU guest
  agent
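
These properties can be set with the ``openstack`` client when the image is
created or at any later time; the image name below is only an example::

    openstack image set --property hw_scsi_model=virtio-scsi \
        --property hw_disk_bus=scsi \
        --property hw_qemu_guest_agent=yes \
        --property os_require_quiesce=yes \
        {image-name}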


Configuring Cinder
------------------

OpenStack requires a driver to interact with Ceph block devices. You must also
specify the pool name for the block device. On your OpenStack node, edit
``/etc/cinder/cinder.conf`` by adding::

    [DEFAULT]
    ...
    enabled_backends = ceph
    glance_api_version = 2
    ...
    [ceph]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = ceph
    rbd_pool = volumes
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_flatten_volume_from_snapshot = false
    rbd_max_clone_depth = 5
    rbd_store_chunk_size = 4
    rados_connect_timeout = -1

If you are using `cephx authentication`_, also configure the user and the UUID
of the secret you added to ``libvirt`` as documented earlier::

    [ceph]
    ...
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

Note that if you are configuring multiple Cinder back ends,
``glance_api_version = 2`` must be in the ``[DEFAULT]`` section.
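
As a sketch of what an additional back end could look like (the ``ceph-ssd``
name and the ``volumes-ssd`` pool are illustrative, not part of this guide),
you would add another section and list it in ``enabled_backends``; the
``client.cinder`` capabilities shown earlier would also need to cover the
extra pool::

    [DEFAULT]
    enabled_backends = ceph, ceph-ssd
    ...
    [ceph-ssd]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = ceph-ssd
    rbd_pool = volumes-ssd
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337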


Configuring Cinder Backup
-------------------------

OpenStack Cinder Backup requires a specific daemon, so don't forget to install
it. On your Cinder Backup node, edit ``/etc/cinder/cinder.conf`` and add::

    backup_driver = cinder.backup.drivers.ceph
    backup_ceph_conf = /etc/ceph/ceph.conf
    backup_ceph_user = cinder-backup
    backup_ceph_chunk_size = 134217728
    backup_ceph_pool = backups
    backup_ceph_stripe_unit = 0
    backup_ceph_stripe_count = 0
    restore_discard_excess_bytes = true
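
After the ``cinder-backup`` service has been restarted, backups of RBD-backed
volumes end up in the ``backups`` pool. For example (the backup name and
volume reference are illustrative)::

    openstack volume backup create --name mybackup {volume-name-or-id}
    rbd -p backups ls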


Configuring Nova to attach Ceph RBD block device
------------------------------------------------

In order to attach Cinder devices (either normal block devices or volumes you
boot from), you must tell Nova (and libvirt) which user and UUID to refer to
when attaching the device. libvirt will refer to this user when connecting and
authenticating with the Ceph cluster. Add the following to the ``[libvirt]``
section of ``/etc/nova/nova.conf``::

    [libvirt]
    ...
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

These two flags are also used by the Nova ephemeral backend.


Configuring Nova
----------------

In order to boot all the virtual machines directly into Ceph, you must
configure the ephemeral backend for Nova, as sketched below.
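
A minimal sketch of the relevant ``[libvirt]`` options in
``/etc/nova/nova.conf``, reusing the ``vms`` pool and the ``cinder`` user and
secret UUID from above (exact option availability can vary between OpenStack
releases, so check the Nova configuration reference for your version)::

    [libvirt]
    images_type = rbd
    images_rbd_pool = vms
    images_rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_user = cinder
    rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
    disk_cachemodes = "network=writeback"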

It is recommended to enable the RBD cache in your Ceph configuration file; it
has been enabled by default since the Giant release. Moreover, enabling the
client admin socket brings a lot of benefits while troubleshooting: having one
socket per virtual machine that uses a Ceph block device helps when
investigating performance problems or unexpected behavior.

This socket can be accessed like this::

    ceph daemon /var/run/ceph/ceph-client.cinder.19195.32310016.asok help

Now, on every compute node, edit the Ceph configuration file::

    [client]
    rbd cache = true
    rbd cache writethrough until flush = true
    admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
    log file = /var/log/qemu/qemu-guest-$pid.log
    rbd concurrent management ops = 20

Configure the permissions of these paths::

    mkdir -p /var/run/ceph/guests/ /var/log/qemu/
    chown qemu:libvirtd /var/run/ceph/guests /var/log/qemu/

Note that the user ``qemu`` and the group ``libvirtd`` can vary depending on
your system. The provided example works for Red Hat based systems.

.. tip:: If your virtual machine is already running, you can simply restart it
   to get the socket.


Restart OpenStack
=================

To activate the Ceph block device driver and load the block device pool name
into the configuration, you must restart OpenStack. On Debian based systems,
execute these commands on the appropriate nodes::

    sudo glance-control api restart
    sudo service nova-compute restart
    sudo service cinder-volume restart
    sudo service cinder-backup restart

On Red Hat based systems, execute::

    sudo service openstack-glance-api restart
    sudo service openstack-nova-compute restart
    sudo service openstack-cinder-volume restart
    sudo service openstack-cinder-backup restart
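
On distributions that use ``systemd``, the equivalent is typically a
``systemctl restart`` of the corresponding units (unit names can vary with
the packaging)::

    sudo systemctl restart openstack-glance-api openstack-nova-compute \
        openstack-cinder-volume openstack-cinder-backup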

Once OpenStack is up and running, you should be able to create a volume
and boot from it.


Booting from a Block Device
===========================

You can create a volume from an image using the Cinder command line tool::

    cinder create --image-id {id of image} --display-name {name of volume} {size of volume}

You can use `qemu-img`_ to convert from one format to another. For example::

    qemu-img convert -f {source-format} -O {output-format} {source-filename} {output-filename}
    qemu-img convert -f qcow2 -O raw precise-cloudimg.img precise-cloudimg.raw
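
The converted ``raw`` file can then be uploaded to Glance; the image name here
is only an example::

    openstack image create --disk-format raw --container-format bare \
        --file precise-cloudimg.raw precise-cloudimg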

When Glance and Cinder are both using Ceph block devices, the image is a
copy-on-write clone, so new volumes can be created from it quickly. In the
OpenStack dashboard, you can boot from that volume by performing the following
steps:

#. Launch a new instance.
#. Choose the image associated with the copy-on-write clone.
#. Select 'boot from volume'.
#. Select the volume you created.
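
The same can be done from the command line; the flavor and network names below
are placeholders for your environment::

    openstack server create --flavor {flavor-name} --volume {name of volume} \
        --network {network-name} {instance-name}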

.. _qemu-img: ../qemu-rbd/#running-qemu-with-rbd