.. _ceph-volume-lvm-prepare:

``prepare``
===========
This subcommand allows a :term:`filestore` or :term:`bluestore` setup. It is
recommended to pre-provision a logical volume before using it with
``ceph-volume lvm``.

Logical volumes are not altered except for adding extra metadata.

.. note:: This is part of a two-step process to deploy an OSD. If looking for
   a single-call way, please see :ref:`ceph-volume-lvm-create`

To help identify volumes, the process of preparing a volume (or volumes) to
work with Ceph involves assigning a few pieces of metadata using
:term:`LVM tags`.

:term:`LVM tags` make volumes easy to discover later, and help identify them as
part of a Ceph system and what role they have (journal, filestore, bluestore,
etc...)

Although :term:`filestore` is initially supported (and is the default), the
back end can be specified with:

* :ref:`--filestore <ceph-volume-lvm-prepare_filestore>`
* :ref:`--bluestore <ceph-volume-lvm-prepare_bluestore>`

.. _ceph-volume-lvm-prepare_filestore:

``filestore``
-------------
This is the OSD backend that allows preparation of logical volumes for
a :term:`filestore` objectstore OSD.

It can use a logical volume for the OSD data and a partitioned physical device
or logical volume for the journal. No special preparation is needed for these
volumes other than following the minimum size requirements for data and
journal.

The API call looks like::

    ceph-volume lvm prepare --filestore --data volume_group/lv_name --journal journal

For enabling :ref:`encryption <ceph-volume-lvm-encryption>`, the ``--dmcrypt`` flag is required::

    ceph-volume lvm prepare --filestore --dmcrypt --data volume_group/lv_name --journal journal

There is flexibility to use a raw device or partition as well for ``--data``,
which will be converted to a logical volume. This is not ideal in all
situations since ``ceph-volume`` is just going to create a unique volume group
and a logical volume from that device.

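To keep full control over how the volumes are laid out, a volume group and
logical volume can be pre-provisioned with standard LVM tools before calling
``ceph-volume``. A minimal sketch, assuming a spare ``/dev/sdb`` and the
hypothetical names ``ceph-vg``/``osd-data``::

    # create a physical volume, a volume group, and a logical volume
    pvcreate /dev/sdb
    vgcreate ceph-vg /dev/sdb
    lvcreate -n osd-data -l 100%FREE ceph-vg

    # the result can then be passed as: --data ceph-vg/osd-data
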
When using logical volumes for ``--data``, the value *must* be a volume group
name and a logical volume name separated by a ``/``. Since logical volume names
are not enforced for uniqueness, this prevents using the wrong volume. The
``--journal`` can be either a logical volume *or* a partition.

When using a partition, it *must* contain a ``PARTUUID`` discoverable by
``blkid``, so that it can later be identified correctly regardless of the
device name (or path).

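To check whether a given partition exposes a ``PARTUUID``, ``blkid`` can be
queried directly (``/dev/sdc1`` below is just an example device)::

    blkid -s PARTUUID -o value /dev/sdc1

If the command prints nothing, the partition has no ``PARTUUID`` and cannot be
used as a journal.
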
When using a partition, this is how it would look for ``/dev/sdc1``::

    ceph-volume lvm prepare --filestore --data volume_group/lv_name --journal /dev/sdc1

For a logical volume, just like for ``--data``, a volume group and logical
volume name are required::

    ceph-volume lvm prepare --filestore --data volume_group/lv_name --journal volume_group/journal_lv

A generated UUID is used to ask the cluster for a new OSD. These two pieces of
information (the UUID and the OSD ID) are crucial for identifying an OSD and
will later be used throughout the :ref:`ceph-volume-lvm-activate` process.
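
Asking the cluster for a new OSD boils down to a monitor call along the lines
of the following sketch (the exact flags can vary between versions)::

    ceph --cluster ceph --name client.bootstrap-osd \
        --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring \
        osd new <osd uuid>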

The OSD data directory is created using the following convention::

    /var/lib/ceph/osd/<cluster name>-<osd id>

At this point the data volume is mounted at this location, and the journal
volume is linked::

    ln -s /path/to/journal /var/lib/ceph/osd/<cluster_name>-<osd-id>/journal

The monmap is fetched using the bootstrap key from the OSD::

    /usr/bin/ceph --cluster ceph --name client.bootstrap-osd \
        --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring \
        mon getmap -o /var/lib/ceph/osd/<cluster name>-<osd id>/activate.monmap

``ceph-osd`` will be called to populate the OSD directory, which is already
mounted, re-using all the pieces of information from the initial steps::

    ceph-osd --cluster ceph --mkfs --mkkey -i <osd id> \
        --monmap /var/lib/ceph/osd/<cluster name>-<osd id>/activate.monmap --osd-data \
        /var/lib/ceph/osd/<cluster name>-<osd id> --osd-journal /var/lib/ceph/osd/<cluster name>-<osd id>/journal \
        --osd-uuid <osd uuid> --keyring /var/lib/ceph/osd/<cluster name>-<osd id>/keyring \
        --setuser ceph --setgroup ceph

.. _ceph-volume-lvm-existing-osds:

Existing OSDs
-------------
For existing clusters that want to use this new system and have OSDs that are
already running, there are a few things to take into account:

.. warning:: This process will forcefully format the data device, destroying
   existing data, if any.

* OSD paths should follow this convention::

    /var/lib/ceph/osd/<cluster name>-<osd id>

* Preferably, no other mechanisms to mount the volume should exist; any such
  mechanisms (like ``fstab`` mount points) should be removed

The one-time process for an existing OSD, with an ID of 0 and
using a ``"ceph"`` cluster name, would look like::

    ceph-volume lvm prepare --filestore --osd-id 0 --osd-fsid E3D291C1-E7BF-4984-9794-B60D9FA139CB

The command line tool will not contact the monitor to generate an OSD ID and
will format the LVM device in addition to storing the metadata on it so that it
can later be started (for a detailed metadata description see
:ref:`ceph-volume-lvm-tags`).


.. _ceph-volume-lvm-prepare_bluestore:

``bluestore``
-------------
The :term:`bluestore` objectstore is the default for new OSDs. It offers a bit
more flexibility for devices. Bluestore supports the following configurations:

* A block device, a ``block.wal``, and a ``block.db`` device
* A block device and a ``block.wal`` device
* A block device and a ``block.db`` device
* A single block device

It can accept a whole device (or partition), or a logical volume for ``block``.
If a physical device is provided it will then be turned into a logical volume.
This allows a simpler approach to using LVM but at the cost of flexibility:
there are no options or configurations to change how the LV is created.

The ``block`` is specified with the ``--data`` flag, and in its simplest use
case it looks like::

    ceph-volume lvm prepare --bluestore --data vg/lv

A raw device can be specified in the same way::

    ceph-volume lvm prepare --bluestore --data /path/to/device

For enabling :ref:`encryption <ceph-volume-lvm-encryption>`, the ``--dmcrypt`` flag is required::

    ceph-volume lvm prepare --bluestore --dmcrypt --data vg/lv

If a ``block.db`` or a ``block.wal`` is needed (they are optional for
bluestore), they can be specified with ``--block.db`` and ``--block.wal``
accordingly. These can be a physical device (they **must** be a partition) or
a logical volume.

For both ``block.db`` and ``block.wal``, partitions are not made into logical
volumes because they can be used as-is. Logical volumes are also allowed.

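For example, a hypothetical call placing the DB on a partition and the WAL on a
pre-provisioned logical volume (the device and volume names are illustrative
only) could look like::

    ceph-volume lvm prepare --bluestore --data vg/lv \
        --block.db /dev/sdb1 --block.wal vg/wal-lv
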
While creating the OSD directory, the process will use a ``tmpfs`` mount to
place all the files needed for the OSD. These files are initially created by
``ceph-osd --mkfs`` and are fully ephemeral.

A symlink is always created for the ``block`` device, and optionally for
``block.db`` and ``block.wal``. For a cluster with a default name, and an OSD
id of 0, the directory could look like::

    # ls -l /var/lib/ceph/osd/ceph-0
    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block -> /dev/ceph-be2b6fbd-bcf2-4c51-b35d-a35a162a02f0/osd-block-25cf0a05-2bc6-44ef-9137-79d65bd7ad62
    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.db -> /dev/sda1
    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.wal -> /dev/ceph/osd-wal-0
    -rw-------. 1 ceph ceph 37 Oct 20 13:05 ceph_fsid
    -rw-------. 1 ceph ceph 37 Oct 20 13:05 fsid
    -rw-------. 1 ceph ceph 55 Oct 20 13:05 keyring
    -rw-------. 1 ceph ceph  6 Oct 20 13:05 ready
    -rw-------. 1 ceph ceph 10 Oct 20 13:05 type
    -rw-------. 1 ceph ceph  2 Oct 20 13:05 whoami

In the above case, a device was used for ``block`` so ``ceph-volume`` created
a volume group and a logical volume using the following convention:

* volume group name: ``ceph-{cluster fsid}`` or, if the vg exists already,
  ``ceph-{random uuid}``

* logical volume name: ``osd-block-{osd_fsid}``


Crush device class
------------------

To set the crush device class for the OSD, use the ``--crush-device-class`` flag. This will
work for both bluestore and filestore OSDs::

    ceph-volume lvm prepare --bluestore --data vg/lv --crush-device-class foo


.. _ceph-volume-lvm-multipath:

``multipath`` support
---------------------
Devices that come from ``multipath`` are not supported as-is. The tool will
refuse to consume a raw multipath device and will report a message like::

    --> RuntimeError: Cannot use device (/dev/mapper/<name>). A vg/lv path or an existing device is needed

The reason for not supporting multipath is that, depending on the type of
multipath setup (for example an active/passive array as the underlying physical
devices), filters are required in ``lvm.conf`` to exclude the disks that are
part of those underlying devices.

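As a rough illustration only (the exact patterns depend entirely on the
multipath setup), such a filter in ``/etc/lvm/lvm.conf`` could accept the
multipath device while rejecting its member paths::

    devices {
        # accept the hypothetical multipath device, reject plain /dev/sd* paths
        filter = [ "a|^/dev/mapper/mpatha$|", "r|^/dev/sd.*|" ]
    }
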
It is unfeasible for ceph-volume to understand what type of configuration is
needed for LVM to be able to work in various different multipath scenarios. The
functionality to create the LV for you is merely a (naive) convenience;
anything that involves different settings or configuration must be provided by
a config management system, which can then provide VGs and LVs for ceph-volume
to consume.

This situation will only arise when trying to use the ceph-volume functionality
that creates a volume group and logical volume from a device. If a multipath
device is already a logical volume it *should* work, given that the LVM
configuration is done correctly to avoid issues.


Storing metadata
----------------
The following tags will get applied as part of the preparation process
regardless of the type of volume (journal or data) or OSD objectstore:

* ``cluster_fsid``
* ``encrypted``
* ``osd_fsid``
* ``osd_id``
* ``crush_device_class``

For :term:`filestore` these tags will be added:

* ``journal_device``
* ``journal_uuid``

For :term:`bluestore` these tags will be added:

* ``block_device``
* ``block_uuid``
* ``db_device``
* ``db_uuid``
* ``wal_device``
* ``wal_uuid``
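
Since the tags are stored directly on the logical volumes, they can be
inspected with plain LVM tooling; for example (the output will vary per
system)::

    lvs -o lv_name,vg_name,lv_tags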

.. note:: For the complete lvm tag conventions see :ref:`ceph-volume-lvm-tag-api`


Summary
-------
To recap the ``prepare`` process for :term:`bluestore`:

#. Accept a logical volume for block or a raw device (that will get converted
   to an lv)
#. Accept partitions or logical volumes for ``block.wal`` or ``block.db``
#. Generate a UUID for the OSD
#. Ask the monitor for an OSD ID, reusing the generated UUID
#. OSD data directory is created on a tmpfs mount
#. ``block``, ``block.wal``, and ``block.db`` are symlinked if defined
#. monmap is fetched for activation
#. Data directory is populated by ``ceph-osd``
#. Logical volumes are assigned all the Ceph metadata using lvm tags


And the ``prepare`` process for :term:`filestore`:

#. Accept a logical volume for data (or a raw device that will get converted
   to an lv) and a logical volume or partition for the journal (both required)
#. Generate a UUID for the OSD
#. Ask the monitor for an OSD ID, reusing the generated UUID
#. OSD data directory is created and data volume mounted
#. Journal is symlinked from data volume to journal location
#. monmap is fetched for activation
#. Devices are mounted and the data directory is populated by ``ceph-osd``
#. Data and journal volumes are assigned all the Ceph metadata using lvm tags