.. _ceph-volume-lvm-prepare:

``prepare``
===========
This subcommand allows a :term:`filestore` or :term:`bluestore` setup. It is
recommended to pre-provision a logical volume before using it with
``ceph-volume lvm``.

Logical volumes are not altered except for adding extra metadata.

.. note:: This is part of a two step process to deploy an OSD. If looking for
   a single-call way, please see :ref:`ceph-volume-lvm-create`

To help identify volumes, the process of preparing a volume (or volumes) to
work with Ceph will assign a few pieces of metadata using :term:`LVM tags`.

:term:`LVM tags` make volumes easy to discover later, and help identify them
as part of a Ceph system and what role they have (journal, filestore,
bluestore, etc.)

Although :term:`filestore` is initially supported (and is the default), the
back end can be specified with:

* :ref:`--filestore <ceph-volume-lvm-prepare_filestore>`
* :ref:`--bluestore <ceph-volume-lvm-prepare_bluestore>`

.. _ceph-volume-lvm-prepare_filestore:

``filestore``
-------------
This is the OSD backend that allows preparation of logical volumes for
a :term:`filestore` objectstore OSD.

It can use a logical volume for the OSD data and a partitioned physical device
or logical volume for the journal. No special preparation is needed for these
volumes other than following the minimum size requirements for data and
journal.

The API call looks like::

    ceph-volume lvm prepare --filestore --data data --journal journal

There is flexibility to use a raw device or partition as well for ``--data``
that will be converted to a logical volume. This is not ideal in all
situations since ``ceph-volume`` is just going to create a unique volume
group and a logical volume from that device.

When using logical volumes for ``--data``, the value *must* be a volume group
name and a logical volume name separated by a ``/``. Since logical volume
names are not enforced for uniqueness, this prevents using the wrong volume.
The ``--journal`` can be either a logical volume *or* a partition.
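
As a sketch of why the ``vg/lv`` form is unambiguous, the value splits
cleanly on the first ``/`` (the names here are hypothetical):

```shell
data="volume_group/lv_name"     # value passed to --data (hypothetical names)
vg="${data%%/*}"                # volume group: everything before the first /
lv="${data#*/}"                 # logical volume: everything after the first /
echo "$vg $lv"                  # → volume_group lv_name
```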

When using a partition, it *must* contain a ``PARTUUID`` discoverable by
``blkid``, so that it can later be identified correctly regardless of the
device name (or path).

When using a partition, this is how it would look for ``/dev/sdc1``::

    ceph-volume lvm prepare --filestore --data volume_group/lv_name --journal /dev/sdc1

For a logical volume, just like for ``--data``, a volume group and logical
volume name are required::

    ceph-volume lvm prepare --filestore --data volume_group/lv_name --journal volume_group/journal_lv

A generated UUID is used to ask the cluster for a new OSD. These two pieces
of information (the OSD ID and the UUID) are crucial for identifying an OSD
and will later be used throughout the :ref:`ceph-volume-lvm-activate` process.

The OSD data directory is created using the following convention::

    /var/lib/ceph/osd/<cluster name>-<osd id>

At this point the data volume is mounted at this location, and the journal
volume is linked::

    ln -s /path/to/journal /var/lib/ceph/osd/<cluster_name>-<osd-id>/journal

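Putting the two conventions together (the cluster name and OSD id below are
hypothetical), the resulting paths can be sketched as:

```shell
cluster=ceph                                       # hypothetical cluster name
osd_id=0                                           # hypothetical OSD id
osd_dir="/var/lib/ceph/osd/${cluster}-${osd_id}"   # OSD data directory
journal_link="${osd_dir}/journal"                  # where the journal symlink lives
echo "$osd_dir"                                    # → /var/lib/ceph/osd/ceph-0
```
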
The monmap is fetched using the bootstrap key from the OSD::

    /usr/bin/ceph --cluster ceph --name client.bootstrap-osd \
    --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring \
    mon getmap -o /var/lib/ceph/osd/<cluster name>-<osd id>/activate.monmap

``ceph-osd`` will be called to populate the OSD directory, which is already
mounted, re-using all the pieces of information from the initial steps::

    ceph-osd --cluster ceph --mkfs --mkkey -i <osd id> \
    --monmap /var/lib/ceph/osd/<cluster name>-<osd id>/activate.monmap --osd-data \
    /var/lib/ceph/osd/<cluster name>-<osd id> --osd-journal /var/lib/ceph/osd/<cluster name>-<osd id>/journal \
    --osd-uuid <osd uuid> --keyring /var/lib/ceph/osd/<cluster name>-<osd id>/keyring \
    --setuser ceph --setgroup ceph

.. _ceph-volume-lvm-existing-osds:

Existing OSDs
-------------
For existing clusters that want to use this new system and have OSDs that are
already running there are a few things to take into account:

.. warning:: this process will forcefully format the data device, destroying
   existing data, if any.

* OSD paths should follow this convention::

    /var/lib/ceph/osd/<cluster name>-<osd id>

* Preferably, no other mechanisms to mount the volume should exist; any that
  do (like ``fstab`` mount points) should be removed
* There is currently no support for encrypted volumes

The one time process for an existing OSD, with an ID of 0 and using a
``"ceph"`` cluster name, would look like::

    ceph-volume lvm prepare --filestore --osd-id 0 --osd-fsid E3D291C1-E7BF-4984-9794-B60D9FA139CB

The command line tool will not contact the monitor to generate an OSD ID and
will format the LVM device in addition to storing the metadata on it so that
it can later be started (for a detailed metadata description see
:ref:`ceph-volume-lvm-tags`).


.. _ceph-volume-lvm-prepare_bluestore:

``bluestore``
-------------
The :term:`bluestore` objectstore is the default for new OSDs. It offers a bit
more flexibility for devices. Bluestore supports the following configurations:

* A block device, a block.wal, and a block.db device
* A block device and a block.wal device
* A block device and a block.db device
* A single block device

It can accept a whole device (or partition), or a logical volume for ``block``.
If a physical device is provided it will then be turned into a logical volume.
This allows a simpler approach to using LVM, but at the cost of flexibility:
there are no options or configurations to change how the LV is created.

The ``block`` is specified with the ``--data`` flag, and in its simplest use
case it looks like::

    ceph-volume lvm prepare --bluestore --data vg/lv

A raw device can be specified in the same way::

    ceph-volume lvm prepare --bluestore --data /path/to/device

If a ``block.db`` or a ``block.wal`` is needed (they are optional for
bluestore) they can be specified with ``--block.db`` and ``--block.wal``,
accordingly. These can be a partition of a physical device or a logical
volume.

Partitions used for ``block.db`` and ``block.wal`` aren't made into logical
volumes because they can be used as-is; logical volumes are also allowed.

While creating the OSD directory, the process will use a ``tmpfs`` mount to
place all the files needed for the OSD. These files are initially created by
``ceph-osd --mkfs`` and are fully ephemeral.

A symlink is always created for the ``block`` device, and optionally for
``block.db`` and ``block.wal``. For a cluster with a default name, and an OSD
id of 0, the directory could look like::

    # ls -l /var/lib/ceph/osd/ceph-0
    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block -> /dev/ceph-be2b6fbd-bcf2-4c51-b35d-a35a162a02f0/osd-block-25cf0a05-2bc6-44ef-9137-79d65bd7ad62
    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.db -> /dev/sda1
    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.wal -> /dev/ceph/osd-wal-0
    -rw-------. 1 ceph ceph 37 Oct 20 13:05 ceph_fsid
    -rw-------. 1 ceph ceph 37 Oct 20 13:05 fsid
    -rw-------. 1 ceph ceph 55 Oct 20 13:05 keyring
    -rw-------. 1 ceph ceph  6 Oct 20 13:05 ready
    -rw-------. 1 ceph ceph 10 Oct 20 13:05 type
    -rw-------. 1 ceph ceph  2 Oct 20 13:05 whoami

In the above case, a device was used for ``block``, so ``ceph-volume`` created
a volume group and a logical volume using the following conventions:

* volume group name: ``ceph-{cluster fsid}``, or if the vg exists already,
  ``ceph-{random uuid}``

* logical volume name: ``osd-block-{osd_fsid}``

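For illustration only (the cluster fsid below is made up; the OSD fsid is the
one from the listing above), those conventions produce names like:

```shell
# hypothetical fsids, for illustration only
cluster_fsid="d4f1f6f4-2a85-4b9e-8a7f-1c1a2b3c4d5e"
osd_fsid="25cf0a05-2bc6-44ef-9137-79d65bd7ad62"
vg_name="ceph-${cluster_fsid}"        # volume group name convention
lv_name="osd-block-${osd_fsid}"       # logical volume name convention
echo "${vg_name}/${lv_name}"
```
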

Storing metadata
----------------
The following tags will get applied as part of the preparation process
regardless of the type of volume (journal or data) or OSD objectstore:

* ``cluster_fsid``
* ``encrypted``
* ``osd_fsid``
* ``osd_id``

For :term:`filestore` these tags will be added:

* ``journal_device``
* ``journal_uuid``

For :term:`bluestore` these tags will be added:

* ``block_device``
* ``block_uuid``
* ``db_device``
* ``db_uuid``
* ``wal_device``
* ``wal_uuid``

.. note:: For the complete lvm tag conventions see :ref:`ceph-volume-lvm-tag-api`


Summary
-------
To recap the ``prepare`` process for :term:`bluestore`:

#. Accept a logical volume for block or a raw device (that will get converted
   to an lv)
#. Accept partitions or logical volumes for ``block.wal`` or ``block.db``
#. Generate a UUID for the OSD
#. Ask the monitor for an OSD ID, reusing the generated UUID
#. OSD data directory is created on a tmpfs mount
#. ``block``, ``block.wal``, and ``block.db`` are symlinked if defined
#. monmap is fetched for activation
#. Data directory is populated by ``ceph-osd``
#. Logical volumes are assigned all the Ceph metadata using lvm tags


And the ``prepare`` process for :term:`filestore`:

#. Accept only logical volumes for data and journal (both required)
#. Generate a UUID for the OSD
#. Ask the monitor for an OSD ID, reusing the generated UUID
#. OSD data directory is created and the data volume mounted
#. Journal is symlinked from the data volume to the journal location
#. monmap is fetched for activation
#. Devices are mounted and the data directory is populated by ``ceph-osd``
#. Data and journal volumes are assigned all the Ceph metadata using lvm tags