=========================
 Block Devices and Nomad
=========================

Like Kubernetes, Nomad can use Ceph Block Devices. This is made possible by
`ceph-csi`_, which allows you to dynamically provision RBD images or import
existing RBD images.

Every version of Nomad is compatible with `ceph-csi`_, but the reference
version of Nomad that was used to generate the procedures and guidance in this
document is Nomad v1.1.2, the latest version available at the time of writing.

To use Ceph Block Devices with Nomad, you must install and configure
``ceph-csi`` within your Nomad environment. The following diagram shows the
Nomad/Ceph technology stack.

.. ditaa::
   +-------------------------+-------------------------+
   | Container               | ceph--csi               |
   |                         |   node                  |
   |          ^              |    ^                    |
   |          |              |    |                    |
   +----------+--------------+-------------------------+
   |          |                   |                    |
   |          v                   |                    |
   |        Nomad                 |                    |
   |                              |                    |
   +---------------------------------------------------+
   |                          ceph--csi                |
   |                          controller               |
   +--------+------------------------------------------+
            |                                          |
            | configures                          maps |
            +---------------+         +----------------+
                            |         |
                            v         v
   +------------------------+ +------------------------+
   |                        | |        rbd--nbd        |
   |     Kernel Modules     | +------------------------+
   |                        | |         librbd         |
   +------------------------+-+------------------------+
   |                  RADOS Protocol                   |
   +------------------------+-+------------------------+
   |          OSDs          | |        Monitors        |
   +------------------------+ +------------------------+

.. note::
   Nomad has many possible task drivers, but this example uses only a Docker
   container.

.. important::
   ``ceph-csi`` uses the RBD kernel modules by default, which may not support
   all Ceph `CRUSH tunables`_ or `RBD image features`_.

Create a Pool
=============

By default, Ceph block devices use the ``rbd`` pool. Ensure that your Ceph
cluster is running, then create a pool for Nomad persistent storage:

.. prompt:: bash $

   ceph osd pool create nomad

See `Create a Pool`_ for details on specifying the number of placement groups
for your pools. See `Placement Groups`_ for details on the number of placement
groups you should set for your pools.
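
If you want to choose the placement group count yourself rather than rely on
the defaults, it can be given when the pool is created. The count of ``64``
below is only an illustrative value, not a recommendation:

.. code-block:: console

   $ # same command as above, but with explicit pg_num and pgp_num values
   $ ceph osd pool create nomad 64 64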

A newly created pool must be initialized prior to use. Use the ``rbd`` tool
to initialize the pool:

.. prompt:: bash $

   rbd pool init nomad

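
Optionally, you can confirm that the pool is now tagged for RBD use by listing
the applications enabled on it; the output shown here is only illustrative:

.. code-block:: console

   $ # "rbd pool init" enables the rbd application on the pool
   $ ceph osd pool application get nomad
   {
       "rbd": {}
   }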

Configure ceph-csi
==================

Ceph Client Authentication Setup
--------------------------------

Create a new user for Nomad and ``ceph-csi``. Execute the following command
and record the generated key:

.. code-block:: console

   $ ceph auth get-or-create client.nomad mon 'profile rbd' osd 'profile rbd pool=nomad' mgr 'profile rbd pool=nomad'
   [client.nomad]
       key = AQAlh9Rgg2vrDxAARy25T7KHabs6iskSHpAEAQ==

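
If you need to look up the key again later, it can be retrieved at any time
with ``ceph auth get``:

.. prompt:: bash $

   ceph auth get client.nomad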

Configure Nomad
---------------

Configuring Nomad to Allow Containers to Use Privileged Mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, Nomad doesn't allow containers to use privileged mode. We must
configure Nomad so that it allows containers to use privileged mode. Edit the
Nomad configuration file by adding the following configuration block to
``/etc/nomad.d/nomad.hcl``::

  plugin "docker" {
    config {
      allow_privileged = true
    }
  }

Loading the rbd module
~~~~~~~~~~~~~~~~~~~~~~

Nomad must have the ``rbd`` module loaded. Run the following command to
confirm that the ``rbd`` module is loaded:

.. code-block:: console

   $ lsmod | grep rbd
   rbd                    94208  2
   libceph               364544  1 rbd

If the ``rbd`` module is not loaded, load it:

.. prompt:: bash $

   sudo modprobe rbd

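
To ensure that the module is also loaded automatically after a reboot, on
systemd-based distributions you can add it to a ``modules-load.d`` drop-in
file. The file name used here is only an example:

.. code-block:: console

   $ # have systemd-modules-load load rbd on every boot
   $ echo rbd | sudo tee /etc/modules-load.d/rbd.conf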

Restarting Nomad
~~~~~~~~~~~~~~~~

Restart Nomad:

.. prompt:: bash $

   sudo systemctl restart nomad

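
After the restart, you can check that the Nomad client is running again and
that the Docker driver is detected; the exact output differs from system to
system:

.. code-block:: console

   $ # service state and locally detected task drivers
   $ sudo systemctl status nomad
   $ nomad node status -self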

Create ceph-csi controller and plugin nodes
===========================================

The `ceph-csi`_ plugin requires two components:

- **Controller plugin**: communicates with the provider's API.
- **Node plugin**: executes tasks on the client.

.. note::
   We'll set the ceph-csi version in these files. See `ceph-csi release`_
   for information about ceph-csi's compatibility with other versions.

Configure controller plugin
---------------------------

The controller plugin requires the Ceph monitor addresses of the Ceph
cluster. Collect both (1) the Ceph cluster's unique ``fsid`` and (2) the
monitor addresses:

.. code-block:: console

   $ ceph mon dump
   <...>
   fsid b9127830-b0cc-4e34-aa47-9d1a2e9949a8
   <...>
   0: [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] mon.a
   1: [v2:192.168.1.2:3300/0,v1:192.168.1.2:6789/0] mon.b
   2: [v2:192.168.1.3:3300/0,v1:192.168.1.3:6789/0] mon.c

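
If you only need the ``fsid``, it can also be printed on its own:

.. prompt:: bash $

   ceph fsid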

Generate a ``ceph-csi-plugin-controller.nomad`` file similar to the example
below. Substitute the ``fsid`` for "clusterID", and the monitor addresses for
"monitors"::

  job "ceph-csi-plugin-controller" {
    datacenters = ["dc1"]
    group "controller" {
      network {
        port "metrics" {}
      }
      task "ceph-controller" {
        template {
          data = <<EOF
  [{
      "clusterID": "b9127830-b0cc-4e34-aa47-9d1a2e9949a8",
      "monitors": [
          "192.168.1.1",
          "192.168.1.2",
          "192.168.1.3"
      ]
  }]
  EOF
          destination = "local/config.json"
          change_mode = "restart"
        }
        driver = "docker"
        config {
          image = "quay.io/cephcsi/cephcsi:v3.3.1"
          volumes = [
            "./local/config.json:/etc/ceph-csi-config/config.json"
          ]
          mounts = [
            {
              type = "tmpfs"
              target = "/tmp/csi/keys"
              readonly = false
              tmpfs_options = {
                size = 1000000 # size in bytes
              }
            }
          ]
          args = [
            "--type=rbd",
            "--controllerserver=true",
            "--drivername=rbd.csi.ceph.com",
            "--endpoint=unix://csi/csi.sock",
            "--nodeid=${node.unique.name}",
            "--instanceid=${node.unique.name}-controller",
            "--pidlimit=-1",
            "--logtostderr=true",
            "--v=5",
            "--metricsport=$${NOMAD_PORT_metrics}"
          ]
        }
        resources {
          cpu = 500
          memory = 256
        }
        service {
          name = "ceph-csi-controller"
          port = "metrics"
          tags = [ "prometheus" ]
        }
        csi_plugin {
          id = "ceph-csi"
          type = "controller"
          mount_dir = "/csi"
        }
      }
    }
  }


Configure plugin node
---------------------

Generate a ``ceph-csi-plugin-nodes.nomad`` file similar to the example below.
Substitute the ``fsid`` for "clusterID" and the monitor addresses for
"monitors"::

  job "ceph-csi-plugin-nodes" {
    datacenters = ["dc1"]
    type = "system"
    group "nodes" {
      network {
        port "metrics" {}
      }
      task "ceph-node" {
        driver = "docker"
        template {
          data = <<EOF
  [{
      "clusterID": "b9127830-b0cc-4e34-aa47-9d1a2e9949a8",
      "monitors": [
          "192.168.1.1",
          "192.168.1.2",
          "192.168.1.3"
      ]
  }]
  EOF
          destination = "local/config.json"
          change_mode = "restart"
        }
        config {
          image = "quay.io/cephcsi/cephcsi:v3.3.1"
          volumes = [
            "./local/config.json:/etc/ceph-csi-config/config.json"
          ]
          mounts = [
            {
              type = "tmpfs"
              target = "/tmp/csi/keys"
              readonly = false
              tmpfs_options = {
                size = 1000000 # size in bytes
              }
            }
          ]
          args = [
            "--type=rbd",
            "--drivername=rbd.csi.ceph.com",
            "--nodeserver=true",
            "--endpoint=unix://csi/csi.sock",
            "--nodeid=${node.unique.name}",
            "--instanceid=${node.unique.name}-nodes",
            "--pidlimit=-1",
            "--logtostderr=true",
            "--v=5",
            "--metricsport=$${NOMAD_PORT_metrics}"
          ]
          privileged = true
        }
        resources {
          cpu = 500
          memory = 256
        }
        service {
          name = "ceph-csi-nodes"
          port = "metrics"
          tags = [ "prometheus" ]
        }
        csi_plugin {
          id = "ceph-csi"
          type = "node"
          mount_dir = "/csi"
        }
      }
    }
  }

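
Before submitting the two job files, you can optionally check them for syntax
and validation errors:

.. code-block:: console

   $ # validation only; nothing is scheduled yet
   $ nomad job validate ceph-csi-plugin-controller.nomad
   $ nomad job validate ceph-csi-plugin-nodes.nomad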

Start plugin controller and node
--------------------------------

To start the plugin controller and the node plugins, run the following
commands:

.. prompt:: bash $

   nomad job run ceph-csi-plugin-controller.nomad
   nomad job run ceph-csi-plugin-nodes.nomad

The `ceph-csi`_ image will be downloaded.

Check the plugin status after a few minutes:

.. code-block:: console

   $ nomad plugin status ceph-csi
   ID                   = ceph-csi
   Provider             = rbd.csi.ceph.com
   Version              = 3.3.1
   Controllers Healthy  = 1
   Controllers Expected = 1
   Nodes Healthy        = 1
   Nodes Expected       = 1

   Allocations
   ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
   23b4db0c  a61ef171  nodes       4        run      running  3h26m ago  3h25m ago
   fee74115  a61ef171  controller  6        run      running  3h26m ago  3h25m ago

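
If the controller or node counts do not become healthy, the allocation logs
are usually the quickest way to find the reason. The allocation ID below is
taken from the example output above; yours will differ:

.. code-block:: console

   $ # ceph-csi logs to stderr (--logtostderr=true in the job file)
   $ nomad alloc logs -stderr 23b4db0c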

Using Ceph Block Devices
========================

Create rbd image
----------------

``ceph-csi`` requires the cephx credentials for communicating with the Ceph
cluster. Generate a ``ceph-volume.hcl`` file similar to the example below,
using the newly created ``nomad`` user ID and cephx key::

  id = "ceph-mysql"
  name = "ceph-mysql"
  type = "csi"
  plugin_id = "ceph-csi"
  capacity_max = "200G"
  capacity_min = "100G"

  capability {
    access_mode = "single-node-writer"
    attachment_mode = "file-system"
  }

  secrets {
    userID = "nomad"
    userKey = "AQAlh9Rgg2vrDxAARy25T7KHabs6iskSHpAEAQ=="
  }

  parameters {
    clusterID = "b9127830-b0cc-4e34-aa47-9d1a2e9949a8"
    pool = "nomad"
    imageFeatures = "layering"
  }

After the ``ceph-volume.hcl`` file has been generated, create the volume:

.. prompt:: bash $

   nomad volume create ceph-volume.hcl

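
Once the volume has been created, you can confirm that it is registered with
Nomad and that a backing image exists in the ``nomad`` pool. The image name is
generated by ``ceph-csi``, so it will not literally be ``ceph-mysql``:

.. code-block:: console

   $ # the volume as seen by Nomad
   $ nomad volume status ceph-mysql
   $ # the backing RBD image created by ceph-csi
   $ rbd ls nomad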

Use rbd image with a container
------------------------------

As an exercise in using an rbd image with a container, modify the Hashicorp
`nomad stateful`_ example.

Generate a ``mysql.nomad`` file similar to the example below::

  job "mysql-server" {
    datacenters = ["dc1"]
    type = "service"
    group "mysql-server" {
      count = 1
      volume "ceph-mysql" {
        type = "csi"
        attachment_mode = "file-system"
        access_mode = "single-node-writer"
        read_only = false
        source = "ceph-mysql"
      }
      network {
        port "db" {
          static = 3306
        }
      }
      restart {
        attempts = 10
        interval = "5m"
        delay = "25s"
        mode = "delay"
      }
      task "mysql-server" {
        driver = "docker"
        volume_mount {
          volume = "ceph-mysql"
          destination = "/srv"
          read_only = false
        }
        env {
          MYSQL_ROOT_PASSWORD = "password"
        }
        config {
          image = "hashicorp/mysql-portworx-demo:latest"
          args = ["--datadir", "/srv/mysql"]
          ports = ["db"]
        }
        resources {
          cpu = 500
          memory = 1024
        }
        service {
          name = "mysql-server"
          port = "db"
          check {
            type = "tcp"
            interval = "10s"
            timeout = "2s"
          }
        }
      }
    }
  }


Start the job:

.. prompt:: bash $

   nomad job run mysql.nomad

Check the status of the job:

.. code-block:: console

   $ nomad job status mysql-server
   ...
   Status = running
   ...
   Allocations
   ID        Node ID   Task Group    Version  Desired  Status   Created  Modified
   38070da7  9ad01c63  mysql-server  0        run      running  6s ago   3s ago

To check that the data is persistent, modify the database, purge the job, and
then create it again from the same file. Because the volume is still backed by
the same RBD image, the data you added will still be there.

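
Assuming the ``ceph-mysql`` volume from the previous section is still
registered, one way to exercise this is to purge and re-run the job:

.. code-block:: console

   $ # stop the job and remove it from Nomad's state
   $ nomad job stop -purge mysql-server
   $ # run it again; the volume, and therefore the RBD image, is re-attached
   $ nomad job run mysql.nomad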

.. _ceph-csi: https://github.com/ceph/ceph-csi/
.. _csi: https://www.nomadproject.io/docs/internals/plugins/csi
.. _Create a Pool: ../../rados/operations/pools#createpool
.. _Placement Groups: ../../rados/operations/placement-groups
.. _CRUSH tunables: ../../rados/operations/crush-map/#tunables
.. _RBD image features: ../rbd-config-ref/#image-features
.. _nomad stateful: https://learn.hashicorp.com/tutorials/nomad/stateful-workloads-csi-volumes?in=nomad/stateful-workloads#create-the-job-file
.. _ceph-csi release: https://github.com/ceph/ceph-csi#ceph-csi-container-images-and-release-compatibility