.. _mgr-module-dev:

ceph-mgr module developer's guide
=================================

.. warning::

    This is developer documentation, describing Ceph internals that
    are only relevant to people writing ceph-mgr modules.

Creating a module
-----------------

In pybind/mgr/, create a Python module.  Within your module, create a class
that inherits from ``MgrModule``.  For ceph-mgr to detect your module, your
directory must contain a file called ``module.py``.

The most important methods to override are:

* a ``serve`` member function for server-type modules.  This
  function should block forever.
* a ``notify`` member function if your module needs to
  take action when new cluster data is available.
* a ``handle_command`` member function if your module
  exposes CLI commands.  This approach to exposing commands
  is deprecated, however.  For more details, see :ref:`mgr-module-exposing-commands`.

Some modules interface with external orchestrators to deploy
Ceph services.  These also inherit from ``Orchestrator``, which adds
additional methods to the base ``MgrModule`` class.  See
:ref:`Orchestrator modules <orchestrator-modules>` for more on
creating these modules.
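
The overall shape of such a class can be sketched as follows.  This is only
an illustration, not code from the Ceph tree: the ``MgrModule`` class below is
a hypothetical stand-in so that the snippet runs outside ceph-mgr, and the
class name ``Module`` and the ``_stop`` event are assumptions made for the
example.  In a real ``module.py`` you would import ``MgrModule`` from
``mgr_module`` instead.

```python
import threading


class MgrModule:
    """Hypothetical stand-in so this sketch runs outside ceph-mgr.

    In a real module, use ``from mgr_module import MgrModule`` instead.
    """

    def __init__(self, *args, **kwargs):
        pass


class Module(MgrModule):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Only plain Python state here: no calls back into ceph-mgr.
        self._stop = threading.Event()

    def serve(self):
        # Block until shutdown() is called, doing periodic work.
        while not self._stop.is_set():
            # ... periodic work would go here ...
            self._stop.wait(timeout=0.01)

    def shutdown(self):
        # Wake up serve() so that it returns promptly.
        self._stop.set()

    def notify(self, notify_type, notify_id):
        # Ignore notifications: return without doing anything.
        pass
```

Using an event rather than ``time.sleep`` lets ``shutdown`` interrupt the wait
in ``serve`` immediately.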

Installing a module
-------------------

Once your module is present in the location set by the
``mgr module path`` configuration setting, you can enable it
via the ``ceph mgr module enable`` command::

    ceph mgr module enable mymodule

Note that the MgrModule interface is not stable, so any modules maintained
outside of the Ceph tree are liable to break when run against any newer
or older versions of Ceph.

.. _mgr module dev logging:

Logging
-------

Logging in Ceph manager modules is done as in any other Python program.  Just
import the ``logging`` package and get a logger instance with the
``logging.getLogger`` function.

Each module has a ``log_level`` option that specifies the current Python
logging level of the module.
To change or query the logging level of the module, use the following Ceph
commands::

    ceph config get mgr mgr/<module_name>/log_level
    ceph config set mgr mgr/<module_name>/log_level <info|debug|critical|error|warning|>

The logging level used upon the module's start is determined by the current
logging level of the mgr daemon, unless the ``log_level`` option was
previously set with the ``config set ...`` command.  The mgr daemon logging
level is mapped to the module Python logging level as follows:

* <= 0 is CRITICAL
* <= 1 is WARNING
* <= 4 is INFO
* <= +inf is DEBUG
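
The mapping above can be written out as a small function.  This is a sketch
for illustration only: the function name ``python_level_for`` is hypothetical
and is not part of the ``MgrModule`` interface.

```python
import logging

# The usual way to obtain a logger inside a module.
log = logging.getLogger(__name__)


def python_level_for(mgr_debug_level: int) -> int:
    """Map a mgr daemon logging level to a Python logging level."""
    if mgr_debug_level <= 0:
        return logging.CRITICAL
    if mgr_debug_level <= 1:
        return logging.WARNING
    if mgr_debug_level <= 4:
        return logging.INFO
    return logging.DEBUG


# Example: a daemon level of 4 corresponds to INFO.
log.setLevel(python_level_for(4))
```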

We can unset the module log level and fall back to the mgr daemon logging level
by running the following command::

    ceph config set mgr mgr/<module_name>/log_level ''

By default, modules' logging messages are processed by the Ceph logging layer,
where they are recorded in the mgr daemon's log file.
It is also possible to send a module's logging messages to its own file.

The module's log file will be located in the same directory as the mgr daemon's
log file, with the following name pattern::

    <mgr_daemon_log_file_name>.<module_name>.log

To enable file logging for a module, use the following command::

    ceph config set mgr mgr/<module_name>/log_to_file true

When a module's file logging is enabled, its logging messages stop
being written to the mgr daemon's log file and are only written to the
module's log file.

It is also possible to check the status of file logging and to disable it
with the following commands::

    ceph config get mgr mgr/<module_name>/log_to_file
    ceph config set mgr mgr/<module_name>/log_to_file false


.. _mgr-module-exposing-commands:

Exposing commands
-----------------

There are two approaches for exposing a command.  The first one is to
use the ``@CLICommand`` decorator to decorate the method which handles
the command, like this:

.. code:: python

   import errno
   from typing import Optional

   from mgr_module import CLICommand, HandleCommandResult

   # Defined inside your MgrModule subclass.
   @CLICommand('antigravity send to blackhole',
               perm='rw')
   def send_to_blackhole(self, oid: str, blackhole: Optional[str] = None, inbuf: Optional[str] = None):
       '''
       Send the specified object to black hole
       '''
       obj = self.find_object(oid)
       if obj is None:
           return HandleCommandResult(-errno.ENOENT, stderr=f"object '{oid}' not found")
       if blackhole is not None and inbuf is not None:
           try:
               location = self.decrypt(blackhole, passphrase=inbuf)
           except ValueError:
               return HandleCommandResult(-errno.EINVAL, stderr='unable to decrypt location')
       else:
           location = blackhole
       self.send_object_to(obj, location)
       return HandleCommandResult(stdout=f"the black hole swallowed '{oid}'")

The first parameter passed to ``CLICommand`` is the "name" of the command.
Since there are lots of commands in Ceph, we tend to group related commands
with a common prefix.  In this case, "antigravity" is used for this purpose,
as the author is probably designing a module which is also able to launch
rockets into deep space.

The `type annotations <https://www.python.org/dev/peps/pep-0484/>`_ for the
method parameters are mandatory here, so the usage of the command can be
properly reported to the ``ceph`` CLI, and the manager daemon can convert
the serialized command parameters sent by the clients to the expected type
before passing them to the handler method.  With properly implemented types,
one can also perform some sanity checks against the parameters!

The names of the parameters are part of the command interface, so please
take backward compatibility into consideration when changing
them.  But you **cannot** change the name of the ``inbuf`` parameter: it is
used to pass the content of the file specified by the ``ceph --in-file`` option.

The docstring of the method is used for the description of the command.

The manager daemon cooks the usage of the command from these ingredients,
like::

    antigravity send to blackhole <oid> [<blackhole>]  Send the specified object to black hole

as part of the output of ``ceph --help``.

In addition to ``@CLICommand``, you could also use ``@CLIReadCommand`` or
``@CLIWriteCommand`` if your command only requires read permissions or
write permissions respectively.

The second one is to set the ``COMMANDS`` class attribute of your module to
a list of dicts like this::

    COMMANDS = [
        {
            "cmd": "foobar name=myarg,type=CephString",
            "desc": "Do something awesome",
            "perm": "rw",
            # optional:
            "poll": "true"
        }
    ]

The ``cmd`` part of each entry is parsed in the same way as internal
Ceph mon and admin socket commands (see mon/MonCommands.h in
the Ceph source for examples).  Note that the "poll" field is optional
and defaults to false; when set to true, it indicates to the ``ceph`` CLI
that it should call this command repeatedly and output results (see
``ceph -h`` and its ``--period`` option).

Each command is expected to return a tuple ``(retval, stdout, stderr)``.
``retval`` is an integer representing a libc error code (e.g. EINVAL,
EPERM, or 0 for no error), ``stdout`` is a string containing any
non-error output, and ``stderr`` is a string containing any progress or
error explanation output.  Either or both of the two strings may be empty.

Implement the ``handle_command`` function to respond to the commands
when they are sent:


.. py:currentmodule:: mgr_module
.. automethod:: MgrModule.handle_command
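
A ``handle_command`` implementation for the ``COMMANDS`` entry shown above
might look like the following sketch.  It assumes that the command arrives as
a dict whose ``prefix`` key holds the command name and whose remaining keys
hold the named arguments; the ``MgrModule`` class is a hypothetical stand-in
so that the snippet runs outside ceph-mgr.

```python
import errno


class MgrModule:
    """Hypothetical stand-in for mgr_module.MgrModule."""


class Module(MgrModule):
    COMMANDS = [
        {
            "cmd": "foobar name=myarg,type=CephString",
            "desc": "Do something awesome",
            "perm": "rw",
        },
    ]

    def handle_command(self, inbuf, cmd):
        # Assumption: cmd["prefix"] is the command name, and each
        # declared argument appears under its own name.
        if cmd["prefix"] == "foobar":
            return 0, "did something awesome with " + cmd["myarg"], ""
        # Negative errno values signal failure, as in the example above.
        return -errno.EINVAL, "", "unknown command: " + cmd["prefix"]
```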

Configuration options
---------------------

Modules can load and store configuration options using the
``set_module_option`` and ``get_module_option`` methods.

.. note:: Use ``set_module_option`` and ``get_module_option`` to
   manage user-visible configuration options that are not blobs (like
   certificates).  If you want to persist module-internal data or
   binary configuration data consider using the `KV store`_.

You must declare your available configuration options in the
``MODULE_OPTIONS`` class attribute, like this:

.. code-block:: python

    from mgr_module import Option

    MODULE_OPTIONS = [
        Option(name="my_option")
    ]

If you try to use ``set_module_option`` or ``get_module_option`` on options not
declared in ``MODULE_OPTIONS``, an exception will be raised.

You may choose to provide setter commands in your module to perform
high level validation.  Users can also modify configuration using
the normal ``ceph config set`` command, where the configuration options
for a mgr module are named like ``mgr/<module name>/<option>``.

If a configuration option is different depending on which node the mgr
is running on, then use *localized* configuration
(``get_localized_module_option``, ``set_localized_module_option``).
This may be necessary for options such as what address to listen on.
Localized options may also be set externally with ``ceph config set``,
where the key name is like ``mgr/<module name>/<mgr id>/<option>``.

If you need to load and store data (e.g. something larger, binary, or multiline),
use the KV store instead of configuration options (see next section).

Hints for using config options:

* Reads are fast: ceph-mgr keeps a local in-memory copy, so in many cases
  you can just do a ``get_module_option`` every time you use an option, rather
  than copying it out into a variable.
* Writes block until the value is persisted (i.e. round trip to the monitor),
  but reads from another thread will see the new value immediately.
* If a user has used ``config set`` from the command line, then the new
  value will become visible to ``get_module_option`` almost immediately.  The
  mon->mgr update is asynchronous, however, so ``config set`` may return a
  fraction of a second before the new value is visible on the mgr.
* To delete a config value (i.e. revert to default), just pass ``None`` to
  ``set_module_option``.

.. automethod:: MgrModule.get_module_option
.. automethod:: MgrModule.set_module_option
.. automethod:: MgrModule.get_localized_module_option
.. automethod:: MgrModule.set_localized_module_option
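
The hints above can be illustrated with a small sketch.  Both ``Option`` and
``MgrModule`` below are hypothetical in-memory stand-ins, written only so that
the snippet runs outside ceph-mgr; they imitate the behaviour described in
this section and are not the real implementation.  The option name
``interval`` is likewise made up.

```python
class Option(dict):
    """Hypothetical stand-in for mgr_module.Option."""

    def __init__(self, name, default=None):
        super().__init__(name=name, default=default)


class MgrModule:
    """Hypothetical in-memory stand-in for mgr_module.MgrModule."""

    MODULE_OPTIONS = []

    def __init__(self):
        self._values = {}

    def _declared(self, key):
        for opt in self.MODULE_OPTIONS:
            if opt["name"] == key:
                return opt
        raise RuntimeError(f"option '{key}' is not in MODULE_OPTIONS")

    def get_module_option(self, key, default=None):
        opt = self._declared(key)
        fallback = opt["default"] if default is None else default
        return self._values.get(key, fallback)

    def set_module_option(self, key, val):
        self._declared(key)
        if val is None:
            self._values.pop(key, None)  # revert to the default
        else:
            self._values[key] = val


class Module(MgrModule):
    MODULE_OPTIONS = [Option(name="interval", default=60)]

    def current_interval(self):
        # Reads are cheap, so look the option up on every use
        # instead of caching it in an attribute.
        return int(self.get_module_option("interval"))
```

Looking the option up on each use means that a later ``config set`` takes
effect without the module having to track changes itself.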

KV store
--------

Modules have access to a private (per-module) key value store, which
is implemented using the monitor's "config-key" commands.  Use
the ``set_store`` and ``get_store`` methods to access the KV store from
your module.

The KV store commands work in a similar way to the configuration
commands.  Reads are fast, operating from a local cache.  Writes block
on persistence and do a round trip to the monitor.

This data can be accessed from outside of ceph-mgr using the
``ceph config-key [get|set]`` commands.  Key names follow the same
conventions as configuration options.  Note that any values updated
from outside of ceph-mgr will not be seen by running modules until
the next restart.  Users should be discouraged from accessing module KV
data externally -- if it is necessary for users to populate data, modules
should provide special commands to set the data via the module.

Use the ``get_store_prefix`` function to enumerate keys within
a particular prefix (i.e. all keys starting with a particular substring).


.. automethod:: MgrModule.get_store
.. automethod:: MgrModule.set_store
.. automethod:: MgrModule.get_localized_store
.. automethod:: MgrModule.set_localized_store
.. automethod:: MgrModule.get_store_prefix
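
One common use of the KV store is to persist structured module state as a
JSON string.  The sketch below assumes that values are stored as strings; the
``MgrModule`` class is a hypothetical in-memory stand-in so that the snippet
runs outside ceph-mgr, and the key name ``state`` and the helper methods are
made up for the example.

```python
import json


class MgrModule:
    """Hypothetical in-memory stand-in for mgr_module.MgrModule."""

    def __init__(self):
        self._store = {}

    def get_store(self, key, default=None):
        return self._store.get(key, default)

    def set_store(self, key, val):
        self._store[key] = val


class Module(MgrModule):
    STATE_KEY = "state"  # hypothetical key name

    def save_state(self, state):
        # Serialize to a string before handing it to the KV store.
        self.set_store(self.STATE_KEY, json.dumps(state))

    def load_state(self):
        raw = self.get_store(self.STATE_KEY)
        # Nothing stored yet: start from an empty state.
        return json.loads(raw) if raw is not None else {}
```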


Accessing cluster data
----------------------

Modules have access to the in-memory copies of the Ceph cluster's
state that the mgr maintains.  Accessor functions are exposed
as members of MgrModule.

Calls that access the cluster or daemon state are generally going
from Python into native C++ routines.  There is some overhead to this,
but much less than for example calling into a REST API or calling into
an SQL database.

There are no consistency rules about access to cluster structures or
daemon metadata.  For example, an OSD might exist in OSDMap but
have no metadata, or vice versa.  On a healthy cluster these
will be very rare transient states, but modules should be written
to cope with the possibility.

Note that these accessors must not be called in the module's ``__init__``
function.  This will result in a circular locking exception.

.. automethod:: MgrModule.get
.. automethod:: MgrModule.get_server
.. automethod:: MgrModule.list_servers
.. automethod:: MgrModule.get_metadata
.. automethod:: MgrModule.get_daemon_status
.. automethod:: MgrModule.get_perf_schema
.. automethod:: MgrModule.get_counter
.. automethod:: MgrModule.get_mgr_id

Exposing health checks
----------------------

Modules can raise first class Ceph health checks, which will be reported
in the output of ``ceph status`` and in other places that report on the
cluster's health.

If you use ``set_health_checks`` to report a problem, be sure to call
it again with an empty dict to clear your health check when the problem
goes away.

.. automethod:: MgrModule.set_health_checks
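
The raise-then-clear pattern can be sketched as follows.  The layout of the
check dict (``severity``, ``summary``, ``count``, ``detail``) is assumed from
the ``set_health_checks`` docstring, so verify it against your Ceph version.
The ``MgrModule`` class is a hypothetical stand-in that merely records what
was reported, and the check name and the "widgets" are invented.

```python
class MgrModule:
    """Hypothetical stand-in that records the last reported checks."""

    def __init__(self):
        self.reported_checks = None

    def set_health_checks(self, checks):
        self.reported_checks = checks


class Module(MgrModule):
    CHECK = "MGR_EXAMPLE_SLOW_WIDGETS"  # hypothetical check name

    def update_health(self, slow_widgets):
        if not slow_widgets:
            # Problem gone: clear the check with an empty dict.
            self.set_health_checks({})
            return
        self.set_health_checks({
            self.CHECK: {
                "severity": "warning",
                "summary": f"{len(slow_widgets)} widgets are slow",
                "count": len(slow_widgets),
                "detail": [f"widget {w} is slow" for w in slow_widgets],
            }
        })
```

Calling the same helper on every evaluation, whether or not a problem exists,
makes it hard to forget the clearing step.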

What if the mons are down?
--------------------------

The manager daemon gets much of its state (such as the cluster maps)
from the monitor.  If the monitor cluster is inaccessible, whichever
manager was active will continue to run, with the latest state it saw
still in memory.

However, if you are creating a module that shows the cluster state
to the user then you may well not want to mislead them by showing
them that out-of-date state.

To check if the manager daemon currently has a connection to
the monitor cluster, use this function:

.. automethod:: MgrModule.have_mon_connection

Reporting if your module cannot run
-----------------------------------

If your module cannot be run for any reason (such as a missing dependency),
then you can report that by implementing the ``can_run`` function.

.. automethod:: MgrModule.can_run

Note that this will only work properly if your module can always be imported:
if you are importing a dependency that may be absent, then do it in a
try/except block so that your module can be loaded far enough to use
``can_run`` even if the dependency is absent.
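
The guarded import described above might look like the following sketch.  It
assumes that ``can_run`` is a static method returning a ``(bool, str)`` pair,
as described in its docstring.  ``hypothetical_http_lib`` is a placeholder
name for an optional dependency, and ``MgrModule`` is a stand-in so that the
snippet runs outside ceph-mgr.

```python
try:
    import hypothetical_http_lib  # placeholder for an optional dependency
except ImportError as e:
    # Keep the module importable and remember why the import failed.
    hypothetical_http_lib = None
    import_error = str(e)
else:
    import_error = ""


class MgrModule:
    """Hypothetical stand-in for mgr_module.MgrModule."""


class Module(MgrModule):
    @staticmethod
    def can_run():
        if hypothetical_http_lib is None:
            return False, f"missing dependency: {import_error}"
        return True, ""
```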

Sending commands
----------------

A non-blocking facility is provided for sending monitor commands
to the cluster.

.. automethod:: MgrModule.send_command

Receiving notifications
-----------------------

The manager daemon calls the ``notify`` function on all active modules
when certain important pieces of cluster state are updated, such as the
cluster maps.

The actual data is not passed into this function; rather, it is a cue for
the module to go and read the relevant structure if it is interested.  Most
modules ignore most types of notification: to ignore a notification
simply return from this function without doing anything.

.. automethod:: MgrModule.notify

Accessing RADOS or CephFS
-------------------------

If you want to use the librados Python API to access data stored in
the Ceph cluster, you can access the ``rados`` attribute of your
``MgrModule`` instance.  This is an instance of ``rados.Rados`` which
has been constructed for you using the existing Ceph context (an internal
detail of the C++ Ceph code) of the mgr daemon.

Always use this specially constructed librados instance instead of
constructing one by hand.

Similarly, if you are using libcephfs to access the file system, then
use the libcephfs ``create_with_rados`` to construct it from the
``MgrModule.rados`` librados instance, and thereby inherit the correct context.

Remember that your module may be running while other parts of the cluster
are down: do not assume that librados or libcephfs calls will return
promptly -- consider whether to use timeouts or to block if the rest of
the cluster is not fully available.

Implementing standby mode
-------------------------

For some modules, it is useful to run on standby manager daemons as well
as on the active daemon.  For example, an HTTP server can usefully
serve HTTP redirect responses from the standby managers so that
the user can point their browser at any of the manager daemons without
having to worry about which one is active.

Standby manager daemons look for a subclass of ``StandbyModule``
in each module.  If the class is not found then the module is not
used at all on standby daemons.  If the class is found, then
its ``serve`` method is called.  Implementations of ``StandbyModule``
must inherit from ``mgr_module.MgrStandbyModule``.

The interface of ``MgrStandbyModule`` is much restricted compared to
``MgrModule`` -- none of the Ceph cluster state is available to
the module.  ``serve`` and ``shutdown`` methods are used in the same
way as a normal module class.  The ``get_active_uri`` method enables
the standby module to discover the address of its active peer in
order to make redirects.  See the ``MgrStandbyModule`` definition
in the Ceph source code for the full list of methods.

For an example of how to use this interface, look at the source code
of the ``dashboard`` module.

Communicating between modules
-----------------------------

Modules can invoke member functions of other modules.

.. automethod:: MgrModule.remote

Be sure to handle ``ImportError`` to deal with the case that the desired
module is not enabled.

If the remote method raises a Python exception, this will be converted
to a ``RuntimeError`` on the calling side, where the message string describes
the exception that was originally thrown.  If your logic intends
to handle certain errors cleanly, it is better to modify the remote method
to return an error value instead of raising an exception.

At time of writing, inter-module calls are implemented without
copies or serialization, so when you return a Python object, you're
returning a reference to that object to the calling module.  It
is recommended *not* to rely on this reference passing, as in future the
implementation may change to serialize arguments and return
values.
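
The error handling described above can be sketched as follows.  The
``MgrModule`` class is a hypothetical stand-in whose ``remote`` method only
imitates the documented behaviour (``ImportError`` when the target module is
not enabled, ``RuntimeError`` when the remote method raises).  The ``_peers``
registry, the ``hello`` module and its methods are invented for the example.

```python
class MgrModule:
    """Hypothetical stand-in imitating the documented remote() errors."""

    _peers = {}  # hypothetical registry: module name -> instance

    def remote(self, module_name, method_name, *args, **kwargs):
        if module_name not in self._peers:
            raise ImportError(f"module '{module_name}' is not enabled")
        method = getattr(self._peers[module_name], method_name)
        try:
            return method(*args, **kwargs)
        except Exception as e:
            raise RuntimeError(f"remote method raised: {e}") from e


class Module(MgrModule):
    def ask_peer(self, method_name):
        try:
            return self.remote("hello", method_name)
        except ImportError:
            # The "hello" module is not enabled: carry on without it.
            return None
        except RuntimeError:
            # The remote method raised: treat it as "no answer".
            return None
```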


Shutting down cleanly
---------------------

If a module implements the ``serve()`` method, it should also implement
the ``shutdown()`` method to shutdown cleanly: misbehaving modules
may otherwise prevent clean shutdown of ceph-mgr.

Limitations
-----------

It is not possible to call back into C++ code from a module's
``__init__()`` method.  For example calling ``self.get_module_option()`` at
this point will result in an assertion failure in ceph-mgr.  For modules
that implement the ``serve()`` method, it usually makes sense to do most
initialization inside that method instead.

Debugging
---------

Of course, we can always use the :ref:`mgr module dev logging` facility
for debugging a ceph-mgr module.  But some of us might miss `PDB`_ and the
interactive Python interpreter.  Yes, we can have them as well when developing
ceph-mgr modules!  ``ceph_mgr_repl.py`` can drop you into an interactive shell
talking to the ``selftest`` module.  With this tool, one can peek and poke the
ceph-mgr module, and use all the exposed facilities in much the same way
as we use the Python command line interpreter.  To use ``ceph_mgr_repl.py``,
we need to:

#. ready a Ceph cluster
#. enable the ``selftest`` module
#. set up the necessary environment variables
#. launch the tool

.. _PDB: https://docs.python.org/3/library/pdb.html

Following is a sample session, in which the Ceph version is queried by
inputting ``print(mgr.version)`` at the prompt.  Later, the
``timeit`` module is imported to measure the execution time of
``mgr.get_mgr_id()``.

.. code-block:: console

   $ cd build
   $ MDS=0 MGR=1 OSD=3 MON=1 ../src/vstart.sh -n -x
   $ bin/ceph mgr module enable selftest
   $ ../src/pybind/ceph_mgr_repl.py --show-env
   $ export PYTHONPATH=/home/me/ceph/src/pybind:/home/me/ceph/build/lib/cython_modules/lib.3:/home/me/ceph/src/python-common:$PYTHONPATH
   $ export LD_LIBRARY_PATH=/home/me/ceph/build/lib:$LD_LIBRARY_PATH
   $ export PYTHONPATH=/home/me/ceph/src/pybind:/home/me/ceph/build/lib/cython_modules/lib.3:/home/me/ceph/src/python-common:$PYTHONPATH
   $ export LD_LIBRARY_PATH=/home/me/ceph/build/lib:$LD_LIBRARY_PATH
   $ ../src/pybind/ceph_mgr_repl.py
   Python 3.9.2 (default, Feb 28 2021, 17:03:44)
   [GCC 10.2.1 20210110] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   (MgrModuleInteractiveConsole)
   [mgr self-test eval] >>> print(mgr.version)
   ceph version Development (no_version) quincy (dev)
   [mgr self-test eval] >>> from timeit import timeit
   [mgr self-test eval] >>> timeit(mgr.get_mgr_id)
   0.16303414600042743
   [mgr self-test eval] >>>

If you want to "talk" to a ceph-mgr module other than ``selftest`` using
this tool, you can either add a command to the module you want to debug,
exactly as the ``mgr self-test eval`` command was added to ``selftest``, or
make this simpler by promoting the ``eval()`` method to a dedicated
`Mixin`_ class, inheriting your ``MgrModule`` subclass from it, and defining
a command with it.  Assuming the prefix of the command is ``mgr my-module eval``,
one can just run

.. prompt:: bash $

   ../src/pybind/ceph_mgr_repl.py --prefix "mgr my-module eval"


.. _Mixin: https://en.wikipedia.org/wiki/Mixin

Is something missing?
---------------------

The ceph-mgr Python interface is not set in stone.  If you have a need
that is not satisfied by the current interface, please bring it up
on the ceph-devel mailing list.  While it is desired to avoid bloating
the interface, it is not generally very hard to expose existing data
to the Python code when there is a good reason.
