]> git.proxmox.com Git - mirror_frr.git/blob - doc/developer/logging.rst
Merge pull request #10928 from anlancs/zebra-cleanup-1
[mirror_frr.git] / doc / developer / logging.rst
1 .. _logging:
2
3 .. highlight:: c
4
5 Logging
6 =======
7
8 One of the most frequent decisions to make while writing code for FRR is what
9 to log, what level to log it at, and when to log it. Here is a list of
10 recommendations for these decisions.
11
12
13 printfrr()
14 ----------
15
16 ``printfrr()`` is FRR's modified version of ``printf()``, designed to make
17 life easier when printing nontrivial datastructures. The following variants
18 are available:
19
20 .. c:function:: ssize_t snprintfrr(char *buf, size_t len, const char *fmt, ...)
21 .. c:function:: ssize_t vsnprintfrr(char *buf, size_t len, const char *fmt, va_list)
22
23 These correspond to ``snprintf``/``vsnprintf``. If you pass NULL for buf
24 or 0 for len, no output is written but the return value is still calculated.
25
26 The return value is always the full length of the output, unconstrained by
27 `len`. It does **not** include the terminating ``\0`` character. A
28 malformed format string can result in a ``-1`` return value.
29
30 .. c:function:: ssize_t csnprintfrr(char *buf, size_t len, const char *fmt, ...)
31 .. c:function:: ssize_t vcsnprintfrr(char *buf, size_t len, const char *fmt, va_list)
32
33 Same as above, but the ``c`` stands for "continue" or "concatenate". The
34 output is appended to the string instead of overwriting it.
35
36 .. c:function:: char *asprintfrr(struct memtype *mt, const char *fmt, ...)
37 .. c:function:: char *vasprintfrr(struct memtype *mt, const char *fmt, va_list)
38
39 These functions allocate a dynamic buffer (using MTYPE `mt`) and print to
40 that. If the format string is malformed, they return a copy of the format
41 string, so the return value is always non-NULL and always dynamically
42 allocated with `mt`.
43
44 .. c:function:: char *asnprintfrr(struct memtype *mt, char *buf, size_t len, const char *fmt, ...)
45 .. c:function:: char *vasnprintfrr(struct memtype *mt, char *buf, size_t len, const char *fmt, va_list)
46
47 This variant tries to use the static buffer provided, but falls back to
48 dynamic allocation if it is insufficient.
49
50 The return value can be either `buf` or a newly allocated string using
51 `mt`. You MUST free it like this::
52
53 char *ret = asnprintfrr(MTYPE_FOO, buf, sizeof(buf), ...);
54 if (ret != buf)
55 XFREE(MTYPE_FOO, ret);
56
57 .. c:function:: ssize_t bprintfrr(struct fbuf *fb, const char *fmt, ...)
58 .. c:function:: ssize_t vbprintfrr(struct fbuf *fb, const char *fmt, va_list)
59
60 These are the "lowest level" functions, which the other variants listed
61 above use to implement their functionality on top. Mainly useful for
62 implementing printfrr extensions since those get a ``struct fbuf *`` to
63 write their output to.
64
65 .. c:macro:: FMT_NSTD(expr)
66
67 This macro turns off/on format warnings as needed when non-ISO-C
68 compatible printfrr extensions are used (e.g. ``%.*p`` or ``%Ld``.)::
69
70 vty_out(vty, "standard compatible %pI4\n", &addr);
71 FMT_NSTD(vty_out(vty, "non-standard %-47.*pHX\n", (int)len, buf));
72
73 When the frr-format plugin is in use, this macro is a no-op since the
74 frr-format plugin supports all printfrr extensions. Since the FRR CI
75 includes a system with the plugin enabled, this means format errors will
76 not slip by undetected even with FMT_NSTD.
77
78 .. note::
79
80 ``printfrr()`` does not support the ``%n`` format.
81
82 AS-Safety
83 ^^^^^^^^^
84
85 ``printfrr()`` are AS-Safe under the following conditions:
86
87 * the ``[v]as[n]printfrr`` variants are not AS-Safe (allocating memory)
88 * floating point specifiers are not AS-Safe (system printf is used for these)
89 * the positional ``%1$d`` syntax should not be used (8 arguments are supported
90 while AS-Safe)
91 * extensions are only AS-Safe if their printer is AS-Safe
92
93 printfrr Extensions
94 -------------------
95
96 ``printfrr()`` format strings can be extended with suffixes after `%p` or `%d`.
97 Printf features like field lengths can be used normally with these extensions,
98 e.g. ``%-15pI4`` works correctly, **except if the extension consumes the
99 width or precision**. Extensions that do so are listed below as ``%*pXX``
100 rather than ``%pXX``.
101
102 The extension specifier after ``%p`` or ``%d`` is always an uppercase letter;
103 by means of established pattern uppercase letters and numbers form the type
104 identifier which may be followed by lowercase flags.
105
106 You can grep the FRR source for ``printfrr_ext_autoreg`` to see all extended
107 printers and what exactly they do. More printers are likely to be added as
108 needed/useful, so the list here may be outdated.
109
110 .. note::
111
112 The ``zlog_*``/``flog_*`` and ``vty_out`` functions all use printfrr
113 internally, so these extensions are available there. However, they are
114 **not** available when calling ``snprintf`` directly. You need to call
115 ``snprintfrr`` instead.
116
117 Networking data types
118 ^^^^^^^^^^^^^^^^^^^^^
119
120 .. role:: frrfmtout(code)
121
122 .. frrfmt:: %pI4 (struct in_addr *, in_addr_t *)
123
124 :frrfmtout:`1.2.3.4`
125
126 ``%pI4s``: :frrfmtout:`*` — print star instead of ``0.0.0.0`` (for multicast)
127
128 .. frrfmt:: %pI6 (struct in6_addr *)
129
130 :frrfmtout:`fe80::1234`
131
132 ``%pI6s``: :frrfmtout:`*` — print star instead of ``::`` (for multicast)
133
134 .. frrfmt:: %pEA (struct ethaddr *)
135
136 :frrfmtout:`01:23:45:67:89:ab`
137
138 .. frrfmt:: %pIA (struct ipaddr *)
139
140 :frrfmtout:`1.2.3.4` / :frrfmtout:`fe80::1234`
141
142 ``%pIAs``: — print star instead of zero address (for multicast)
143
144 .. frrfmt:: %pFX (struct prefix *)
145
146 :frrfmtout:`1.2.3.0/24` / :frrfmtout:`fe80::1234/64`
147
148 This accepts the following types:
149
150 - :c:struct:`prefix`
151 - :c:struct:`prefix_ipv4`
152 - :c:struct:`prefix_ipv6`
153 - :c:struct:`prefix_eth`
154 - :c:struct:`prefix_evpn`
155 - :c:struct:`prefix_fs`
156
157 It does **not** accept the following types:
158
159 - :c:struct:`prefix_ls`
160 - :c:struct:`prefix_rd`
161 - :c:struct:`prefix_ptr`
162 - :c:struct:`prefix_sg` (use :frrfmt:`%pPSG4`)
163 - :c:union:`prefixptr` (dereference to get :c:struct:`prefix`)
164 - :c:union:`prefixconstptr` (dereference to get :c:struct:`prefix`)
165
166 Options:
167
168 ``%pFXh``: (address only) :frrfmtout:`1.2.3.0` / :frrfmtout:`fe80::1234`
169
170 .. frrfmt:: %pPSG4 (struct prefix_sg *)
171
172 :frrfmtout:`(*,1.2.3.4)`
173
174 This is *(S,G)* output for use in zebra. (Note prefix_sg is not a prefix
175 "subclass" like the other prefix_* structs.)
176
177 .. frrfmt:: %pSU (union sockunion *)
178
179 ``%pSU``: :frrfmtout:`1.2.3.4` / :frrfmtout:`fe80::1234`
180
181 ``%pSUs``: :frrfmtout:`1.2.3.4` / :frrfmtout:`fe80::1234%89`
182 (adds IPv6 scope ID as integer)
183
184 ``%pSUp``: :frrfmtout:`1.2.3.4:567` / :frrfmtout:`[fe80::1234]:567`
185 (adds port)
186
187 ``%pSUps``: :frrfmtout:`1.2.3.4:567` / :frrfmtout:`[fe80::1234%89]:567`
188 (adds port and scope ID)
189
190 .. frrfmt:: %pRN (struct route_node *, struct bgp_node *, struct agg_node *)
191
192 :frrfmtout:`192.168.1.0/24` (dst-only node)
193
194 :frrfmtout:`2001:db8::/32 from fe80::/64` (SADR node)
195
196 .. frrfmt:: %pNH (struct nexthop *)
197
198 ``%pNHvv``: :frrfmtout:`via 1.2.3.4, eth0` — verbose zebra format
199
200 ``%pNHv``: :frrfmtout:`1.2.3.4, via eth0` — slightly less verbose zebra format
201
202 ``%pNHs``: :frrfmtout:`1.2.3.4 if 15` — same as :c:func:`nexthop2str()`
203
204 ``%pNHcg``: :frrfmtout:`1.2.3.4` — compact gateway only
205
206 ``%pNHci``: :frrfmtout:`eth0` — compact interface only
207
208 .. frrfmt:: %dPF (int)
209
210 :frrfmtout:`AF_INET`
211
212 Prints an `AF_*` / `PF_*` constant. ``PF`` is used here to avoid confusion
213 with `AFI` constants, even though the FRR codebase prefers `AF_INET` over
214 `PF_INET` & co.
215
216 .. frrfmt:: %dSO (int)
217
218 :frrfmtout:`SOCK_STREAM`
219
220 Time/interval formats
221 ^^^^^^^^^^^^^^^^^^^^^
222
223 .. frrfmt:: %pTS (struct timespec *)
224
225 .. frrfmt:: %pTV (struct timeval *)
226
227 .. frrfmt:: %pTT (time_t *)
228
229 Above 3 options internally result in the same code being called, support
230 the same flags and produce equal output with one exception: ``%pTT``
231 has no sub-second precision and the formatter will never print a
232 (nonsensical) ``.000``.
233
234 Exactly one of ``I``, ``M`` or ``R`` must immediately follow after
235 ``TS``/``TV``/``TT`` to specify whether the input is an interval, monotonic
236 timestamp or realtime timestamp:
237
238 ``%pTVI``: input is an interval, not a timestamp. Print interval.
239
240 ``%pTVIs``: input is an interval, convert to wallclock by subtracting it
241 from current time (i.e. interval has passed **s**\ ince.)
242
243 ``%pTVIu``: input is an interval, convert to wallclock by adding it to
244 current time (i.e. **u**\ ntil interval has passed.)
245
246 ``%pTVM`` - input is a timestamp on CLOCK_MONOTONIC, convert to wallclock
247 time (by grabbing current CLOCK_MONOTONIC and CLOCK_REALTIME and doing the
248 math) and print calendaric date.
249
250 ``%pTVMs`` - input is a timestamp on CLOCK_MONOTONIC, print interval
251 **s**\ ince that timestamp (elapsed.)
252
253 ``%pTVMu`` - input is a timestamp on CLOCK_MONOTONIC, print interval
254 **u**\ ntil that timestamp (deadline.)
255
256 ``%pTVR`` - input is a timestamp on CLOCK_REALTIME, print calendaric date.
257
258 ``%pTVRs`` - input is a timestamp on CLOCK_REALTIME, print interval
259 **s**\ ince that timestamp.
260
261 ``%pTVRu`` - input is a timestamp on CLOCK_REALTIME, print interval
262 **u**\ ntil that timestamp.
263
264 ``%pTVA`` - reserved for CLOCK_TAI in case a PTP implementation is
265 interfaced to FRR. Not currently implemented.
266
267 .. note::
268
269 If ``%pTVRs`` or ``%pTVRu`` are used, this is generally an indication
270 that a CLOCK_MONOTONIC timestamp should be used instead (or added in
271 parallel.) CLOCK_REALTIME might be adjusted by NTP, PTP or similar
272 procedures, causing bogus intervals to be printed.
273
274 ``%pTVM`` on first look might be assumed to have the same problem, but
275 on closer thought the assumption is always that current system time is
276 correct. And since a CLOCK_MONOTONIC interval is also quite safe to
277 assume to be correct, the (past) absolute timestamp to be printed from
278 this can likely be correct even if it doesn't match what CLOCK_REALTIME
279 would have indicated at that point in the past. This logic does,
280 however, not quite work for *future* times.
281
282 Generally speaking, almost all use cases in FRR should (and do) use
283 CLOCK_MONOTONIC (through :c:func:`monotime()`.)
284
285 Flags common to printing calendar times and intervals:
286
287 ``p``: include spaces in appropriate places (depends on selected format.)
288
289 ``%p.3TV...``: specify sub-second resolution (use with ``FMT_NSTD`` to
290 suppress gcc warning.) As noted above, ``%pTT`` will never print sub-second
291 digits since there are none. Only some formats support printing sub-second
292 digits and the default may vary.
293
294 The following flags are available for printing calendar times/dates:
295
296 (no flag): :frrfmtout:`Sat Jan 1 00:00:00 2022` - print output from
297 ``ctime()``, in local time zone. Since FRR does not currently use/enable
298 locale support, this is always the C locale. (Locale support getting added
299 is unlikely for the time being and would likely break other things worse
300 than this.)
301
302 ``i``: :frrfmtout:`2022-01-01T00:00:00.123` - ISO8601 timestamp in local
303 time zone (note there is no ``Z`` or ``+00:00`` suffix.) Defaults to
304 millisecond precision.
305
306 ``ip``: :frrfmtout:`2022-01-01 00:00:00.123` - use readable form of ISO8601
307 with space instead of ``T`` separator.
308
309 The following flags are available for printing intervals:
310
311 (no flag): :frrfmtout:`9w9d09:09:09.123` - does not match any
312 preexisting format; added because it does not lose precision (like ``t``)
313 for longer intervals without printing huge numbers (like ``h``/``m``).
314 Defaults to millisecond precision. The week/day fields are left off if
315 they're zero, ``p`` adds a space after the respective letter.
316
317 ``t``: :frrfmtout:`9w9d09h`, :frrfmtout:`9d09h09m`, :frrfmtout:`09:09:09` -
318 this replaces :c:func:`frrtime_to_interval()`. ``p`` adds spaces after
319 week/day/hour letters.
320
321 ``d``: print decimal number of seconds. Defaults to millisecond precision.
322
323 ``x`` / ``tx`` / ``dx``: Like no flag / ``t`` / ``d``, but print
324 :frrfmtout:`-` for zero or negative intervals (for use with unset timers.)
325
326 ``h``: :frrfmtout:`09:09:09`
327
328 ``hx``: :frrfmtout:`09:09:09`, :frrfmtout:`--:--:--` - this replaces
329 :c:func:`pim_time_timer_to_hhmmss()`.
330
331 ``m``: :frrfmtout:`09:09`
332
333 ``mx``: :frrfmtout:`09:09`, :frrfmtout:`--:--` - this replaces
334 :c:func:`pim_time_timer_to_mmss()`.
335
336 FRR library helper formats
337 ^^^^^^^^^^^^^^^^^^^^^^^^^^
338
339 .. frrfmt:: %pTH (struct thread *)
340
341 Print remaining time on timer thread. Interval-printing flag characters
342 listed above for ``%pTV`` can be added, e.g. ``%pTHtx``.
343
344 ``NULL`` pointers are printed as ``-``.
345
346 .. frrfmt:: %pTHD (struct thread *)
347
348 Print debugging information for given thread. Sample output:
349
350 .. code-block:: none
351
352 {(thread *)NULL}
353 {(thread *)0x55a3b5818910 arg=0x55a3b5827c50 timer r=7.824 mld_t_query() &mld_ifp->t_query from pimd/pim6_mld.c:1369}
354 {(thread *)0x55a3b5827230 arg=0x55a3b5827c50 read fd=16 mld_t_recv() &mld_ifp->t_recv from pimd/pim6_mld.c:1186}
355
356 (The output is aligned to some degree.)
357
358 FRR daemon specific formats
359 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
360
361 The following formats are only available in specific daemons, as the code
362 implementing them is part of the daemon, not the library.
363
364 zebra
365 """""
366
367 .. frrfmt:: %pZN (struct route_node *)
368
369 Print information for a RIB node, including zebra-specific data.
370
371 :frrfmtout:`::/0 src fe80::/64 (MRIB)` (``%pZN``)
372
373 :frrfmtout:`1234` (``%pZNt`` - table number)
374
375 bgpd
376 """"
377
378 .. frrfmt:: %pBD (struct bgp_dest *)
379
380 Print prefix for a BGP destination.
381
382 :frrfmtout:`fe80::1234/64`
383
384 .. frrfmt:: %pBP (struct peer *)
385
386 :frrfmtout:`192.168.1.1(leaf1.frrouting.org)`
387
388 Print BGP peer's IP and hostname together.
389
390 pimd/pim6d
391 """"""""""
392
393 .. frrfmt:: %pPA (pim_addr *)
394
395 Format IP address according to IP version (pimd vs. pim6d) being compiled.
396
397 :frrfmtout:`fe80::1234` / :frrfmtout:`10.0.0.1`
398
399 :frrfmtout:`*` (``%pPAs`` - replace 0.0.0.0/:: with star)
400
401 .. frrfmt:: %pSG (pim_sgaddr *)
402
403 Format S,G pair according to IP version (pimd vs. pim6d) being compiled.
404 Braces are included.
405
406 :frrfmtout:`(*,224.0.0.0)`
407
408
409 General utility formats
410 ^^^^^^^^^^^^^^^^^^^^^^^
411
412 .. frrfmt:: %m (no argument)
413
414 :frrfmtout:`Permission denied`
415
416 Prints ``strerror(errno)``. Does **not** consume any input argument, don't
417 pass ``errno``!
418
419 (This is a GNU extension not specific to FRR. FRR guarantees it is
420 available on all systems in printfrr, though BSDs support it in printf too.)
421
422 .. frrfmt:: %pSQ (char *)
423
424 ([S]tring [Q]uote.) Like ``%s``, but produce a quoted string. Options:
425
426 ``n`` - treat ``NULL`` as empty string instead.
427
428 ``q`` - include ``""`` quotation marks. Note: ``NULL`` is printed as
429 ``(null)``, not ``"(null)"`` unless ``n`` is used too. This is
430 intentional.
431
432 ``s`` - use escaping suitable for RFC5424 syslog. This means ``]`` is
433 escaped too.
434
435 If a length is specified (``%*pSQ`` or ``%.*pSQ``), null bytes in the input
436 string do not end the string and are just printed as ``\x00``.
437
438 .. frrfmt:: %pSE (char *)
439
440 ([S]tring [E]scape.) Like ``%s``, but escape special characters.
441 Options:
442
443 ``n`` - treat ``NULL`` as empty string instead.
444
445 Unlike :frrfmt:`%pSQ`, this escapes many more characters that are fine for
446 a quoted string but not on their own.
447
448 If a length is specified (``%*pSE`` or ``%.*pSE``), null bytes in the input
449 string do not end the string and are just printed as ``\x00``.
450
451 .. frrfmt:: %pVA (struct va_format *)
452
453 Recursively invoke printfrr, with arguments passed in through:
454
455 .. c:struct:: va_format
456
457 .. c:member:: const char *fmt
458
459 Format string to use for the recursive printfrr call.
460
461 .. c:member:: va_list *va
462
463 Formatting arguments. Note this is passed as a pointer, not - as in
464 most other places - a direct struct reference. Internally uses
465 ``va_copy()`` so repeated calls can be made (e.g. for determining
466 output length.)
467
468 .. frrfmt:: %pFB (struct fbuf *)
469
470 Insert text from a ``struct fbuf *``, i.e. the output of a call to
471 :c:func:`bprintfrr()`.
472
473 .. frrfmt:: %*pHX (void *, char *, unsigned char *)
474
475 ``%pHX``: :frrfmtout:`12 34 56 78`
476
477 ``%pHXc``: :frrfmtout:`12:34:56:78` (separate with [c]olon)
478
479 ``%pHXn``: :frrfmtout:`12345678` (separate with [n]othing)
480
481 Insert hexdump. This specifier requires a precision or width to be
482 specified. A precision (``%.*pHX``) takes precedence, but generates a
483 compiler warning since precisions are undefined for ``%p`` in ISO C. If
484 no precision is given, the width is used instead (and normal handling of
485 the width is suppressed).
486
487 Note that width and precision are ``int`` arguments, not ``size_t``. Use
488 like::
489
490 char *buf;
491 size_t len;
492
493 snprintfrr(out, sizeof(out), "... %*pHX ...", (int)len, buf);
494
495 /* with padding to width - would generate a warning due to %.*p */
496 FMT_NSTD(snprintfrr(out, sizeof(out), "... %-47.*pHX ...", (int)len, buf));
497
498 .. frrfmt:: %*pHS (void *, char *, unsigned char *)
499
500 ``%pHS``: :frrfmtout:`hex.dump`
501
502 This is a complementary format for :frrfmt:`%*pHX` to print the text
503 representation for a hexdump. Non-printable characters are replaced with
504 a dot.
505
506 Integer formats
507 ^^^^^^^^^^^^^^^
508
509 .. note::
510
511 These formats currently only exist for advanced type checking with the
512 ``frr-format`` GCC plugin. They should not be used directly since they will
513 cause compiler warnings when used without the plugin. Use with
514 :c:macro:`FMT_NSTD` if necessary.
515
516 It is possible ISO C23 may introduce another format for these, possibly
517 ``%w64d`` discussed in `JTC 1/SC 22/WG 14/N2680 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2680.pdf>`_.
518
519 .. frrfmt:: %Lu (uint64_t)
520
521 :frrfmtout:`12345`
522
523 .. frrfmt:: %Ld (int64_t)
524
525 :frrfmtout:`-12345`
526
527 Log levels
528 ----------
529
530 Errors and warnings
531 ^^^^^^^^^^^^^^^^^^^
532
533 If it is something that the user will want to look at and maybe do
534 something, it is either an **error** or a **warning**.
535
536 We're expecting that warnings and errors are in some way visible to the
537 user (in the worst case by looking at the log after the network broke, but
538 maybe by a syslog collector from all routers.) Therefore, anything that
539 needs to get the user in the loop—and only these things—are warnings or
540 errors.
541
542 Note that this doesn't necessarily mean the user needs to fix something in
543 the FRR instance. It also includes when we detect something else needs
544 fixing, for example another router, the system we're running on, or the
545 configuration. The common point is that the user should probably do
546 *something*.
547
548 Deciding between a warning and an error is slightly less obvious; the rule
549 of thumb here is that an error will cause considerable fallout beyond its
550 direct effect. Closing a BGP session due to a malformed update is an error
551 since all routes from the peer are dropped; discarding one route because
552 its attributes don't make sense is a warning.
553
554 This also loosely corresponds to the kind of reaction we're expecting from
555 the user. An error is likely to need immediate response while a warning
556 might be snoozed for a bit and addressed as part of general maintenance.
557 If a problem will self-repair (e.g. by retransmits), it should be a
558 warning—unless the impact until that self-repair is very harsh.
559
560 Examples for warnings:
561
562 * a BGP update, LSA or LSP could not be processed, but operation is
563 proceeding and the broken pieces are likely to self-fix later
564 * some kind of controller cannot be reached, but we can work without it
565 * another router is using some unknown or unsupported capability
566
567 Examples for errors:
568
569 * dropping a BGP session due to malformed data
570 * a socket for routing protocol operation cannot be opened
571 * desynchronization from network state because something went wrong
572 * *everything that we as developers would really like to be notified about,
573 i.e. some assumption in the code isn't holding up*
574
575
576 Informational messages
577 ^^^^^^^^^^^^^^^^^^^^^^
578
579 Anything that provides introspection to the user during normal operation
580 is an **info** message.
581
582 This includes all kinds of operational state transitions and events,
583 especially if they might be interesting to the user during the course of
584 figuring out a warning or an error.
585
586 By itself, these messages should mostly be statements of fact. They might
587 indicate the order and relationship in which things happened. Also covered
588 are conditions that might be "operational issues" like a link failure due
589 to an unplugged cable. If it's pretty much the point of running a routing
590 daemon for, it's not a warning or an error, just business as usual.
591
592 The user should be able to see the state of these bits from operational
593 state output, i.e. `show interface` or `show foobar neighbors`. The log
594 message indicating the change may have been printed weeks ago, but the
595 state can always be viewed. (If some state change has an info message but
596 no "show" command, maybe that command needs to be added.)
597
598 Examples:
599
600 * all kinds of up/down state changes
601
602 * interface coming up or going down
603 * addresses being added or deleted
604 * peers and neighbors coming up or going down
605
606 * rejection of some routes due to user-configured route maps
607 * backwards compatibility handling because another system on the network
608 has a different or smaller feature set
609
610 .. note::
611 The previously used **notify** priority is replaced with *info* in all
612 cases. We don't currently have a well-defined use case for it.
613
614
615 Debug messages and asserts
616 ^^^^^^^^^^^^^^^^^^^^^^^^^^
617
618 Everything that is only interesting on-demand, or only while developing,
619 is a **debug** message. It might be interesting to the user for a
620 particularly evasive issue, but in general these are details that an
621 average user might not even be able to make sense of.
622
623 Most (or all?) debug messages should be behind a `debug foobar` category
624 switch that controls which subset of these messages is currently
625 interesting and thus printed. If a debug message doesn't have such a
626 guard, there should be a good explanation as to why.
627
628 Conversely, debug messages are the only thing that should be guarded by
629 these switches. Neither info nor warning or error messages should be
630 hidden in this way.
631
632 **Asserts** should only be used as pretty crashes. We are expecting that
633 asserts remain enabled in production builds, but please try to not use
634 asserts in a way that would cause a security problem if the assert wasn't
635 there (i.e. don't use them for length checks.)
636
637 The purpose of asserts is mainly to help development and bug hunting. If
638 the daemon crashes, then having some more information is nice, and the
639 assert can provide crucial hints that cut down on the time needed to track
640 an issue. That said, if the issue can be reasonably handled and/or isn't
641 going to crash the daemon, it shouldn't be an assert.
642
643 For anything else where internal constraints are violated but we're not
644 breaking due to it, it's an error instead (not a debug.) These require
645 "user action" of notifying the developers.
646
647 Examples:
648
649 * mismatched :code:`prev`/:code:`next` pointers in lists
650 * some field that is absolutely needed is :code:`NULL`
651 * any other kind of data structure corruption that will cause the daemon
652 to crash sooner or later, one way or another
653
654 Thread-local buffering
655 ----------------------
656
657 The core logging code in :file:`lib/zlog.c` allows setting up per-thread log
658 message buffers in order to improve logging performance. The following rules
659 apply for this buffering:
660
661 * Only messages of priority *DEBUG* or *INFO* are buffered.
662 * Any higher-priority message causes the thread's entire buffer to be flushed,
663 thus message ordering is preserved on a per-thread level.
664 * There is no guarantee on ordering between different threads; in most cases
665 this is arbitrary to begin with since the threads essentially race each
666 other in printing log messages. If an order is established with some
667 synchronization primitive, add calls to :c:func:`zlog_tls_buffer_flush()`.
668 * The buffers are only ever accessed by the thread they are created by. This
669 means no locking is necessary.
670
671 Both the main/default thread and additional threads created by
672 :c:func:`frr_pthread_new()` with the default :c:func:`frr_run()` handler will
673 initialize thread-local buffering and call :c:func:`zlog_tls_buffer_flush()`
674 when idle.
675
676 If some piece of code runs for an extended period, it may be useful to insert
677 calls to :c:func:`zlog_tls_buffer_flush()` in appropriate places:
678
679 .. c:function:: void zlog_tls_buffer_flush(void)
680
681 Write out any pending log messages that the calling thread may have in its
682 buffer. This function is safe to call regardless of the per-thread log
683 buffer being set up / in use or not.
684
685 When working with threads that do not use the :c:struct:`thread_master`
686 event loop, per-thread buffers can be managed with:
687
688 .. c:function:: void zlog_tls_buffer_init(void)
689
690 Set up thread-local buffering for log messages. This function may be
691 called repeatedly without adverse effects, but remember to call
692 :c:func:`zlog_tls_buffer_fini()` at thread exit.
693
694 .. warning::
695
696 If this function is called, but :c:func:`zlog_tls_buffer_flush()` is
697 not used, log message output will lag behind since messages will only be
698 written out when the buffer is full.
699
700 Exiting the thread without calling :c:func:`zlog_tls_buffer_fini()`
701 will cause buffered log messages to be lost.
702
703 .. c:function:: void zlog_tls_buffer_fini(void)
704
705 Flush pending messages and tear down thread-local log message buffering.
706 This function may be called repeatedly regardless of whether
707 :c:func:`zlog_tls_buffer_init()` was ever called.
708
709 Log targets
710 -----------
711
712 The actual logging subsystem (in :file:`lib/zlog.c`) is heavily separated
713 from the actual log writers. It uses an atomic linked-list (`zlog_targets`)
714 with RCU to maintain the log targets to be called. This list is intended to
715 function as "backend" only, it **is not used for configuration**.
716
717 Logging targets provide their configuration layer on top of this and maintain
718 their own capability to enumerate and store their configuration. Some targets
719 (e.g. syslog) are inherently single instance and just stuff their config in
720 global variables. Others (e.g. file/fd output) are multi-instance capable.
721 There is another layer boundary here between these and the VTY configuration
722 that they use.
723
724 Basic internals
725 ^^^^^^^^^^^^^^^
726
727 .. c:struct:: zlog_target
728
729 This struct needs to be filled in by any log target and then passed to
730 :c:func:`zlog_target_replace()`. After it has been registered,
731 **RCU semantics apply**. Most changes to associated data should make a
732 copy, change that, and then replace the entire struct.
733
734 Additional per-target data should be "appended" by embedding this struct
735 into a larger one, for use with `containerof()`, and
736 :c:func:`zlog_target_clone()` and :c:func:`zlog_target_free()` should be
737 used to allocate/free the entire container struct.
738
739 Do not use this structure to maintain configuration. It should only
740 contain (a copy of) the data needed to perform the actual logging. For
741 example, the syslog target uses this:
742
743 .. code-block:: c
744
745 struct zlt_syslog {
746 struct zlog_target zt;
747 int syslog_facility;
748 };
749
750 static void zlog_syslog(struct zlog_target *zt, struct zlog_msg *msgs[], size_t nmsgs)
751 {
752 struct zlt_syslog *zte = container_of(zt, struct zlt_syslog, zt);
753 size_t i;
754
755 for (i = 0; i < nmsgs; i++)
756 if (zlog_msg_prio(msgs[i]) <= zt->prio_min)
757 syslog(zlog_msg_prio(msgs[i]) | zte->syslog_facility, "%s",
758 zlog_msg_text(msgs[i], NULL));
759 }
760
761
762 .. c:function:: struct zlog_target *zlog_target_clone(struct memtype *mt, struct zlog_target *oldzt, size_t size)
763
764 Allocates a logging target struct. Note that the ``oldzt`` argument may be
765 ``NULL`` to allocate a "from scratch". If ``oldzt`` is not ``NULL``, the
766 generic bits in :c:struct:`zlog_target` are copied. **Target specific
767 bits are not copied.**
768
769 .. c:function:: struct zlog_target *zlog_target_replace(struct zlog_target *oldzt, struct zlog_target *newzt)
770
771 Adds, replaces or deletes a logging target (either ``oldzt`` or ``newzt`` may be ``NULL``.)
772
773 Returns ``oldzt`` for freeing. The target remains possibly in use by
774 other threads until the RCU cycle ends. This implies you cannot release
775 resources (e.g. memory, file descriptors) immediately.
776
777 The replace operation is not atomic; for a brief period it is possible that
778 messages are delivered on both ``oldzt`` and ``newzt``.
779
780 .. warning::
781
782 ``oldzt`` must remain **functional** until the RCU cycle ends.
783
784 .. c:function:: void zlog_target_free(struct memtype *mt, struct zlog_target *zt)
785
786 Counterpart to :c:func:`zlog_target_clone()`, frees a target (using RCU.)
787
788 .. c:member:: void (*zlog_target.logfn)(struct zlog_target *zt, struct zlog_msg *msgs[], size_t nmsg)
789
790 Called on a target to deliver "normal" logging messages. ``msgs`` is an
791 array of opaque structs containing the actual message. Use ``zlog_msg_*``
792 functions to access message data (this is done to allow some optimizations,
793 e.g. lazy formatting the message text and timestamp as needed.)
794
795 .. note::
796
797 ``logfn()`` must check each individual message's priority value against
798 the configured ``prio_min``. While the ``prio_min`` field is common to
799 all targets and used by the core logging code to early-drop unneeded log
800 messages, the array is **not** filtered for each ``logfn()`` call.
801
802 .. c:member:: void (*zlog_target.logfn_sigsafe)(struct zlog_target *zt, const char *text, size_t len)
803
804 Called to deliver "exception" logging messages (i.e. SEGV messages.)
805 Must be Async-Signal-Safe (may not allocate memory or call "complicated"
806 libc functions.) May be ``NULL`` if the log target cannot handle this.
807
808 Standard targets
809 ^^^^^^^^^^^^^^^^
810
811 :file:`lib/zlog_targets.c` provides the standard file / fd / syslog targets.
812 The syslog target is single-instance while file / fd targets can be
813 instantiated as needed. There are 3 built-in targets that are fully
814 autonomous without any config:
815
816 - startup logging to `stderr`, until either :c:func:`zlog_startup_end()` or
817 :c:func:`zlog_aux_init()` is called.
818 - stdout logging for non-daemon programs using :c:func:`zlog_aux_init()`
819 - crashlogs written to :file:`/var/tmp/frr.daemon.crashlog`
820
821 The regular CLI/command-line logging setup is handled by :file:`lib/log_vty.c`
822 which makes the appropriate instantiations of syslog / file / fd targets.
823
824 .. todo::
825
826 :c:func:`zlog_startup_end()` should do an explicit switchover from
827 startup stderr logging to configured logging. Currently, configured logging
828 starts in parallel as soon as the respective setup is executed. This results
829 in some duplicate logging.