doc/developer/logging.rst

   1 .. _logging:
   2
   3 .. highlight:: c
   4
   5 Logging
   6 =======
   7
   8 One of the most frequent decisions to make while writing code for FRR is what
   9 to log, what level to log it at, and when to log it.  Here is a list of
  10 recommendations for these decisions.
  11
  12
  13 printfrr()
  14 ----------
  15
  16 ``printfrr()`` is FRR's modified version of ``printf()``, designed to make
  17 life easier when printing nontrivial datastructures.  The following variants
  18 are available:
  19
  20 .. c:function:: ssize_t snprintfrr(char *buf, size_t len, const char *fmt, ...)
  21 .. c:function:: ssize_t vsnprintfrr(char *buf, size_t len, const char *fmt, va_list)
  22
  23    These correspond to ``snprintf``/``vsnprintf``.  If you pass NULL for buf
  24    or 0 for len, no output is written but the return value is still calculated.
  25
  26    The return value is always the full length of the output, unconstrained by
  27    `len`.  It does **not** include the terminating ``\0`` character.  A
  28    malformed format string can result in a ``-1`` return value.
  29
  30 .. c:function:: ssize_t csnprintfrr(char *buf, size_t len, const char *fmt, ...)
  31 .. c:function:: ssize_t vcsnprintfrr(char *buf, size_t len, const char *fmt, va_list)
  32
  33    Same as above, but the ``c`` stands for "continue" or "concatenate".  The
  34    output is appended to the string instead of overwriting it.
  35
  36 .. c:function:: char *asprintfrr(struct memtype *mt, const char *fmt, ...)
  37 .. c:function:: char *vasprintfrr(struct memtype *mt, const char *fmt, va_list)
  38
  39    These functions allocate a dynamic buffer (using MTYPE `mt`) and print to
  40    that.  If the format string is malformed, they return a copy of the format
  41    string, so the return value is always non-NULL and always dynamically
  42    allocated with `mt`.
  43
  44 .. c:function:: char *asnprintfrr(struct memtype *mt, char *buf, size_t len, const char *fmt, ...)
  45 .. c:function:: char *vasnprintfrr(struct memtype *mt, char *buf, size_t len, const char *fmt, va_list)
  46
  47    This variant tries to use the static buffer provided, but falls back to
  48    dynamic allocation if it is insufficient.
  49
  50    The return value can be either `buf` or a newly allocated string using
  51    `mt`.  You MUST free it like this::
  52
  53       char *ret = asnprintfrr(MTYPE_FOO, buf, sizeof(buf), ...);
  54       if (ret != buf)
  55          XFREE(MTYPE_FOO, ret);
  56
  57 .. c:function:: ssize_t bprintfrr(struct fbuf *fb, const char *fmt, ...)
  58 .. c:function:: ssize_t vbprintfrr(struct fbuf *fb, const char *fmt, va_list)
  59
  60    These are the "lowest level" functions, which the other variants listed
  61    above use to implement their functionality on top.  Mainly useful for
  62    implementing printfrr extensions since those get a ``struct fbuf *`` to
  63    write their output to.
  64
  65 .. c:macro:: FMT_NSTD(expr)
  66
  67    This macro turns off/on format warnings as needed when non-ISO-C
  68    compatible printfrr extensions are used (e.g. ``%.*p`` or ``%Ld``.)::
  69
  70       vty_out(vty, "standard compatible %pI4\n", &addr);
  71       FMT_NSTD(vty_out(vty, "non-standard %-47.*pHX\n", (int)len, buf));
  72
  73    When the frr-format plugin is in use, this macro is a no-op since the
  74    frr-format plugin supports all printfrr extensions.  Since the FRR CI
  75    includes a system with the plugin enabled, this means format errors will
  76    not slip by undetected even with FMT_NSTD.
  77
  78 .. note::
  79
  80    ``printfrr()`` does not support the ``%n`` format.
  81
  82 AS-Safety
  83 ^^^^^^^^^
  84
  85 ``printfrr()`` are AS-Safe under the following conditions:
  86
  87 * the ``[v]as[n]printfrr`` variants are not AS-Safe (allocating memory)
  88 * floating point specifiers are not AS-Safe (system printf is used for these)
  89 * the positional ``%1$d`` syntax should not be used (8 arguments are supported
  90   while AS-Safe)
  91 * extensions are only AS-Safe if their printer is AS-Safe
  92
  93 printfrr Extensions
  94 -------------------
  95
  96 ``printfrr()`` format strings can be extended with suffixes after `%p` or `%d`.
  97 Printf features like field lengths can be used normally with these extensions,
  98 e.g. ``%-15pI4`` works correctly, **except if the extension consumes the
  99 width or precision**.  Extensions that do so are listed below as ``%*pXX``
 100 rather than ``%pXX``.
 101
 102 The extension specifier after ``%p`` or ``%d`` is always an uppercase letter;
 103 by means of established pattern uppercase letters and numbers form the type
 104 identifier which may be followed by lowercase flags.
 105
 106 You can grep the FRR source for ``printfrr_ext_autoreg`` to see all extended
 107 printers and what exactly they do.  More printers are likely to be added as
 108 needed/useful, so the list here may be outdated.
 109
 110 .. note::
 111
 112    The ``zlog_*``/``flog_*`` and ``vty_out`` functions all use printfrr
 113    internally, so these extensions are available there.  However, they are
 114    **not** available when calling ``snprintf`` directly.  You need to call
 115    ``snprintfrr`` instead.
 116
 117 Networking data types
 118 ^^^^^^^^^^^^^^^^^^^^^
 119
 120 .. role:: frrfmtout(code)
 121
 122 .. frrfmt:: %pI4 (struct in_addr *, in_addr_t *)
 123
 124    :frrfmtout:`1.2.3.4`
 125
 126    ``%pI4s``: :frrfmtout:`*` — print star instead of ``0.0.0.0`` (for multicast)
 127
 128 .. frrfmt:: %pI6 (struct in6_addr *)
 129
 130    :frrfmtout:`fe80::1234`
 131
 132    ``%pI6s``: :frrfmtout:`*` — print star instead of ``::`` (for multicast)
 133
 134 .. frrfmt:: %pEA (struct ethaddr *)
 135
 136    :frrfmtout:`01:23:45:67:89:ab`
 137
 138 .. frrfmt:: %pIA (struct ipaddr *)
 139
 140    :frrfmtout:`1.2.3.4` / :frrfmtout:`fe80::1234`
 141
 142    ``%pIAs``: — print star instead of zero address (for multicast)
 143
 144 .. frrfmt:: %pFX (struct prefix *)
 145
 146    :frrfmtout:`1.2.3.0/24` / :frrfmtout:`fe80::1234/64`
 147
 148    This accepts the following types:
 149
 150    - :c:struct:`prefix`
 151    - :c:struct:`prefix_ipv4`
 152    - :c:struct:`prefix_ipv6`
 153    - :c:struct:`prefix_eth`
 154    - :c:struct:`prefix_evpn`
 155    - :c:struct:`prefix_fs`
 156
 157    It does **not** accept the following types:
 158
 159    - :c:struct:`prefix_ls`
 160    - :c:struct:`prefix_rd`
 161    - :c:struct:`prefix_sg` (use :frrfmt:`%pPSG4`)
 162    - :c:union:`prefixptr` (dereference to get :c:struct:`prefix`)
 163    - :c:union:`prefixconstptr` (dereference to get :c:struct:`prefix`)
 164
 165    Options:
 166
 167    ``%pFXh``: (address only) :frrfmtout:`1.2.3.0` / :frrfmtout:`fe80::1234`
 168
 169 .. frrfmt:: %pPSG4 (struct prefix_sg *)
 170
 171    :frrfmtout:`(*,1.2.3.4)`
 172
 173    This is *(S,G)* output for use in zebra.  (Note prefix_sg is not a prefix
 174    "subclass" like the other prefix_* structs.)
 175
 176 .. frrfmt:: %pSU (union sockunion *)
 177
 178    ``%pSU``: :frrfmtout:`1.2.3.4` / :frrfmtout:`fe80::1234`
 179
 180    ``%pSUs``: :frrfmtout:`1.2.3.4` / :frrfmtout:`fe80::1234%89`
 181    (adds IPv6 scope ID as integer)
 182
 183    ``%pSUp``: :frrfmtout:`1.2.3.4:567` / :frrfmtout:`[fe80::1234]:567`
 184    (adds port)
 185
 186    ``%pSUps``: :frrfmtout:`1.2.3.4:567` / :frrfmtout:`[fe80::1234%89]:567`
 187    (adds port and scope ID)
 188
 189 .. frrfmt:: %pRN (struct route_node *, struct bgp_node *, struct agg_node *)
 190
 191    :frrfmtout:`192.168.1.0/24` (dst-only node)
 192
 193    :frrfmtout:`2001:db8::/32 from fe80::/64` (SADR node)
 194
 195 .. frrfmt:: %pNH (struct nexthop *)
 196
 197    ``%pNHvv``: :frrfmtout:`via 1.2.3.4, eth0` — verbose zebra format
 198
 199    ``%pNHv``: :frrfmtout:`1.2.3.4, via eth0` — slightly less verbose zebra format
 200
 201    ``%pNHs``: :frrfmtout:`1.2.3.4 if 15` — same as :c:func:`nexthop2str()`
 202
 203    ``%pNHcg``: :frrfmtout:`1.2.3.4` — compact gateway only
 204
 205    ``%pNHci``: :frrfmtout:`eth0` — compact interface only
 206
 207 .. frrfmt:: %dPF (int)
 208
 209    :frrfmtout:`AF_INET`
 210
 211    Prints an `AF_*` / `PF_*` constant.  ``PF`` is used here to avoid confusion
 212    with `AFI` constants, even though the FRR codebase prefers `AF_INET` over
 213    `PF_INET` & co.
 214
 215 .. frrfmt:: %dSO (int)
 216
 217    :frrfmtout:`SOCK_STREAM`
 218
 219 Time/interval formats
 220 ^^^^^^^^^^^^^^^^^^^^^
 221
 222 .. frrfmt:: %pTS (struct timespec *)
 223
 224 .. frrfmt:: %pTV (struct timeval *)
 225
 226 .. frrfmt:: %pTT (time_t *)
 227
 228    Above 3 options internally result in the same code being called, support
 229    the same flags and produce equal output with one exception:  ``%pTT``
 230    has no sub-second precision and the formatter will never print a
 231    (nonsensical) ``.000``.
 232
 233    Exactly one of ``I``, ``M`` or ``R`` must immediately follow after
 234    ``TS``/``TV``/``TT`` to specify whether the input is an interval, monotonic
 235    timestamp or realtime timestamp:
 236
 237    ``%pTVI``: input is an interval, not a timestamp.  Print interval.
 238
 239    ``%pTVIs``: input is an interval, convert to wallclock by subtracting it
 240    from current time (i.e. interval has passed **s**\ ince.)
 241
 242    ``%pTVIu``: input is an interval, convert to wallclock by adding it to
 243    current time (i.e. **u**\ ntil interval has passed.)
 244
 245    ``%pTVM`` - input is a timestamp on CLOCK_MONOTONIC, convert to wallclock
 246    time (by grabbing current CLOCK_MONOTONIC and CLOCK_REALTIME and doing the
 247    math) and print calendaric date.
 248
 249    ``%pTVMs`` - input is a timestamp on CLOCK_MONOTONIC, print interval
 250    **s**\ ince that timestamp (elapsed.)
 251
 252    ``%pTVMu`` - input is a timestamp on CLOCK_MONOTONIC, print interval
 253    **u**\ ntil that timestamp (deadline.)
 254
 255    ``%pTVR`` - input is a timestamp on CLOCK_REALTIME, print calendaric date.
 256
 257    ``%pTVRs`` - input is a timestamp on CLOCK_REALTIME, print interval
 258    **s**\ ince that timestamp.
 259
 260    ``%pTVRu`` - input is a timestamp on CLOCK_REALTIME, print interval
 261    **u**\ ntil that timestamp.
 262
 263    ``%pTVA`` - reserved for CLOCK_TAI in case a PTP implementation is
 264    interfaced to FRR.  Not currently implemented.
 265
 266    .. note::
 267
 268       If ``%pTVRs`` or ``%pTVRu`` are used, this is generally an indication
 269       that a CLOCK_MONOTONIC timestamp should be used instead (or added in
 270       parallel.) CLOCK_REALTIME might be adjusted by NTP, PTP or similar
 271       procedures, causing bogus intervals to be printed.
 272
 273       ``%pTVM`` on first look might be assumed to have the same problem, but
 274       on closer thought the assumption is always that current system time is
 275       correct.  And since a CLOCK_MONOTONIC interval is also quite safe to
 276       assume to be correct, the (past) absolute timestamp to be printed from
 277       this can likely be correct even if it doesn't match what CLOCK_REALTIME
 278       would have indicated at that point in the past.  This logic does,
 279       however, not quite work for *future* times.
 280
 281       Generally speaking, almost all use cases in FRR should (and do) use
 282       CLOCK_MONOTONIC (through :c:func:`monotime()`.)
 283
 284    Flags common to printing calendar times and intervals:
 285
 286    ``p``: include spaces in appropriate places (depends on selected format.)
 287
 288    ``%p.3TV...``: specify sub-second resolution (use with ``FMT_NSTD`` to
 289    suppress gcc warning.)  As noted above, ``%pTT`` will never print sub-second
 290    digits since there are none.  Only some formats support printing sub-second
 291    digits and the default may vary.
 292
 293    The following flags are available for printing calendar times/dates:
 294
 295    (no flag): :frrfmtout:`Sat Jan  1 00:00:00 2022` - print output from
 296    ``ctime()``, in local time zone.  Since FRR does not currently use/enable
 297    locale support, this is always the C locale.  (Locale support getting added
 298    is unlikely for the time being and would likely break other things worse
 299    than this.)
 300
 301    ``i``: :frrfmtout:`2022-01-01T00:00:00.123` - ISO8601 timestamp in local
 302    time zone (note there is no ``Z`` or ``+00:00`` suffix.)  Defaults to
 303    millisecond precision.
 304
 305    ``ip``: :frrfmtout:`2022-01-01 00:00:00.123` - use readable form of ISO8601
 306    with space instead of ``T`` separator.
 307
 308    The following flags are available for printing intervals:
 309
 310    (no flag): :frrfmtout:`9w9d09:09:09.123` - does not match any
 311    preexisting format;  added because it does not lose precision (like ``t``)
 312    for longer intervals without printing huge numbers (like ``h``/``m``).
 313    Defaults to millisecond precision.  The week/day fields are left off if
 314    they're zero, ``p`` adds a space after the respective letter.
 315
 316    ``t``: :frrfmtout:`9w9d09h`, :frrfmtout:`9d09h09m`, :frrfmtout:`09:09:09` -
 317    this replaces :c:func:`frrtime_to_interval()`.  ``p`` adds spaces after
 318    week/day/hour letters.
 319
 320    ``d``: print decimal number of seconds.  Defaults to millisecond precision.
 321
 322    ``x`` / ``tx`` / ``dx``: Like no flag / ``t`` / ``d``, but print
 323    :frrfmtout:`-` for zero or negative intervals (for use with unset timers.)
 324
 325    ``h``: :frrfmtout:`09:09:09`
 326
 327    ``hx``: :frrfmtout:`09:09:09`, :frrfmtout:`--:--:--` - this replaces
 328    :c:func:`pim_time_timer_to_hhmmss()`.
 329
 330    ``m``: :frrfmtout:`09:09`
 331
 332    ``mx``: :frrfmtout:`09:09`, :frrfmtout:`--:--` - this replaces
 333    :c:func:`pim_time_timer_to_mmss()`.
 334
 335 FRR library helper formats
 336 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 337
 338 .. frrfmt:: %pTH (struct event *)
 339
 340    Print remaining time on timer event. Interval-printing flag characters
 341    listed above for ``%pTV`` can be added, e.g. ``%pTHtx``.
 342
 343    ``NULL`` pointers are printed as ``-``.
 344
 345 .. frrfmt:: %pTHD (struct event *)
 346
 347    Print debugging information for given event.  Sample output:
 348
 349    .. code-block:: none
 350
 351       {(thread *)NULL}
 352       {(thread *)0x55a3b5818910 arg=0x55a3b5827c50 timer  r=7.824      mld_t_query() &mld_ifp->t_query from pimd/pim6_mld.c:1369}
 353       {(thread *)0x55a3b5827230 arg=0x55a3b5827c50 read   fd=16        mld_t_recv() &mld_ifp->t_recv from pimd/pim6_mld.c:1186}
 354
 355    (The output is aligned to some degree.)
 356
 357 FRR daemon specific formats
 358 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 359
 360 The following formats are only available in specific daemons, as the code
 361 implementing them is part of the daemon, not the library.
 362
 363 zebra
 364 """""
 365
 366 .. frrfmt:: %pZN (struct route_node *)
 367
 368    Print information for a RIB node, including zebra-specific data.
 369
 370    :frrfmtout:`::/0 src fe80::/64 (MRIB)` (``%pZN``)
 371
 372    :frrfmtout:`1234` (``%pZNt`` - table number)
 373
 374 bgpd
 375 """"
 376
 377 .. frrfmt:: %pBD (struct bgp_dest *)
 378
 379    Print prefix for a BGP destination.
 380
 381    :frrfmtout:`fe80::1234/64`
 382
 383 .. frrfmt:: %pBP (struct peer *)
 384
 385    :frrfmtout:`192.168.1.1(leaf1.frrouting.org)`
 386
 387    Print BGP peer's IP and hostname together.
 388
 389 pimd/pim6d
 390 """"""""""
 391
 392 .. frrfmt:: %pPA (pim_addr *)
 393
 394    Format IP address according to IP version (pimd vs. pim6d) being compiled.
 395
 396    :frrfmtout:`fe80::1234` / :frrfmtout:`10.0.0.1`
 397
 398    :frrfmtout:`*` (``%pPAs`` - replace 0.0.0.0/:: with star)
 399
 400 .. frrfmt:: %pSG (pim_sgaddr *)
 401
 402    Format S,G pair according to IP version (pimd vs. pim6d) being compiled.
 403    Braces are included.
 404
 405    :frrfmtout:`(*,224.0.0.0)`
 406
 407
 408 General utility formats
 409 ^^^^^^^^^^^^^^^^^^^^^^^
 410
 411 .. frrfmt:: %m (no argument)
 412
 413    :frrfmtout:`Permission denied`
 414
 415    Prints ``strerror(errno)``.  Does **not** consume any input argument, don't
 416    pass ``errno``!
 417
 418    (This is a GNU extension not specific to FRR.  FRR guarantees it is
 419    available on all systems in printfrr, though BSDs support it in printf too.)
 420
 421 .. frrfmt:: %pSQ (char *)
 422
 423    ([S]tring [Q]uote.)  Like ``%s``, but produce a quoted string.  Options:
 424
 425       ``n`` - treat ``NULL`` as empty string instead.
 426
 427       ``q`` - include ``""`` quotation marks.  Note: ``NULL`` is printed as
 428       ``(null)``, not ``"(null)"`` unless ``n`` is used too.  This is
 429       intentional.
 430
 431       ``s`` - use escaping suitable for RFC5424 syslog.  This means ``]`` is
 432       escaped too.
 433
 434    If a length is specified (``%*pSQ`` or ``%.*pSQ``), null bytes in the input
 435    string do not end the string and are just printed as ``\x00``.
 436
 437 .. frrfmt:: %pSE (char *)
 438
 439    ([S]tring [E]scape.)  Like ``%s``, but escape special characters.
 440    Options:
 441
 442       ``n`` - treat ``NULL`` as empty string instead.
 443
 444    Unlike :frrfmt:`%pSQ`, this escapes many more characters that are fine for
 445    a quoted string but not on their own.
 446
 447    If a length is specified (``%*pSE`` or ``%.*pSE``), null bytes in the input
 448    string do not end the string and are just printed as ``\x00``.
 449
 450 .. frrfmt:: %pVA (struct va_format *)
 451
 452    Recursively invoke printfrr, with arguments passed in through:
 453
 454    .. c:struct:: va_format
 455
 456       .. c:member:: const char *fmt
 457
 458          Format string to use for the recursive printfrr call.
 459
 460       .. c:member:: va_list *va
 461
 462          Formatting arguments.  Note this is passed as a pointer, not - as in
 463          most other places - a direct struct reference.  Internally uses
 464          ``va_copy()`` so repeated calls can be made (e.g. for determining
 465          output length.)
 466
 467 .. frrfmt:: %pFB (struct fbuf *)
 468
 469    Insert text from a ``struct fbuf *``, i.e. the output of a call to
 470    :c:func:`bprintfrr()`.
 471
 472 .. frrfmt:: %*pHX (void *, char *, unsigned char *)
 473
 474    ``%pHX``: :frrfmtout:`12 34 56 78`
 475
 476    ``%pHXc``: :frrfmtout:`12:34:56:78` (separate with [c]olon)
 477
 478    ``%pHXn``: :frrfmtout:`12345678` (separate with [n]othing)
 479
 480    Insert hexdump.  This specifier requires a precision or width to be
 481    specified.  A precision (``%.*pHX``) takes precedence, but generates a
 482    compiler warning since precisions are undefined for ``%p`` in ISO C.  If
 483    no precision is given, the width is used instead (and normal handling of
 484    the width is suppressed).
 485
 486    Note that width and precision are ``int`` arguments, not ``size_t``.  Use
 487    like::
 488
 489      char *buf;
 490      size_t len;
 491
 492      snprintfrr(out, sizeof(out), "... %*pHX ...", (int)len, buf);
 493
 494      /* with padding to width - would generate a warning due to %.*p */
 495      FMT_NSTD(snprintfrr(out, sizeof(out), "... %-47.*pHX ...", (int)len, buf));
 496
 497 .. frrfmt:: %*pHS (void *, char *, unsigned char *)
 498
 499    ``%pHS``: :frrfmtout:`hex.dump`
 500
 501    This is a complementary format for :frrfmt:`%*pHX` to print the text
 502    representation for a hexdump.  Non-printable characters are replaced with
 503    a dot.
 504
 505 .. frrfmt:: %pIS (struct iso_address *)
 506
 507    ([IS]o Network address) - Format ISO Network Address
 508
 509    ``%pIS``: :frrfmtout:`01.0203.04O5`
 510    ISO Network address is printed as separated byte. The number of byte of the
 511    address is embeded in the `iso_net` structure.
 512
 513    ``%pISl``: :frrfmtout:`01.0203.04O5.0607.0809.1011.1213.14` - long format to
 514    print the long version of the ISO Network address which include the System
 515    ID and the PSEUDO-ID of the IS-IS system
 516
 517    Note that the `ISO_ADDR_STRLEN` define gives the total size of the string
 518    that could be used in conjunction to snprintfrr. Use like::
 519
 520      char buf[ISO_ADDR_STRLEN];
 521      struct iso_address addr = {.addr_len = 4, .area_addr = {1, 2, 3, 4}};
 522      snprintfrr(buf, ISO_ADDR_STRLEN, "%pIS", &addr);
 523
 524 .. frrfmt:: %pSY (uint8_t *)
 525
 526    (IS-IS [SY]stem ID) - Format IS-IS System ID
 527
 528    ``%pSY``: :frrfmtout:`0102.0304.0506`
 529
 530 .. frrfmt:: %pPN (uint8_t *)
 531
 532    (IS-IS [P]seudo [N]ode System ID) - Format IS-IS Pseudo Node System ID
 533
 534    ``%pPN``: :frrfmtout:`0102.0304.0506.07`
 535
 536 .. frrfmt:: %pLS (uint8_t *)
 537
 538    (IS-IS [L]sp fragment [S]ystem ID) - Format IS-IS Pseudo System ID
 539
 540    ``%pLS``: :frrfmtout:`0102.0304.0506.07-08`
 541
 542    Note that the `ISO_SYSID_STRLEN` define gives the total size of the string
 543    that could be used in conjunction to snprintfrr. Use like::
 544
 545      char buf[ISO_SYSID_STRLEN];
 546      uint8_t id[8] = {1, 2, 3, 4 , 5 , 6 , 7, 8};
 547      snprintfrr(buf, SYS_ID_SIZE, "%pSY", id);
 548
 549
 550 Integer formats
 551 ^^^^^^^^^^^^^^^
 552
 553 .. note::
 554
 555    These formats currently only exist for advanced type checking with the
 556    ``frr-format`` GCC plugin.  They should not be used directly since they will
 557    cause compiler warnings when used without the plugin.  Use with
 558    :c:macro:`FMT_NSTD` if necessary.
 559
 560    It is possible ISO C23 may introduce another format for these, possibly
 561    ``%w64d`` discussed in `JTC 1/SC 22/WG 14/N2680 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2680.pdf>`_.
 562
 563 .. frrfmt:: %Lu (uint64_t)
 564
 565    :frrfmtout:`12345`
 566
 567 .. frrfmt:: %Ld (int64_t)
 568
 569    :frrfmtout:`-12345`
 570
 571 Log levels
 572 ----------
 573
 574 Errors and warnings
 575 ^^^^^^^^^^^^^^^^^^^
 576
 577 If it is something that the user will want to look at and maybe do
 578 something, it is either an **error** or a **warning**.
 579
 580 We're expecting that warnings and errors are in some way visible to the
 581 user (in the worst case by looking at the log after the network broke, but
 582 maybe by a syslog collector from all routers.)  Therefore, anything that
 583 needs to get the user in the loop—and only these things—are warnings or
 584 errors.
 585
 586 Note that this doesn't necessarily mean the user needs to fix something in
 587 the FRR instance.  It also includes when we detect something else needs
 588 fixing, for example another router, the system we're running on, or the
 589 configuration.  The common point is that the user should probably do
 590 *something*.
 591
 592 Deciding between a warning and an error is slightly less obvious; the rule
 593 of thumb here is that an error will cause considerable fallout beyond its
 594 direct effect.  Closing a BGP session due to a malformed update is an error
 595 since all routes from the peer are dropped; discarding one route because
 596 its attributes don't make sense is a warning.
 597
 598 This also loosely corresponds to the kind of reaction we're expecting from
 599 the user.  An error is likely to need immediate response while a warning
 600 might be snoozed for a bit and addressed as part of general maintenance.
 601 If a problem will self-repair (e.g. by retransmits), it should be a
 602 warning—unless the impact until that self-repair is very harsh.
 603
 604 Examples for warnings:
 605
 606 * a BGP update, LSA or LSP could not be processed, but operation is
 607   proceeding and the broken pieces are likely to self-fix later
 608 * some kind of controller cannot be reached, but we can work without it
 609 * another router is using some unknown or unsupported capability
 610
 611 Examples for errors:
 612
 613 * dropping a BGP session due to malformed data
 614 * a socket for routing protocol operation cannot be opened
 615 * desynchronization from network state because something went wrong
 616 * *everything that we as developers would really like to be notified about,
 617   i.e. some assumption in the code isn't holding up*
 618
 619
 620 Informational messages
 621 ^^^^^^^^^^^^^^^^^^^^^^
 622
 623 Anything that provides introspection to the user during normal operation
 624 is an **info** message.
 625
 626 This includes all kinds of operational state transitions and events,
 627 especially if they might be interesting to the user during the course of
 628 figuring out a warning or an error.
 629
 630 By itself, these messages should mostly be statements of fact.  They might
 631 indicate the order and relationship in which things happened.  Also covered
 632 are conditions that might be "operational issues" like a link failure due
 633 to an unplugged cable.  If it's pretty much the point of running a routing
 634 daemon for, it's not a warning or an error, just business as usual.
 635
 636 The user should be able to see the state of these bits from operational
 637 state output, i.e. `show interface` or `show foobar neighbors`.  The log
 638 message indicating the change may have been printed weeks ago, but the
 639 state can always be viewed.  (If some state change has an info message but
 640 no "show" command, maybe that command needs to be added.)
 641
 642 Examples:
 643
 644 * all kinds of up/down state changes
 645
 646   * interface coming up or going down
 647   * addresses being added or deleted
 648   * peers and neighbors coming up or going down
 649
 650 * rejection of some routes due to user-configured route maps
 651 * backwards compatibility handling because another system on the network
 652   has a different or smaller feature set
 653
 654 .. note::
 655    The previously used **notify** priority is replaced with *info* in all
 656    cases.  We don't currently have a well-defined use case for it.
 657
 658
 659 Debug messages and asserts
 660 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 661
 662 Everything that is only interesting on-demand, or only while developing,
 663 is a **debug** message.  It might be interesting to the user for a
 664 particularly evasive issue, but in general these are details that an
 665 average user might not even be able to make sense of.
 666
 667 Most (or all?) debug messages should be behind a `debug foobar` category
 668 switch that controls which subset of these messages is currently
 669 interesting and thus printed.  If a debug message doesn't have such a
 670 guard, there should be a good explanation as to why.
 671
 672 Conversely, debug messages are the only thing that should be guarded by
 673 these switches.  Neither info nor warning or error messages should be
 674 hidden in this way.
 675
 676 **Asserts** should only be used as pretty crashes.  We are expecting that
 677 asserts remain enabled in production builds, but please try to not use
 678 asserts in a way that would cause a security problem if the assert wasn't
 679 there (i.e. don't use them for length checks.)
 680
 681 The purpose of asserts is mainly to help development and bug hunting.  If
 682 the daemon crashes, then having some more information is nice, and the
 683 assert can provide crucial hints that cut down on the time needed to track
 684 an issue.  That said, if the issue can be reasonably handled and/or isn't
 685 going to crash the daemon, it shouldn't be an assert.
 686
 687 For anything else where internal constraints are violated but we're not
 688 breaking due to it, it's an error instead (not a debug.)  These require
 689 "user action" of notifying the developers.
 690
 691 Examples:
 692
 693 * mismatched :code:`prev`/:code:`next` pointers in lists
 694 * some field that is absolutely needed is :code:`NULL`
 695 * any other kind of data structure corruption that will cause the daemon
 696   to crash sooner or later, one way or another
 697
 698 Thread-local buffering
 699 ----------------------
 700
 701 The core logging code in :file:`lib/zlog.c` allows setting up per-thread log
 702 message buffers in order to improve logging performance.  The following rules
 703 apply for this buffering:
 704
 705 * Only messages of priority *DEBUG* or *INFO* are buffered.
 706 * Any higher-priority message causes the thread's entire buffer to be flushed,
 707   thus message ordering is preserved on a per-thread level.
 708 * There is no guarantee on ordering between different threads;  in most cases
 709   this is arbitrary to begin with since the threads essentially race each
 710   other in printing log messages.  If an order is established with some
 711   synchronization primitive, add calls to :c:func:`zlog_tls_buffer_flush()`.
 712 * The buffers are only ever accessed by the thread they are created by.  This
 713   means no locking is necessary.
 714
 715 Both the main/default thread and additional threads created by
 716 :c:func:`frr_pthread_new()` with the default :c:func:`frr_run()` handler will
 717 initialize thread-local buffering and call :c:func:`zlog_tls_buffer_flush()`
 718 when idle.
 719
 720 If some piece of code runs for an extended period, it may be useful to insert
 721 calls to :c:func:`zlog_tls_buffer_flush()` in appropriate places:
 722
 723 .. c:function:: void zlog_tls_buffer_flush(void)
 724
 725    Write out any pending log messages that the calling thread may have in its
 726    buffer.  This function is safe to call regardless of the per-thread log
 727    buffer being set up / in use or not.
 728
 729 When working with threads that do not use the :c:struct:`thread_master`
 730 event loop, per-thread buffers can be managed with:
 731
 732 .. c:function:: void zlog_tls_buffer_init(void)
 733
 734    Set up thread-local buffering for log messages.  This function may be
 735    called repeatedly without adverse effects, but remember to call
 736    :c:func:`zlog_tls_buffer_fini()` at thread exit.
 737
 738    .. warning::
 739
 740       If this function is called, but :c:func:`zlog_tls_buffer_flush()` is
 741       not used, log message output will lag behind since messages will only be
 742       written out when the buffer is full.
 743
 744       Exiting the thread without calling :c:func:`zlog_tls_buffer_fini()`
 745       will cause buffered log messages to be lost.
 746
 747 .. c:function:: void zlog_tls_buffer_fini(void)
 748
 749    Flush pending messages and tear down thread-local log message buffering.
 750    This function may be called repeatedly regardless of whether
 751    :c:func:`zlog_tls_buffer_init()` was ever called.
 752
 753 Log targets
 754 -----------
 755
 756 The actual logging subsystem (in :file:`lib/zlog.c`) is heavily separated
 757 from the actual log writers.  It uses an atomic linked-list (`zlog_targets`)
 758 with RCU to maintain the log targets to be called.  This list is intended to
 759 function as "backend" only, it **is not used for configuration**.
 760
 761 Logging targets provide their configuration layer on top of this and maintain
 762 their own capability to enumerate and store their configuration.  Some targets
 763 (e.g. syslog) are inherently single instance and just stuff their config in
 764 global variables.  Others (e.g. file/fd output) are multi-instance capable.
 765 There is another layer boundary here between these and the VTY configuration
 766 that they use.
 767
 768 Basic internals
 769 ^^^^^^^^^^^^^^^
 770
 771 .. c:struct:: zlog_target
 772
 773    This struct needs to be filled in by any log target and then passed to
 774    :c:func:`zlog_target_replace()`.  After it has been registered,
 775    **RCU semantics apply**.  Most changes to associated data should make a
 776    copy, change that, and then replace the entire struct.
 777
 778    Additional per-target data should be "appended" by embedding this struct
 779    into a larger one, for use with `containerof()`, and
 780    :c:func:`zlog_target_clone()` and :c:func:`zlog_target_free()` should be
 781    used to allocate/free the entire container struct.
 782
 783    Do not use this structure to maintain configuration.  It should only
 784    contain (a copy of) the data needed to perform the actual logging.  For
 785    example, the syslog target uses this:
 786
 787    .. code-block:: c
 788
 789       struct zlt_syslog {
 790           struct zlog_target zt;
 791           int syslog_facility;
 792       };
 793
 794       static void zlog_syslog(struct zlog_target *zt, struct zlog_msg *msgs[], size_t nmsgs)
 795       {
 796           struct zlt_syslog *zte = container_of(zt, struct zlt_syslog, zt);
 797           size_t i;
 798
 799           for (i = 0; i < nmsgs; i++)
 800               if (zlog_msg_prio(msgs[i]) <= zt->prio_min)
 801                   syslog(zlog_msg_prio(msgs[i]) | zte->syslog_facility, "%s",
 802                          zlog_msg_text(msgs[i], NULL));
 803       }
 804
 805
 806 .. c:function:: struct zlog_target *zlog_target_clone(struct memtype *mt, struct zlog_target *oldzt, size_t size)
 807
 808    Allocates a logging target struct.  Note that the ``oldzt`` argument may be
 809    ``NULL`` to allocate a "from scratch".  If ``oldzt`` is not ``NULL``, the
 810    generic bits in :c:struct:`zlog_target` are copied.  **Target specific
 811    bits are not copied.**
 812
 813 .. c:function:: struct zlog_target *zlog_target_replace(struct zlog_target *oldzt, struct zlog_target *newzt)
 814
 815    Adds, replaces or deletes a logging target (either ``oldzt`` or ``newzt`` may be ``NULL``.)
 816
 817    Returns ``oldzt`` for freeing.  The target remains possibly in use by
 818    other threads until the RCU cycle ends.  This implies you cannot release
 819    resources (e.g. memory, file descriptors) immediately.
 820
 821    The replace operation is not atomic; for a brief period it is possible that
 822    messages are delivered on both ``oldzt`` and ``newzt``.
 823
 824    .. warning::
 825
 826       ``oldzt`` must remain **functional** until the RCU cycle ends.
 827
 828 .. c:function:: void zlog_target_free(struct memtype *mt, struct zlog_target *zt)
 829
 830    Counterpart to :c:func:`zlog_target_clone()`, frees a target (using RCU.)
 831
 832 .. c:member:: void (*zlog_target.logfn)(struct zlog_target *zt, struct zlog_msg *msgs[], size_t nmsg)
 833
 834    Called on a target to deliver "normal" logging messages.  ``msgs`` is an
 835    array of opaque structs containing the actual message.  Use ``zlog_msg_*``
 836    functions to access message data (this is done to allow some optimizations,
 837    e.g.  lazy formatting the message text and timestamp as needed.)
 838
 839    .. note::
 840
 841       ``logfn()`` must check each individual message's priority value against
 842       the configured ``prio_min``.  While the ``prio_min`` field is common to
 843       all targets and used by the core logging code to early-drop unneeded log
 844       messages, the array is **not** filtered for each ``logfn()`` call.
 845
 846 .. c:member:: void (*zlog_target.logfn_sigsafe)(struct zlog_target *zt, const char *text, size_t len)
 847
 848    Called to deliver "exception" logging messages (i.e. SEGV messages.)
 849    Must be Async-Signal-Safe (may not allocate memory or call "complicated"
 850    libc functions.)  May be ``NULL`` if the log target cannot handle this.
 851
 852 Standard targets
 853 ^^^^^^^^^^^^^^^^
 854
 855 :file:`lib/zlog_targets.c` provides the standard file / fd / syslog targets.
 856 The syslog target is single-instance while file / fd targets can be
 857 instantiated as needed.  There are 3 built-in targets that are fully
 858 autonomous without any config:
 859
 860 - startup logging to `stderr`, until either :c:func:`zlog_startup_end()` or
 861   :c:func:`zlog_aux_init()` is called.
 862 - stdout logging for non-daemon programs using :c:func:`zlog_aux_init()`
 863 - crashlogs written to :file:`/var/tmp/frr.daemon.crashlog`
 864
 865 The regular CLI/command-line logging setup is handled by :file:`lib/log_vty.c`
 866 which makes the appropriate instantiations of syslog / file / fd targets.
 867
 868 .. todo::
 869
 870   :c:func:`zlog_startup_end()` should do an explicit switchover from
 871   startup stderr logging to configured logging.  Currently, configured logging
 872   starts in parallel as soon as the respective setup is executed.  This results
 873   in some duplicate logging.