doc/developer/lists.rst

   1 List implementations
   2 ====================
   3
   4 .. note::
   5
   6    The term *list* is used generically for lists, skiplists, trees and hash
   7    tables in this document.
   8
   9 Common list interface
  10 ---------------------
  11
  12 FRR includes a set of list-like data structure implementations with abstracted
  13 common APIs.  The purpose of this is easily allow swapping out one
  14 data structure for another while also making the code easier to read and write.
  15 There is one API for unsorted lists and a similar but not identical API for
  16 sorted lists - and heaps use a middle ground of both.
  17
  18 For unsorted lists, the following implementations exist:
  19
  20 - single-linked list with tail pointer (e.g. STAILQ in BSD)
  21
  22 - double-linked list
  23
  24 - atomic single-linked list with tail pointer
  25
  26
  27 Being partially sorted, the oddball structure:
  28
  29 - an 8-ary heap
  30
  31
  32 For sorted lists, these data structures are implemented:
  33
  34 - single-linked list
  35
  36 - atomic single-linked list
  37
  38 - skiplist
  39
  40 - red-black tree (based on OpenBSD RB_TREE)
  41
  42 - hash table (note below)
  43
  44 Except for hash tables, each of the sorted data structures has a variant with
  45 unique and non-unique list items.  Hash tables always require unique items
  46 and mostly follow the "sorted" API but use the hash value as sorting
  47 key.  Also, iterating while modifying does not work with hash tables.
  48 Conversely, the heap always has non-unique items, but iterating while modifying
  49 doesn't work either.
  50
  51
  52 The following sorted structures are likely to be implemented at some point
  53 in the future:
  54
  55 - atomic skiplist
  56
  57 - atomic hash table (note below)
  58
  59
  60 The APIs are all designed to be as type-safe as possible.  This means that
  61 there will be a compiler warning when an item doesn't match the list, or
  62 the return value has a different type, or other similar situations.  **You
  63 should never use casts with these APIs.**  If a cast is neccessary in relation
  64 to these APIs, there is probably something wrong with the overall design.
  65
  66 Only the following pieces use dynamically allocated memory:
  67
  68 - the hash table itself is dynamically grown and shrunk
  69
  70 - skiplists store up to 4 next pointers inline but will dynamically allocate
  71   memory to hold an item's 5th up to 16th next pointer (if they exist)
  72
  73 - the heap uses a dynamically grown and shrunk array of items
  74
  75 Cheat sheet
  76 -----------
  77
  78 Available types:
  79
  80 ::
  81
  82    DECLARE_LIST
  83    DECLARE_ATOMLIST
  84    DECLARE_DLIST
  85
  86    DECLARE_HEAP
  87
  88    DECLARE_SORTLIST_UNIQ
  89    DECLARE_SORTLIST_NONUNIQ
  90    DECLARE_ATOMLIST_UNIQ
  91    DECLARE_ATOMLIST_NONUNIQ
  92    DECLARE_SKIPLIST_UNIQ
  93    DECLARE_SKIPLIST_NONUNIQ
  94    DECLARE_RBTREE_UNIQ
  95    DECLARE_RBTREE_NONUNIQ
  96
  97    DECLARE_HASH
  98
  99 Functions provided:
 100
 101 +------------------------------------+------+------+------+---------+------------+
 102 | Function                           | LIST | HEAP | HASH | \*_UNIQ | \*_NONUNIQ |
 103 +====================================+======+======+======+=========+============+
 104 | _init, _fini                       | yes  | yes  | yes  | yes     | yes        |
 105 +------------------------------------+------+------+------+---------+------------+
 106 | _first, _next, _next_safe          | yes  | yes  | yes  | yes     | yes        |
 107 +------------------------------------+------+------+------+---------+------------+
 108 | _add_head, _add_tail, _add_after   | yes  | --   | --   | --      | --         |
 109 +------------------------------------+------+------+------+---------+------------+
 110 | _add                               | --   | yes  | yes  | yes     | yes        |
 111 +------------------------------------+------+------+------+---------+------------+
 112 | _del, _pop                         | yes  | yes  | yes  | yes     | yes        |
 113 +------------------------------------+------+------+------+---------+------------+
 114 | _find                              | --   | --   | yes  | yes     | --         |
 115 +------------------------------------+------+------+------+---------+------------+
 116 | _find_lt, _find_gteq               | --   | --   | --   | yes     | yes        |
 117 +------------------------------------+------+------+------+---------+------------+
 118 | use with for_each() macros         | yes  | yes  | yes  | yes     | yes        |
 119 +------------------------------------+------+------+------+---------+------------+
 120
 121
 122
 123 Datastructure type setup
 124 ------------------------
 125
 126 Each of the data structures has a ``PREDECL_*`` and a ``DECLARE_*`` macro to
 127 set up an "instantiation" of the list.  This works somewhat similar to C++
 128 templating, though much simpler.
 129
 130 **In all following text, the Z prefix is replaced with a name choosen
 131 for the instance of the datastructure.**
 132
 133 The common setup pattern will look like this:
 134
 135 .. code-block:: c
 136
 137    #include <typesafe.h>
 138
 139    PREDECL_XXX(Z)
 140    struct item {
 141        int otherdata;
 142        struct Z_item mylistitem;
 143    }
 144
 145    struct Z_head mylisthead;
 146
 147    /* unsorted: */
 148    DECLARE_XXX(Z, struct item, mylistitem)
 149
 150    /* sorted, items that compare as equal cannot be added to list */
 151    int compare_func(const struct item *a, const struct item *b);
 152    DECLARE_XXX_UNIQ(Z, struct item, mylistitem, compare_func)
 153
 154    /* sorted, items that compare as equal can be added to list */
 155    int compare_func(const struct item *a, const struct item *b);
 156    DECLARE_XXX_NONUNIQ(Z, struct item, mylistitem, compare_func)
 157
 158    /* hash tables: */
 159    int compare_func(const struct item *a, const struct item *b);
 160    uint32_t hash_func(const struct item *a);
 161    DECLARE_XXX(Z, struct item, mylistitem, compare_func, hash_func)
 162
 163 ``XXX`` is replaced with the name of the data structure, e.g. ``SKIPLIST``
 164 or ``ATOMLIST``.  The ``DECLARE_XXX`` invocation can either occur in a `.h`
 165 file (if the list needs to be accessed from several C files) or it can be
 166 placed in a `.c` file (if the list is only accessed from that file.)  The
 167 ``PREDECL_XXX`` invocation defines the ``struct Z_item`` and ``struct
 168 Z_head`` types and must therefore occur before these are used.
 169
 170 To switch between compatible data structures, only these two lines need to be
 171 changes.  To switch to a data structure with a different API, some source
 172 changes are necessary.
 173
 174 Common iteration macros
 175 -----------------------
 176
 177 The following iteration macros work across all data structures:
 178
 179 .. c:function:: for_each(Z, &head, item)
 180
 181    Equivalent to:
 182
 183    .. code-block:: c
 184
 185       for (item = Z_first(&head); item; item = Z_next(&head, item))
 186
 187    Note that this will fail if the list is modified while being iterated
 188    over.
 189
 190 .. c:function:: for_each_safe(Z, &head, item)
 191
 192    Same as the previous, but the next element is pre-loaded into a "hidden"
 193    variable (named ``Z_safe``.)  Equivalent to:
 194
 195    .. code-block:: c
 196
 197       for (item = Z_first(&head); item; item = next) {
 198           next = Z_next_safe(&head, item);
 199           ...
 200       }
 201
 202    .. warning::
 203
 204       Iterating over hash tables while adding or removing items is not
 205       possible.  The iteration position will be corrupted when the hash
 206       tables is resized while iterating.  This will cause items to be
 207       skipped or iterated over twice.
 208
 209 .. c:function:: for_each_from(Z, &head, item, from)
 210
 211    Iterates over the list, starting at item ``from``.  This variant is "safe"
 212    as in the previous macro.  Equivalent to:
 213
 214    .. code-block:: c
 215
 216       for (item = from; item; item = from) {
 217           from = Z_next_safe(&head, item);
 218           ...
 219       }
 220
 221    .. note::
 222
 223       The ``from`` variable is written to.  This is intentional - you can
 224       resume iteration after breaking out of the loop by keeping the ``from``
 225       value persistent and reusing it for the next loop.
 226
 227 Common API
 228 ----------
 229
 230 The following documentation assumes that a list has been defined using
 231 ``Z`` as the name, and ``itemtype`` being the type of the list items (e.g.
 232 ``struct item``.)
 233
 234 .. c:function:: void Z_init(struct Z_head *)
 235
 236    Initializes the list for use.  For most implementations, this just sets
 237    some values.  Hash tables are the only implementation that allocates
 238    memory in this call.
 239
 240 .. c:function:: void Z_fini(struct Z_head *)
 241
 242    Reverse the effects of :c:func:`Z_init()`.  The list must be empty
 243    when this function is called.
 244
 245    .. warning::
 246
 247       This function may ``assert()`` if the list is not empty.
 248
 249 .. c:function:: size_t Z_count(struct Z_head *)
 250
 251    Returns the number of items in a structure.  All structures store a
 252    counter in their `Z_head` so that calling this function completes
 253    in O(1).
 254
 255    .. note::
 256
 257       For atomic lists with concurrent access, the value will already be
 258       outdated by the time this function returns and can therefore only be
 259       used as an estimate.
 260
 261 .. c:function:: itemtype *Z_first(struct Z_head *)
 262
 263    Returns the first item in the structure, or ``NULL`` if the structure is
 264    empty.  This is O(1) for all data structures except red-black trees
 265    where it is O(log n).
 266
 267 .. c:function:: itemtype *Z_pop(struct Z_head *)
 268
 269    Remove and return the first item in the structure, or ``NULL`` if the
 270    structure is empty.  Like :c:func:`Z_first`, this is O(1) for all
 271    data structures except red-black trees where it is O(log n) again.
 272
 273    This function can be used to build queues (with unsorted structures) or
 274    priority queues (with sorted structures.)
 275
 276    Another common pattern is deleting all list items:
 277
 278    .. code-block:: c
 279
 280       while ((item = Z_pop(head)))
 281           item_free(item);
 282
 283    .. note::
 284
 285       This function can - and should - be used with hash tables.  It is not
 286       affected by the "modification while iterating" problem.  To remove
 287       all items from a hash table, use the loop demonstrated above.
 288
 289 .. c:function:: itemtype *Z_next(struct Z_head *, itemtype *prev)
 290
 291    Return the item that follows after ``prev``, or ``NULL`` if ``prev`` is
 292    the last item.
 293
 294    .. warning::
 295
 296       ``prev`` must not be ``NULL``!  Use :c:func:`Z_next_safe()` if
 297       ``prev`` might be ``NULL``.
 298
 299 .. c:function:: itemtype *Z_next_safe(struct Z_head *, itemtype *prev)
 300
 301    Same as :c:func:`Z_next()`, except that ``NULL`` is returned if
 302    ``prev`` is ``NULL``.
 303
 304 .. c:function:: itemtype *Z_del(struct Z_head *, itemtype *item)
 305
 306    Remove ``item`` from the list and return it.
 307
 308    .. note::
 309
 310       This function's behaviour is undefined if ``item`` is not actually
 311       on the list.  Some structures return ``NULL`` in this case while others
 312       return ``item``.  The function may also call ``assert()`` (but most
 313       don't.)
 314
 315 .. todo::
 316
 317    ``Z_del_after()`` / ``Z_del_hint()``?
 318
 319 API for unsorted structures
 320 ---------------------------
 321
 322 Since the insertion position is not pre-defined for unsorted data, there
 323 are several functions exposed to insert data:
 324
 325 .. note::
 326
 327    ``item`` must not be ``NULL`` for any of the following functions.
 328
 329 .. c:function:: DECLARE_XXX(Z, type, field)
 330
 331    :param listtype XXX: ``LIST``, ``DLIST`` or ``ATOMLIST`` to select a data
 332       structure implementation.
 333    :param token Z: Gives the name prefix that is used for the functions
 334       created for this instantiation.  ``DECLARE_XXX(foo, ...)``
 335       gives ``struct foo_item``, ``foo_add_head()``, ``foo_count()``, etc.  Note
 336       that this must match the value given in ``PREDECL_XXX(foo)``.
 337    :param typename type: Specifies the data type of the list items, e.g.
 338       ``struct item``.  Note that ``struct`` must be added here, it is not
 339       automatically added.
 340    :param token field: References a struct member of ``type`` that must be
 341       typed as ``struct foo_item``.  This struct member is used to
 342       store "next" pointers or other data structure specific data.
 343
 344 .. c:function:: void Z_add_head(struct Z_head *, itemtype *item)
 345
 346    Insert an item at the beginning of the structure, before the first item.
 347    This is an O(1) operation for non-atomic lists.
 348
 349 .. c:function:: void Z_add_tail(struct Z_head *, itemtype *item)
 350
 351    Insert an item at the end of the structure, after the last item.
 352    This is also an O(1) operation for non-atomic lists.
 353
 354 .. c:function:: void Z_add_after(struct Z_head *, itemtype *after, itemtype *item)
 355
 356    Insert ``item`` behind ``after``. If ``after`` is ``NULL``, the item is
 357    inserted at the beginning of the list as with :c:func:`Z_add_head`.
 358    This is also an O(1) operation for non-atomic lists.
 359
 360    A common pattern is to keep a "previous" pointer around while iterating:
 361
 362    .. code-block:: c
 363
 364       itemtype *prev = NULL, *item;
 365
 366       for_each_safe(Z, head, item) {
 367           if (something) {
 368               Z_add_after(head, prev, item);
 369               break;
 370           }
 371           prev = item;
 372       }
 373
 374    .. todo::
 375
 376       maybe flip the order of ``item`` & ``after``?
 377       ``Z_add_after(head, item, after)``
 378
 379 API for sorted structures
 380 -------------------------
 381
 382 Sorted data structures do not need to have an insertion position specified,
 383 therefore the insertion calls are different from unsorted lists.  Also,
 384 sorted lists can be searched for a value.
 385
 386 .. c:function:: DECLARE_XXX_UNIQ(Z, type, field, compare_func)
 387
 388    :param listtype XXX: One of the following:
 389        ``SORTLIST`` (single-linked sorted list), ``SKIPLIST`` (skiplist),
 390        ``RBTREE`` (RB-tree) or ``ATOMSORT`` (atomic single-linked list).
 391    :param token Z: Gives the name prefix that is used for the functions
 392       created for this instantiation.  ``DECLARE_XXX(foo, ...)``
 393       gives ``struct foo_item``, ``foo_add()``, ``foo_count()``, etc.  Note
 394       that this must match the value given in ``PREDECL_XXX(foo)``.
 395    :param typename type: Specifies the data type of the list items, e.g.
 396       ``struct item``.  Note that ``struct`` must be added here, it is not
 397       automatically added.
 398    :param token field: References a struct member of ``type`` that must be
 399       typed as ``struct foo_item``.  This struct member is used to
 400       store "next" pointers or other data structure specific data.
 401    :param funcptr compare_func: Item comparison function, must have the
 402       following function signature:
 403       ``int function(const itemtype *, const itemtype*)``.  This function
 404       may be static if the list is only used in one file.
 405
 406 .. c:function:: DECLARE_XXX_NONUNIQ(Z, type, field, compare_func)
 407
 408    Same as above, but allow adding multiple items to the list that compare
 409    as equal in ``compare_func``.  Ordering between these items is undefined
 410    and depends on the list implementation.
 411
 412 .. c:function:: itemtype *Z_add(struct Z_head *, itemtype *item)
 413
 414    Insert an item at the appropriate sorted position.  If another item exists
 415    in the list that compares as equal (``compare_func()`` == 0), ``item`` is
 416    not inserted into the list and the already-existing item in the list is
 417    returned.  Otherwise, on successful insertion, ``NULL`` is returned.
 418
 419    For ``_NONUNIQ`` lists, this function always returns NULL since ``item``
 420    can always be successfully added to the list.
 421
 422 .. c:function:: itemtype *Z_find(struct Z_head *, const itemtype *ref)
 423
 424    Search the list for an item that compares equal to ``ref``.  If no equal
 425    item is found, return ``NULL``.
 426
 427    This function is likely used with a temporary stack-allocated value for
 428    ``ref`` like so:
 429
 430    .. code-block:: c
 431
 432       itemtype searchfor = { .foo = 123 };
 433
 434       itemtype *item = Z_find(head, &searchfor);
 435
 436    .. note::
 437
 438       The ``Z_find()`` function is only available for lists that contain
 439       unique items (i.e. ``DECLARE_XXX_UNIQ``.)  This is because on a list
 440       containing non-unique items, more than one item may compare as equal to
 441       the item that is searched for.
 442
 443 .. c:function:: itemtype *Z_find_gteq(struct Z_head *, const itemtype *ref)
 444
 445    Search the list for an item that compares greater or equal to
 446    ``ref``.  See :c:func:`Z_find()` above.
 447
 448 .. c:function:: itemtype *Z_find_lt(struct Z_head *, const itemtype *ref)
 449
 450    Search the list for an item that compares less than
 451    ``ref``.  See :c:func:`Z_find()` above.
 452
 453
 454 API for hash tables
 455 -------------------
 456
 457 .. c:function:: DECLARE_XXX(Z, type, field, compare_func, hash_func)
 458
 459    :param listtype XXX: Only ``HASH`` is currently available.
 460    :param token Z: Gives the name prefix that is used for the functions
 461       created for this instantiation.  ``DECLARE_XXX(foo, ...)``
 462       gives ``struct foo_item``, ``foo_add()``, ``foo_count()``, etc.  Note
 463       that this must match the value given in ``PREDECL_XXX(foo)``.
 464    :param typename type: Specifies the data type of the list items, e.g.
 465       ``struct item``.  Note that ``struct`` must be added here, it is not
 466       automatically added.
 467    :param token field: References a struct member of ``type`` that must be
 468       typed as ``struct foo_item``.  This struct member is used to
 469       store "next" pointers or other data structure specific data.
 470    :param funcptr compare_func: Item comparison function, must have the
 471       following function signature:
 472       ``int function(const itemtype *, const itemtype*)``.  This function
 473       may be static if the list is only used in one file.  For hash tables,
 474       this function is only used to check for equality, the ordering is
 475       ignored.
 476    :param funcptr hash_func: Hash calculation function, must have the
 477       following function signature:
 478       ``uint32_t function(const itemtype *)``.  The hash value for items
 479       stored in a hash table is cached in each item, so this value need not
 480       be cached by the user code.
 481
 482    .. warning::
 483
 484       Items that compare as equal cannot be inserted.  Refer to the notes
 485       about sorted structures in the previous section.
 486
 487 .. c:function:: void Z_init_size(struct Z_head *, size_t size)
 488
 489    Same as :c:func:`Z_init()` but preset the minimum hash table to
 490    ``size``.
 491
 492 Hash tables also support :c:func:`Z_add()` and :c:func:`Z_find()` with
 493 the same semantics as noted above. :c:func:`Z_find_gteq()` and
 494 :c:func:`Z_find_lt()` are **not** provided for hash tables.
 495
 496
 497 API for heaps
 498 -------------
 499
 500 Heaps provide the same API as the sorted data structures, except:
 501
 502 * none of the find functions (:c:func:`Z_find()`, :c:func:`Z_find_gteq()`
 503   or :c:func:`Z_find_lt()`) are available.
 504 * iterating over the heap yields the items in semi-random order, only the
 505   first item is guaranteed to be in order and actually the "lowest" item
 506   on the heap.  Being a heap, only the rebalancing performed on removing the
 507   first item (either through :c:func:`Z_pop()` or :c:func:`Z_del()`) causes
 508   the new lowest item to bubble up to the front.
 509 * all heap modifications are O(log n).  However, cacheline efficiency and
 510   latency is likely quite a bit better than with other data structures.
 511
 512 Atomic lists
 513 ------------
 514
 515 `atomlist.h` provides an unsorted and a sorted atomic single-linked list.
 516 Since atomic memory accesses can be considerably slower than plain memory
 517 accessses (depending on the CPU type), these lists should only be used where
 518 neccessary.
 519
 520 The following guarantees are provided regarding concurrent access:
 521
 522 - the operations are lock-free but not wait-free.
 523
 524   Lock-free means that it is impossible for all threads to be blocked.  Some
 525   thread will always make progress, regardless of what other threads do.  (This
 526   even includes a random thread being stopped by a debugger in a random
 527   location.)
 528
 529   Wait-free implies that the time any single thread might spend in one of the
 530   calls is bounded.  This is not provided here since it is not normally
 531   relevant to practical operations.  What this means is that if some thread is
 532   hammering a particular list with requests, it is possible that another
 533   thread is blocked for an extended time.  The lock-free guarantee still
 534   applies since the hammering thread is making progress.
 535
 536 - without a RCU mechanism in place, the point of contention for atomic lists
 537   is memory deallocation.  As it is, **a rwlock is required for correct
 538   operation**.  The *read* lock must be held for all accesses, including
 539   reading the list, adding items to the list, and removing items from the
 540   list.  The *write* lock must be acquired and released before deallocating
 541   any list element.  If this is not followed, an use-after-free can occur
 542   as a MT race condition when an element gets deallocated while another
 543   thread is accessing the list.
 544
 545   .. note::
 546
 547      The *write* lock does not need to be held for deleting items from the
 548      list, and there should not be any instructions between the
 549      ``pthread_rwlock_wrlock`` and ``pthread_rwlock_unlock``.  The write lock
 550      is used as a sequence point, not as an exclusion mechanism.
 551
 552 - insertion operations are always safe to do with the read lock held.
 553   Added items are immediately visible after the insertion call returns and
 554   should not be touched anymore.
 555
 556 - when removing a *particular* (pre-determined) item, the caller must ensure
 557   that no other thread is attempting to remove that same item.  If this cannot
 558   be guaranteed by architecture, a separate lock might need to be added.
 559
 560 - concurrent `pop` calls are always safe to do with only the read lock held.
 561   This does not fall under the previous rule since the `pop` call will select
 562   the next item if the first is already being removed by another thread.
 563
 564   **Deallocation locking still applies.**  Assume another thread starts
 565   reading the list, but gets task-switched by the kernel while reading the
 566   first item.  `pop` will happily remove and return that item.  If it is
 567   deallocated without acquiring and releasing the write lock, the other thread
 568   will later resume execution and try to access the now-deleted element.
 569
 570 - the list count should be considered an estimate.  Since there might be
 571   concurrent insertions or removals in progress, it might already be outdated
 572   by the time the call returns.  No attempt is made to have it be correct even
 573   for a nanosecond.
 574
 575 Overall, atomic lists are well-suited for MT queues; concurrent insertion,
 576 iteration and removal operations will work with the read lock held.
 577
 578 Code snippets
 579 ^^^^^^^^^^^^^
 580
 581 Iteration:
 582
 583 .. code-block:: c
 584
 585    struct item *i;
 586
 587    pthread_rwlock_rdlock(&itemhead_rwlock);
 588    for_each(itemlist, &itemhead, i) {
 589      /* lock must remain held while iterating */
 590      ...
 591    }
 592    pthread_rwlock_unlock(&itemhead_rwlock);
 593
 594 Head removal (pop) and deallocation:
 595
 596 .. code-block:: c
 597
 598    struct item *i;
 599
 600    pthread_rwlock_rdlock(&itemhead_rwlock);
 601    i = itemlist_pop(&itemhead);
 602    pthread_rwlock_unlock(&itemhead_rwlock);
 603
 604    /* i might still be visible for another thread doing an
 605     * for_each() (but won't be returned by another pop()) */
 606    ...
 607
 608    pthread_rwlock_wrlock(&itemhead_rwlock);
 609    pthread_rwlock_unlock(&itemhead_rwlock);
 610    /* i now guaranteed to be gone from the list.
 611     * note nothing between wrlock() and unlock() */
 612    XFREE(MTYPE_ITEM, i);
 613
 614 FRR lists
 615 ---------
 616
 617 .. TODO::
 618
 619    document
 620
 621 BSD lists
 622 ---------
 623
 624 .. TODO::
 625
 626    refer to external docs