.. Copyright (C) 2004-2008 The Trustees of Indiana University.
   Use, modification and distribution is subject to the Boost Software
   License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
   http://www.boost.org/LICENSE_1_0.txt)

===========================
|Logo| Breadth-First Search
===========================

::

  // named parameter version
  template <class Graph, class P, class T, class R>
  void breadth_first_search(Graph& G,
    typename graph_traits<Graph>::vertex_descriptor s,
    const bgl_named_params<P, T, R>& params);

  // non-named parameter version
  template <class Graph, class Buffer, class BFSVisitor,
            class ColorMap>
  void breadth_first_search(const Graph& g,
    typename graph_traits<Graph>::vertex_descriptor s,
    Buffer& Q, BFSVisitor vis, ColorMap color);

The ``breadth_first_search()`` function performs a distributed breadth-first
traversal of a directed or undirected graph. The distributed BFS is
syntactically equivalent to its `sequential counterpart`_, which
provides additional background and discussion. Differences in
semantics are highlighted here, and we refer the reader to the
documentation of the `sequential breadth-first search`_ for the
remainder of the details.

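For concreteness, the following is a minimal calling sketch, not taken
from the original documentation: it assumes MPI via the Parallel BGL's
``mpi_process_group`` and a distributed ``adjacency_list``, and the tiny
graph built here is purely illustrative::

  #include <boost/graph/use_mpi.hpp>
  #include <boost/mpi/environment.hpp>
  #include <boost/graph/distributed/mpi_process_group.hpp>
  #include <boost/graph/distributed/adjacency_list.hpp>
  #include <boost/graph/breadth_first_search.hpp>

  using namespace boost;
  using boost::graph::distributed::mpi_process_group;

  typedef adjacency_list<vecS,
                         distributedS<mpi_process_group, vecS>,
                         undirectedS> Graph;

  int main(int argc, char* argv[])
  {
    mpi::environment env(argc, argv);

    // Each process holds a portion of this small 8-vertex graph.
    Graph g(8);
    add_edge(vertex(0, g), vertex(1, g), g);
    add_edge(vertex(1, g), vertex(2, g), g);

    // The start vertex must be identical on every process; the default
    // visitor, color map, and distributed queue are used here.
    breadth_first_search(g, vertex(0, g), visitor(bfs_visitor<>()));
    return 0;
  }
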
This distributed breadth-first search algorithm implements a
*level-synchronized* breadth-first search, meaning that all vertices
in a given level of the BFS tree will be processed (potentially in
parallel) before any vertices from a successive level in the tree are
processed. Distributed breadth-first search visitors should account
for this behavior, a topic discussed further in `Visitor Event
Points`_.

.. contents::

Where Defined
-------------
<``boost/graph/breadth_first_search.hpp``>

Parameter Defaults
------------------
All parameters of the `sequential breadth-first search`_ are supported
and have essentially the same meaning. Only differences are documented
here.

IN: ``Graph& g``
  The graph type must be a model of `Distributed Graph`_.

IN: ``vertex_descriptor s``
  The start vertex must be the same in every process.

IN: ``visitor(BFSVisitor vis)``
  The visitor must be a distributed BFS visitor. The subtle differences
  between sequential and distributed BFS visitors are discussed in the
  section `Visitor Event Points`_.

UTIL/OUT: ``color_map(ColorMap color)``
  The color map must be a `Distributed Property Map`_ with the same
  process group as the graph ``g``, and its colors must monotonically
  darken (white -> gray -> black). The default value is a distributed
  ``iterator_property_map`` created from a ``std::vector`` of
  ``default_color_type``.

UTIL: ``buffer(Buffer& Q)``
  The queue must be a distributed queue that passes vertices to their
  owning process. If already-visited vertices should not be visited
  again (as is typical for BFS), the queue must filter duplicates
  itself. The queue controls synchronization within the algorithm: its
  ``empty()`` method must not return ``true`` until all local queues
  are empty.

  **Default:** A ``distributed_queue`` of a ``filtered_queue`` over a
  standard ``boost::queue``. This queue filters out duplicate
  vertices and distributes vertices appropriately.

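As a hedged illustration (not from the original text), the color map
default could be overridden via the named parameter, with ``g`` and
``start`` assumed from the surrounding program; a map built over the
graph's ``vertex_index`` is typically distributed automatically for a
distributed graph::

  // Sketch: supply an explicit color map rather than relying on the
  // default vector of default_color_type.
  std::vector<default_color_type> colors(num_vertices(g));
  breadth_first_search(g, start,
      color_map(make_iterator_property_map(colors.begin(),
                                           get(vertex_index, g))));
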
Complexity
----------
This algorithm performs *O(V + E)* work in *d + 1* BSP supersteps,
where *d* is the diameter of the connected component being
searched. Over all supersteps, *O(E)* messages of constant size will
be transmitted.

Visitor Event Points
--------------------
The `BFS Visitor`_ concept defines nine event points that will be
triggered by the `sequential breadth-first search`_. The distributed
BFS retains these nine event points, but the sequence of events
triggered and the process in which each event occurs will change
depending on the distribution of the graph.

``initialize_vertex(s, g)``
  This will be invoked by every process for each local vertex.

``discover_vertex(u, g)``
  This will be invoked each time a process discovers a new vertex
  ``u``. Due to incomplete information in distributed property maps,
  this event may be triggered many times for the same vertex ``u``.

``examine_vertex(u, g)``
  This will be invoked by the process owning the vertex ``u``. If the
  distributed queue prevents duplicates, it will be invoked only
  once for a particular vertex ``u``.

``examine_edge(e, g)``
  This will be invoked by the process owning the source vertex of
  ``e``. If the distributed queue prevents duplicates, it will be
  invoked only once for a particular edge ``e``.

``tree_edge(e, g)``
  Similar to ``examine_edge``, this will be invoked by the process
  owning the source vertex and may be invoked only once. Unlike the
  sequential BFS, this event may be triggered even when the target has
  already been discovered (but by a different process). Thus, some
  ``non_tree_edge`` events in a sequential BFS may become
  ``tree_edge`` in a distributed BFS.

``non_tree_edge(e, g)``
  Some ``non_tree_edge`` events in a sequential BFS may become
  ``tree_edge`` events in a distributed BFS. See the description of
  ``tree_edge`` for additional details.

``gray_target(e, g)``
  As with ``tree_edge``, a process may not know that another process
  has already discovered a vertex. Consequently, ``gray_target`` events
  may occur in a distributed BFS where ``black_target`` events would
  occur in a sequential BFS, due to this lack of information on a given
  processor. The source of edge ``e`` will be local to the process
  executing this event.

``black_target(e, g)``
  See the documentation for ``gray_target``.

``finish_vertex(u, g)``
  See the documentation for ``examine_vertex``.

Making Visitors Safe for Distributed BFS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The three most important things to remember when updating an existing
BFS visitor for distributed BFS or writing a new distributed BFS
visitor are:

1. Be sure that all state is either entirely local or in a
   distributed data structure (most likely a property map!) using
   the same process group as the graph.

2. Be sure that the visitor doesn't require precise event sequences
   that cannot be guaranteed by distributed BFS, e.g., requiring
   ``tree_edge`` and ``non_tree_edge`` events to be completely
   distinct.

3. Be sure that the visitor can operate on incomplete
   information. This often includes using an appropriate reduction
   operation in a `distributed property map`_ and verifying that
   results written are "better" than what was previously written;
   one such reduction is sketched after this list.

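As a sketch of such a reduction (the reducer name and the value choices
are assumptions, not from the original text), a distance map can resolve
conflicting writes to the minimum, following the reduction-operation
shape used by distributed property maps::

  // Hypothetical reducer: unwritten vertices default to "infinity" and
  // conflicting local/remote values resolve to the smaller distance.
  // Requires <limits> and <algorithm>.
  struct distance_min_reducer
  {
    static const bool non_default_resolver = true;

    template<typename K>
    std::size_t operator()(const K&) const
    { return (std::numeric_limits<std::size_t>::max)(); }

    template<typename K>
    std::size_t operator()(const K&, std::size_t local,
                           std::size_t remote) const
    { return (std::min)(local, remote); }
  };

  // Attach the reducer to a distributed property map named "distance":
  distance.set_reduce(distance_min_reducer());
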
Distributed BFS Visitor Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To illustrate the differences between sequential and distributed BFS
visitors, we consider a BFS visitor that places the distance from the
source vertex to every other vertex in a property map. The sequential
visitor is very simple::

  template<typename DistanceMap>
  struct bfs_discovery_visitor : bfs_visitor<>
  {
    bfs_discovery_visitor(DistanceMap distance) : distance(distance) {}

    template<typename Edge, typename Graph>
    void tree_edge(Edge e, const Graph& g)
    {
      std::size_t new_distance = get(distance, source(e, g)) + 1;
      put(distance, target(e, g), new_distance);
    }

  private:
    DistanceMap distance;
  };

To revisit this code for distributed BFS, we consider the three points
in the section `Making Visitors Safe for Distributed BFS`_:

1. The distance map will need to become distributed, because the
   distance to each vertex should be stored in the process owning the
   vertex. This is actually a requirement on the user to provide such
   a distributed property map, although in many cases the property map
   will automatically be distributed and no syntactic changes will be
   required.

2. This visitor *does* require a precise sequence of events that may
   change with a distributed BFS. The extraneous ``tree_edge`` events
   that may occur could result in attempts to put distances into the
   distance map multiple times for a single vertex. We therefore need
   to consider bullet #3.

3. Since multiple distance values may be written for each vertex, we
   must always choose the best value we can find to update the
   distance map. The distributed property map ``distance`` needs to
   resolve distances to the smallest distance it has seen. For
   instance, process 0 may find vertex ``u`` at level 3 but process 1
   finds it at level 5: the distance must remain at 3. To do this, we
   state that the property map's *role* is as a distance map, which
   introduces an appropriate reduction operation::

     set_property_map_role(vertex_distance, distance);

The resulting distributed BFS visitor (which also applies, with no
changes, in the sequential BFS) is very similar to our original
sequential BFS visitor. Note the single-line difference in the
constructor::

  template<typename DistanceMap>
  struct bfs_discovery_visitor : bfs_visitor<>
  {
    bfs_discovery_visitor(DistanceMap distance) : distance(distance)
    {
      // The only change: declare the map's role so that concurrent
      // writes resolve to the smallest distance.
      set_property_map_role(vertex_distance, distance);
    }

    template<typename Edge, typename Graph>
    void tree_edge(Edge e, const Graph& g)
    {
      std::size_t new_distance = get(distance, source(e, g)) + 1;
      put(distance, target(e, g), new_distance);
    }

  private:
    DistanceMap distance;
  };

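Wiring this together is then a short sketch (the storage, the ``Graph``
type, and the ``start`` vertex are assumptions for illustration);
because the map is built from the graph's index map, it is distributed
automatically, per point 1 above::

  // Sketch: build a distance map over the vertices and run the
  // distributed BFS with the visitor defined above.
  std::vector<std::size_t> distance_storage(num_vertices(g), 0);
  typedef iterator_property_map<std::vector<std::size_t>::iterator,
                                property_map<Graph, vertex_index_t>::type>
    DistanceMap;
  DistanceMap distance(distance_storage.begin(), get(vertex_index, g));

  breadth_first_search(g, start,
                       visitor(bfs_discovery_visitor<DistanceMap>(distance)));
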

Performance
-----------
The performance of Breadth-First Search is illustrated by the
following charts. Scaling and performance are reasonable for all of the
graphs we have tried.

.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=4
   :align: left
.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=4&speedup=1

.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=4
   :align: left
.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=4&speedup=1

-----------------------------------------------------------------------------

Copyright (C) 2004 The Trustees of Indiana University.

Authors: Douglas Gregor and Andrew Lumsdaine

.. |Logo| image:: pbgl-logo.png
   :align: middle
   :alt: Parallel BGL
   :target: http://www.osl.iu.edu/research/pbgl

.. _sequential counterpart: http://www.boost.org/libs/graph/doc/breadth_first_search.html
.. _sequential breadth-first search: http://www.boost.org/libs/graph/doc/breadth_first_search.html
.. _Distributed Graph: DistributedGraph.html
.. _Distributed Property Map: distributed_property_map.html
.. _BFS Visitor: http://www.boost.org/libs/graph/doc/BFSVisitor.html