ceph/src/boost/libs/graph_parallel/doc/dehne_gotz_min_spanning_tree.rst

   1 .. Copyright (C) 2004-2008 The Trustees of Indiana University.
   2    Use, modification and distribution is subject to the Boost Software
   3    License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
   4    http://www.boost.org/LICENSE_1_0.txt)
   5
   6 ============================
   7 |Logo| Minimum Spanning Tree
   8 ============================
   9
  10 The Parallel BGL contains four `minimum spanning tree`_ (MST)
  11 algorithms [DG98]_ for use on undirected, weighted, distributed
  12 graphs. The graphs need not be connected: each algorithm will compute
  13 a minimum spanning forest (MSF) when provided with a disconnected
  14 graph.
  15
  16 The interface to each of the four algorithms is similar to the
  17 implementation of 'Kruskal's algorithm'_ in the sequential BGL. Each
  18 accepts, at a minimum, a graph, a weight map, and an output
  19 iterator. The edges of the MST (or MSF) will be output via the output
  20 iterator on process 0: other processes may receive edges on their
  21 output iterators, but the set may not be complete, depending on the
  22 algorithm. The algorithm parameters are documented together, because
  23 they do not vary greatly. See the section `Selecting an MST
  24 algorithm`_ for advice on algorithm selection.
  25
  26 The graph itself must model the `Vertex List Graph`_ concept and the
  27 Distributed Edge List Graph concept. Since the most common
  28 distributed graph structure, the `distributed adjacency list`_, only
  29 models the Distributed Vertex List Graph concept, it may only be used
  30 with these algorithms when wrapped in a suitable adaptor, such as the
  31 `vertex_list_adaptor`_.
  32
  33 .. contents::
  34
  35 Where Defined
  36 -------------
  37 <``boost/graph/distributed/dehne_gotz_min_spanning_tree.hpp``>
  38
  39 Parameters
  40 ----------
  41
  42 IN: ``Graph& g``
  43   The graph type must be a model of `Vertex List Graph`_ and
  44   `Distributed Edge List Graph`_.
  45
  46
  47
  48 IN/OUT: ``WeightMap weight``
  49   The weight map must be a `Distributed Property Map`_ and a `Readable
  50   Property Map`_ whose key type is the edge descriptor of the graph
  51   and whose value type is numerical.
  52
  53
  54
  55 IN/OUT: ``OutputIterator out``
  56   The output iterator through which the edges of the MSF will be
  57   written. Must be capable of accepting edge descriptors for output.
  58
  59
  60
  61
  62 IN: ``VertexIndexMap index``
  63   A mapping from vertex descriptors to indices in the range *[0,
  64   num_vertices(g))*. This must be a `Readable Property Map`_ whose
  65   key type is a vertex descriptor and whose value type is an integral
  66   type, typically the ``vertices_size_type`` of the graph.
  67
  68   **Default:** ``get(vertex_index, g)``
  69
  70
  71 IN/UTIL: ``RankMap rank_map``
  72   Stores the rank of each vertex, which is used for maintaining
  73   union-find data structures. This must be a `Read/Write Property Map`_
  74   whose key type is a vertex descriptor and whose value type is an
  75   integral type.
  76
  77   **Default:** An ``iterator_property_map`` built from an STL vector
  78   of the ``vertices_size_type`` of the graph and the vertex index map.
  79
  80
  81 IN/UTIL: ``ParentMap parent_map``
  82   Stores the parent (representative) of each vertex, which is used for
  83   maintaining union-find data structures. This must be a `Read/Write
  84   Property Map`_ whose key type is a vertex descriptor and whose value
  85   type is also a vertex descriptor.
  86
  87   **Default:** An ``iterator_property_map`` built from an STL vector
  88   of the ``vertex_descriptor`` of the graph and the vertex index map.
  89
  90
  91 IN/UTIL: ``SupervertexMap supervertex_map``
  92   Stores the supervertex index of each vertex, which is used for
  93   maintaining the supervertex list data structures. This must be a
  94   `Read/Write Property Map`_ whose key type is a vertex descriptor and
  95   whose value type is an integral type.
  96
  97   **Default:** An ``iterator_property_map`` built from an STL vector
  98   of the ``vertices_size_type`` of the graph and the vertex index map.
  99
 100
 101
 102 ``dense_boruvka_minimum_spanning_tree``
 103 ---------------------------------------
 104
 105 ::
 106
 107   namespace graph {
 108     template<typename Graph, typename WeightMap, typename OutputIterator,
 109              typename VertexIndexMap, typename RankMap, typename ParentMap,
 110              typename SupervertexMap>
 111     OutputIterator
 112     dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map,
 113                                       OutputIterator out,
 114                                       VertexIndexMap index,
 115                                       RankMap rank_map, ParentMap parent_map,
 116                                       SupervertexMap supervertex_map);
 117
 118     template<typename Graph, typename WeightMap, typename OutputIterator,
 119              typename VertexIndex>
 120     OutputIterator
 121     dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map,
 122                                       OutputIterator out, VertexIndex index);
 123
 124     template<typename Graph, typename WeightMap, typename OutputIterator>
 125     OutputIterator
 126     dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map,
 127                                       OutputIterator out);
 128   }
 129
 130 Description
 131 ~~~~~~~~~~~
 132
 133 The dense Boruvka distributed minimum spanning tree algorithm is a
 134 direct parallelization of the sequential MST algorithm by
 135 Boruvka. The algorithm first creates a *supervertex* out of each
 136 vertex. Then, in each iteration, it finds the smallest-weight edge
 137 incident to each vertex, collapses supervertices along these edges,
 138 and removals all self loops. The only difference between the
 139 sequential and parallel algorithms is that the parallel algorithm
 140 performs an all-reduce operation so that all processes have the
 141 global minimum set of edges.
 142
 143 Unlike the other three algorithms, this algorithm emits the complete
 144 list of edges in the minimum spanning forest via the output iterator
 145 on all processes. It may therefore be more useful than the others
 146 when parallelizing sequential BGL programs.
 147
 148 Complexity
 149 ~~~~~~~~~~
 150
 151 The distributed algorithm requires *O(log n)* BSP supersteps, each of
 152 which requires *O(m/p + n)* time and *O(n)* communication per
 153 process.
 154
 155 Performance
 156 ~~~~~~~~~~~
 157
 158 The following charts illustrate the performance of this algorithm on
 159 various random graphs. We see that the algorithm scales well up to 64
 160 or 128 processors, depending on the type of graph, for dense
 161 graphs. However, for sparse graphs performance tapers off as the
 162 number of processors surpases *m/n*, i.e., the average degree (which
 163 is 30 for this graph). This behavior is expected from the algorithm.
 164
 165 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=5
 166   :align: left
 167 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=5&speedup=1
 168
 169 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=5
 170   :align: left
 171 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=5&speedup=1
 172
 173 ``merge_local_minimum_spanning_trees``
 174 --------------------------------------
 175
 176 ::
 177
 178   namespace graph {
 179     template<typename Graph, typename WeightMap, typename OutputIterator,
 180              typename VertexIndexMap>
 181     OutputIterator
 182     merge_local_minimum_spanning_trees(const Graph& g, WeightMap weight,
 183                                        OutputIterator out,
 184                                        VertexIndexMap index);
 185
 186     template<typename Graph, typename WeightMap, typename OutputIterator>
 187     inline OutputIterator
 188     merge_local_minimum_spanning_trees(const Graph& g, WeightMap weight,
 189                                        OutputIterator out);
 190   }
 191
 192 Description
 193 ~~~~~~~~~~~
 194
 195 The merging local MSTs algorithm operates by computing minimum
 196 spanning forests from the edges stored on each process. Then the
 197 processes merge their edge lists along a tree. The child nodes cease
 198 participating in the computation, but the parent nodes recompute MSFs
 199 from the newly acquired edges. In the final round, the root of the
 200 tree computes the global MSFs, having received candidate edges from
 201 every other process via the tree.
 202
 203 Complexity
 204 ~~~~~~~~~~
 205
 206 This algorithm requires *O(log_D p)* BSP supersteps (where *D* is the
 207 number of children in the tree, and is currently fixed at 3). Each
 208 superstep requires *O((m/p) log (m/p) + n)* time and *O(m/p)*
 209 communication per process.
 210
 211 Performance
 212 ~~~~~~~~~~~
 213
 214 The following charts illustrate the performance of this algorithm on
 215 various random graphs. The algorithm only scales well for very dense
 216 graphs, where most of the work is performed in the initial stage and
 217 there is very little work in the later stages.
 218
 219 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=6
 220   :align: left
 221 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=6&speedup=1
 222
 223 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=6
 224   :align: left
 225 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=6&speedup=1
 226
 227
 228 ``boruvka_then_merge``
 229 ----------------------
 230
 231 ::
 232
 233   namespace graph {
 234     template<typename Graph, typename WeightMap, typename OutputIterator,
 235              typename VertexIndexMap, typename RankMap, typename ParentMap,
 236              typename SupervertexMap>
 237     OutputIterator
 238     boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out,
 239                        VertexIndexMap index, RankMap rank_map,
 240                        ParentMap parent_map, SupervertexMap
 241                        supervertex_map);
 242
 243     template<typename Graph, typename WeightMap, typename OutputIterator,
 244              typename VertexIndexMap>
 245     inline OutputIterator
 246     boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out,
 247                         VertexIndexMap index);
 248
 249     template<typename Graph, typename WeightMap, typename OutputIterator>
 250     inline OutputIterator
 251     boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out);
 252   }
 253
 254 Description
 255 ~~~~~~~~~~~
 256
 257 This algorithm applies both Boruvka steps and local MSF merging steps
 258 together to achieve better asymptotic performance than either
 259 algorithm alone. It first executes Boruvka steps until only *n/(log_d
 260 p)^2* supervertices remain, then completes the MSF computation by
 261 performing local MSF merging on the remaining edges and
 262 supervertices.
 263
 264 Complexity
 265 ~~~~~~~~~~
 266
 267 This algorithm requires *log_D p* + *log log_D p* BSP supersteps. The
 268 time required by each superstep depends on the type of superstep
 269 being performed; see the distributed Boruvka or merging local MSFS
 270 algorithms for details.
 271
 272 Performance
 273 ~~~~~~~~~~~
 274
 275 The following charts illustrate the performance of this algorithm on
 276 various random graphs. We see that the algorithm scales well up to 64
 277 or 128 processors, depending on the type of graph, for dense
 278 graphs. However, for sparse graphs performance tapers off as the
 279 number of processors surpases *m/n*, i.e., the average degree (which
 280 is 30 for this graph). This behavior is expected from the algorithm.
 281
 282 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=7
 283   :align: left
 284 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=7&speedup=1
 285
 286 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=7
 287   :align: left
 288 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=7&speedup=1
 289
 290 ``boruvka_mixed_merge``
 291 -----------------------
 292
 293 ::
 294
 295   namespace {
 296     template<typename Graph, typename WeightMap, typename OutputIterator,
 297              typename VertexIndexMap, typename RankMap, typename ParentMap,
 298              typename SupervertexMap>
 299     OutputIterator
 300     boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out,
 301                         VertexIndexMap index, RankMap rank_map,
 302                         ParentMap parent_map, SupervertexMap
 303                         supervertex_map);
 304
 305     template<typename Graph, typename WeightMap, typename OutputIterator,
 306              typename VertexIndexMap>
 307     inline OutputIterator
 308     boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out,
 309                         VertexIndexMap index);
 310
 311     template<typename Graph, typename WeightMap, typename OutputIterator>
 312     inline OutputIterator
 313     boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out);
 314   }
 315
 316 Description
 317 ~~~~~~~~~~~
 318
 319 This algorithm applies both Boruvka steps and local MSF merging steps
 320 together to achieve better asymptotic performance than either method
 321 alone. In each iteration, the algorithm first performs a Boruvka step
 322 and then merges the local MSFs computed based on the supervertex
 323 graph.
 324
 325 Complexity
 326 ~~~~~~~~~~
 327
 328 This algorithm requires *log_D p* BSP supersteps. The
 329 time required by each superstep depends on the type of superstep
 330 being performed; see the distributed Boruvka or merging local MSFS
 331 algorithms for details. However, the algorithm is
 332 communication-optional (requiring *O(n)* communication overall) when
 333 the graph is sufficiently dense, i.e., *m/n >= p*.
 334
 335 Performance
 336 ~~~~~~~~~~~
 337
 338 The following charts illustrate the performance of this algorithm on
 339 various random graphs. We see that the algorithm scales well up to 64
 340 or 128 processors, depending on the type of graph, for dense
 341 graphs. However, for sparse graphs performance tapers off as the
 342 number of processors surpases *m/n*, i.e., the average degree (which
 343 is 30 for this graph). This behavior is expected from the algorithm.
 344
 345 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=8
 346   :align: left
 347 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=8&speedup=1
 348
 349 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=8
 350   :align: left
 351 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=8&speedup=1
 352
 353
 354 Selecting an MST algorithm
 355 --------------------------
 356
 357 Dehne and Gotz reported [DG98]_ mixed results when evaluating these
 358 four algorithms. No particular algorithm was clearly better than the
 359 others in all cases. However, the asymptotically best algorithm
 360 (``boruvka_mixed_merge``) seemed to perform more poorly in their tests
 361 than the other merging-based algorithms. The following performance
 362 charts illustrate the performance of these four minimum spanning tree
 363 implementations.
 364
 365 Overall, ``dense_boruvka_minimum_spanning_tree`` gives the most
 366 consistent performance and scalability for the graphs we
 367 tested. Additionally, it may be more suitable for sequential programs
 368 that are being parallelized, because it emits complete MSF edge lists
 369 via the output iterators in every process.
 370
 371 Performance on Sparse Graphs
 372 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 373 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeSparse&columns=5,6,7,8
 374   :align: left
 375 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeSparse&columns=5,6,7,8&speedup=1
 376
 377 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeSparse&columns=5,6,7,8
 378   :align: left
 379 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeSparse&columns=5,6,7,8&speedup=1
 380
 381 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeSparse&columns=5,6,7,8
 382   :align: left
 383 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeSparse&columns=5,6,7,8&speedup=1
 384
 385 Performance on Dense Graphs
 386 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 387 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeDense&columns=5,6,7,8
 388   :align: left
 389 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeDense&columns=5,6,7,8&speedup=1
 390
 391 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeDense&columns=5,6,7,8
 392   :align: left
 393 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeDense&columns=5,6,7,8&speedup=1
 394
 395 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeDense&columns=5,6,7,8
 396   :align: left
 397 .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeDense&columns=5,6,7,8&speedup=1
 398
 399 -----------------------------------------------------------------------------
 400
 401 Copyright (C) 2004 The Trustees of Indiana University.
 402
 403 Authors: Douglas Gregor and Andrew Lumsdaine
 404
 405 .. |Logo| image:: pbgl-logo.png
 406             :align: middle
 407             :alt: Parallel BGL
 408             :target: http://www.osl.iu.edu/research/pbgl
 409
 410 .. _minimum spanning tree: http://www.boost.org/libs/graph/doc/graph_theory_review.html#sec:minimum-spanning-tree
 411 .. _Kruskal's algorithm: http://www.boost.org/libs/graph/doc/kruskal_min_spanning_tree.html
 412 .. _Vertex list graph: http://www.boost.org/libs/graph/doc/VertexListGraph.html
 413 .. _distributed adjacency list: distributed_adjacency_list.html
 414 .. _vertex_list_adaptor: vertex_list_adaptor.html
 415 .. _Distributed Edge List Graph: DistributedEdgeListGraph.html
 416 .. _Distributed property map: distributed_property_map.html
 417 .. _Readable Property Map: http://www.boost.org/libs/property_map/ReadablePropertyMap.html
 418 .. _Read/Write Property Map: http://www.boost.org/libs/property_map/ReadWritePropertyMap.html
 419
 420 .. [DG98] Frank Dehne and Silvia Gotz. *Practical Parallel Algorithms
 421     for Minimum Spanning Trees*. In Symposium on Reliable Distributed Systems,
 422     pages 366--371, 1998.
 423