]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | .. Copyright (C) 2004-2008 The Trustees of Indiana University. |
2 | Use, modification and distribution is subject to the Boost Software | |
3 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at | |
4 | http://www.boost.org/LICENSE_1_0.txt) | |
5 | ||
6 | =========================== | |
7 | |Logo| Connected Components | |
8 | =========================== | |
9 | ||
10 | :: | |
11 | ||
12 | namespace graph { | |
13 | // Default constructed ParentMap | |
14 | template<typename Graph, typename ComponentMap, typename ParentMap> | |
15 | typename property_traits<ComponentMap>::value_type | |
16 | connected_components( const Graph& g, ComponentMap c); | |
17 | ||
18 | // User supplied ParentMap | |
19 | template<typename Graph, typename ComponentMap, typename ParentMap> | |
20 | typename property_traits<ComponentMap>::value_type | |
21 | connected_components( const Graph& g, ComponentMap c, ParentMap p); | |
22 | } | |
23 | ||
24 | The ``connected_components()`` function computes the connected | |
25 | components of an undirected graph. The distributed connected | |
26 | components algorithm uses the sequential version of the connected | |
27 | components algorithm to compute the connected components of the local | |
28 | subgraph, then executes the parallel phase of the algorithm. The | |
29 | parallel portion of the connected components algorithm is loosely | |
30 | based on the work of Goddard, Kumar, and Prins. The interface is a | |
31 | superset of the interface to the BGL `sequential connected | |
32 | components`_ algorithm. | |
33 | ||
34 | Prior to executing the sequential phase of the algorithm, each process | |
35 | identifies the roots of its local components. An adjacency list of | |
36 | all vertices adjacent to members of the component is then constructed | |
37 | at the root vertex of each component. | |
38 | ||
39 | The parallel phase of the distributed connected components algorithm | |
40 | consists of a series of supersteps. In each superstep, each root | |
41 | attempts to hook to a member of it's adjacency list by assigning it's | |
42 | parent pointer to that vertex. Hooking is restricted to vertices | |
43 | which are logically less than the current vertex to prevent looping. | |
44 | Vertices which hook successfully are removed from the list of roots | |
45 | and placed on another list of completed vertices. All completed | |
46 | vertices now execute a pointer jumping step until every completed | |
47 | vertex has as its parent the root of a component. This pointer | |
48 | jumping step may be further optimized by the addition of Cycle | |
49 | Reduction (CR) rules developed by Johnson and Metaxas, however current | |
50 | performance evaluations indicate that this would have a negligible | |
51 | impact on the overall performance of the algorithm. These CR rules | |
52 | reduce the number of pointer jumping steps from *O(n)* to *O(log n)*. | |
53 | Following this pointer jumping step, roots which have hooked in this | |
54 | phase transmit their adjacency list to their new parent. The | |
55 | remaining roots receive these edges and execute a pruning step on | |
56 | their adjacency lists to remove vertices that are now members of their | |
57 | component. The parallel phase of the algorithm is complete when no | |
58 | root successfully hooks. Once the parallel phase is complete a final | |
59 | pointer jumping step is performed on all vertices to assign the parent | |
60 | pointers of the leaves of the initial local subgraph components to | |
61 | their final parent which has now been determined. | |
62 | ||
63 | The single largest performance bottleneck in the distributed connected | |
64 | components algorithm is the effect of poor vertex distribution on the | |
65 | algorithm. For sparse graphs with a single large component, many | |
66 | roots may hook to the same component, resulting in severe load | |
67 | imbalance at the process owning this component. Several methods of | |
68 | modifying the hooking strategy to avoid this behavior have been | |
69 | implemented but none has been successful as of yet. | |
70 | ||
71 | .. contents:: | |
72 | ||
73 | Where Defined | |
74 | ------------- | |
75 | <``boost/graph/connected_components.hpp``> | |
76 | ||
77 | Parameters | |
78 | ---------- | |
79 | ||
80 | IN: ``Graph& g`` | |
81 | The graph typed must be a model of `Distributed Graph`_. | |
82 | ||
83 | OUT: ``ComponentMap c`` | |
84 | The algorithm computes how many connected components are in the | |
85 | graph, and assigns each component an integer label. The algorithm | |
86 | then records to which component each vertex in the graph belongs by | |
87 | recording the component number in the component property map. The | |
88 | ``ComponentMap`` type must be a `Distributed Property Map`_. The | |
89 | value type must be the ``vertices_size_type`` of the graph. The key | |
90 | type must be the graph's vertex descriptor type. If you do not wish | |
91 | to compute component numbers, pass ``dummy_property_map`` as the | |
92 | component map and parent information will be provided in the parent | |
93 | map. | |
94 | ||
95 | UTIL: ``ParentMap p`` | |
96 | A parent map may be supplied to the algorithm, if not supplied the | |
97 | parent map will be constructed automatically. The ``ParentMap`` type | |
98 | must be a `Distributed Property Map`_. The value type and key type | |
99 | must be the graph's vertex descriptor type. | |
100 | ||
101 | OUT: ``property_traits<ComponentMap>::value_type`` | |
102 | The number of components found will be returned as the value type of | |
103 | the component map. | |
104 | ||
105 | Complexity | |
106 | ---------- | |
107 | ||
108 | The local phase of the algorithm is *O(V + E)*. The parallel phase of | |
109 | the algorithm requires at most *O(d)* supersteps where *d* is the | |
110 | number of initial roots. *d* is at most *O(V)* but becomes | |
111 | significantly smaller as *E* increases. The pointer jumping phase | |
112 | within each superstep requires at most *O(c)* steps on each of the | |
113 | completed roots where *c* is the length of the longest cycle. | |
114 | Application of CR rules can reduce this to *O(log c)*. | |
115 | ||
116 | Performance | |
117 | ----------- | |
118 | The following charts illustrate the performance of the Parallel BGL | |
119 | connected components algorithm. It performs well on very sparse and | |
120 | very dense graphs. However, for cases where the graph has a medium | |
121 | density with a giant connected component, the algorithm will perform | |
122 | poorly. This is a known problem with the algorithm and as far as we | |
123 | know all implemented algorithms have this degenerate behavior. | |
124 | ||
125 | .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=9 | |
126 | :align: left | |
127 | .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=9&speedup=1 | |
128 | ||
129 | .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=9 | |
130 | :align: left | |
131 | .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=9&speedup=1 | |
132 | ||
133 | ||
134 | ----------------------------------------------------------------------------- | |
135 | ||
136 | Copyright (C) 2004 The Trustees of Indiana University. | |
137 | ||
138 | Authors: Nick Edmonds, Douglas Gregor, and Andrew Lumsdaine | |
139 | ||
140 | .. |Logo| image:: pbgl-logo.png | |
141 | :align: middle | |
142 | :alt: Parallel BGL | |
143 | :target: http://www.osl.iu.edu/research/pbgl | |
144 | ||
145 | .. _Sequential connected components: http://www.boost.org/libs/graph/doc/connected_components.html | |
146 | .. _Distributed Graph: DistributedGraph.html | |
147 | .. _Distributed Property Map: distributed_property_map.html |