<HTML>
<!--
  Copyright (c) Jeremy Siek 2000

  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt)
-->
<Head>
<Title>Sparse Matrix Ordering Example</Title>
</Head>
<BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b"
 ALINK="#ff0000">
<IMG SRC="../../../boost.png"
 ALT="C++ Boost" width="277" height="86">

<BR Clear>

<H1><A NAME="sec:sparse-matrix-ordering"></A>
Sparse Matrix Ordering
</H1>

<P>
Graph theory was identified as a powerful tool for sparse matrix
computation when Seymour Parter used undirected graphs to model
symmetric Gaussian elimination more than 30 years
ago&nbsp;[<A HREF="bibliography.html#parter61:_gauss">28</A>].
Graphs can be used to model symmetric matrices, factorizations, and
algorithms on non-symmetric matrices, such as fill paths in Gaussian
elimination, strongly connected components in matrix irreducibility,
bipartite matching, and alternating paths in linear dependence and
structural singularity. Not only do graphs make it easier to
understand and analyze sparse matrix algorithms, they also open
sparse matrix computation to existing graph algorithms and
techniques&nbsp;[<A
 HREF="bibliography.html#george93:graphtheory">13</A>]. In this section we
illustrate how to use the BGL in sparse matrix computations such
as ordering algorithms. Before going further into the sparse matrix
algorithms, let us take a step back and review a few things.

<P>

<H2>Graphs and Sparse Matrices</H2>

<P>
A graph is fundamentally a way to represent a binary relation between
objects. The nonzero pattern of a sparse matrix also describes a
binary relation between unknowns. Therefore the nonzero pattern of a
sparse matrix of a linear system can be modeled with a graph
<I>G(V,E)</I>, whose <I>n</I> vertices in <I>V</I> represent the
<I>n</I> unknowns, and where there is an edge from vertex <I>i</I> to
vertex <I>j</I> when <i>A<sub>ij</sub></i> is nonzero. Thus, when a
matrix has a symmetric nonzero pattern, the corresponding graph is
undirected.
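<P>
As a small standalone illustration (not part of the BGL itself; the
<TT>Triplet</TT> type and <TT>pattern_graph</TT> function are hypothetical
helpers), the nonzero pattern of a symmetric matrix given in coordinate form
can be turned into exactly such an undirected graph, here as a plain
adjacency list. We assume only the lower triangle of the matrix is listed.

```cpp
#include <cstddef>
#include <vector>

// One coordinate-form entry of the sparse matrix (hypothetical helper type).
struct Triplet { std::size_t row, col; double value; };

// Build the undirected graph G(V,E) of a symmetric n-by-n matrix A:
// vertex i stands for unknown i, and {i, j} is an edge when A(i,j) != 0.
// Assumes only the lower triangle of A is listed in `entries`.
std::vector<std::vector<std::size_t> >
pattern_graph(std::size_t n, const std::vector<Triplet>& entries) {
    std::vector<std::vector<std::size_t> > adj(n);
    for (const Triplet& t : entries) {
        if (t.row != t.col && t.value != 0.0) {
            adj[t.row].push_back(t.col);  // symmetric pattern: store the edge
            adj[t.col].push_back(t.row);  // in both adjacency lists
        }
    }
    return adj;
}
```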

<P>

<H2>Sparse Matrix Ordering Algorithms</H2>

<P>
The process of solving a sparse symmetric positive definite linear
system, <i>Ax=b</i>, can be divided into four stages:
<DL>
<DT><STRONG>Ordering:</STRONG></DT>
<DD>Find a permutation <i>P</i> of matrix <i>A</i>,
</DD>
<DT><STRONG>Symbolic factorization:</STRONG></DT>
<DD>Set up a data structure for the Cholesky
 factor <i>L</i> of <i>PAP<sup>T</sup></i>,
</DD>
<DT><STRONG>Numerical factorization:</STRONG></DT>
<DD>Decompose <i>PAP<sup>T</sup></i> into <i>LL<sup>T</sup></i>,
</DD>
<DT><STRONG>Triangular system solution:</STRONG></DT>
<DD>Solve <i>LL<sup>T</sup>Px=Pb</i> for <i>x</i>.
</DD>
</DL>
Because the choice of permutation <i>P</i> directly determines the
number of fill-in elements (elements present in the non-zero structure
of <i>L</i> that are not present in the non-zero structure of
<i>A</i>), the ordering has a significant impact on the memory and
computational requirements of the later stages. However, finding
the optimal ordering for <i>A</i> (in the sense of minimizing fill-in)
has been proven to be NP-complete&nbsp;[<a
href="bibliography.html#garey79:computers-and-intractability">30</a>],
so heuristics must be used for all but simple (or specially
structured) cases.

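<P>
To see how strongly the permutation affects fill-in, consider the classic
"arrowhead" example: a star graph eliminated center-first fills in
completely, while eliminating the center last produces no fill at all. The
sketch below (illustrative only, not the BGL machinery) counts fill edges
by simulating symmetric Gaussian elimination on the graph: eliminating a
vertex pairwise-connects its remaining neighbors, and each new edge is a
fill element.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Count fill-in produced by symbolic symmetric Gaussian elimination.
// `edges` holds undirected edges stored as (min, max) pairs; it is taken
// by value because elimination adds fill edges to it.
std::size_t count_fill(std::size_t n,
                       std::set<std::pair<std::size_t, std::size_t> > edges,
                       const std::vector<std::size_t>& order) {
    std::vector<bool> eliminated(n, false);
    std::size_t fill = 0;
    for (std::size_t v : order) {
        // collect v's not-yet-eliminated neighbors
        std::vector<std::size_t> nbrs;
        for (std::size_t u = 0; u < n; ++u)
            if (!eliminated[u] && u != v &&
                edges.count({std::min(u, v), std::max(u, v)}))
                nbrs.push_back(u);
        // eliminating v makes its remaining neighbors a clique
        for (std::size_t i = 0; i < nbrs.size(); ++i)
            for (std::size_t j = i + 1; j < nbrs.size(); ++j) {
                auto e = std::make_pair(std::min(nbrs[i], nbrs[j]),
                                        std::max(nbrs[i], nbrs[j]));
                if (edges.insert(e).second) ++fill;  // new edge => fill element
            }
        eliminated[v] = true;
    }
    return fill;
}
```

With a 4-vertex star centered at vertex 0, ordering the center first yields
3 fill edges; ordering it last yields none.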
<P>
A widely used but rather simple ordering algorithm is a variant of the
Cuthill-McKee ordering, the reverse Cuthill-McKee ordering algorithm.
This algorithm can also be used as a preordering to improve the ordering
produced by more sophisticated methods such as minimum degree
algorithms&nbsp;[<a href="bibliography.html#LIU:MMD">21</a>].

<P>

<H3><A NAME="SECTION001032100000000000000">
Reverse Cuthill-McKee Ordering Algorithm</A>
</H3>
The original Cuthill-McKee ordering algorithm is primarily designed to
reduce the profile of a matrix&nbsp;[<A
 HREF="bibliography.html#george81:__sparse_pos_def">14</A>].
In 1971, George discovered that the reverse ordering often turns out to be
superior to the original ordering. Here we describe the
reverse Cuthill-McKee algorithm in terms of a graph:

<OL>
<LI><I>Finding a starting vertex</I>: Determine a starting vertex
 <I>r</I> and assign <i>r</i> to <I>x<sub>1</sub></I>.
</LI>
<LI><I>Main part</I>: For <I>i=1,...,N</I>, find all
 the unnumbered neighbors of the vertex <I>x<sub>i</sub></I> and number them
 in increasing order of degree.
</LI>
<LI><I>Reversing ordering</I>: The reverse Cuthill-McKee ordering
 is given by <I>y<sub>1</sub>,...,y<sub>N</sub></I>, where
 <I>y<sub>i</sub></I> is <I>x<sub>N-i+1</sub></I> for <I>i = 1,...,N</I>.
</LI>
</OL>
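<P>
The three steps can be sketched in plain C++ on an adjacency-list graph; this
is a simplified stand-in for the BGL version, assuming a connected graph and
an already-chosen starting vertex <i>r</i>:

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <vector>

// Reverse Cuthill-McKee on an adjacency-list graph: BFS from r, numbering
// the unnumbered neighbors of each vertex in increasing degree order
// (step 2), then reversing the numbering (step 3). Assumes the graph is
// connected and r has already been chosen (step 1).
std::vector<std::size_t>
reverse_cuthill_mckee(const std::vector<std::vector<std::size_t> >& adj,
                      std::size_t r) {
    const std::size_t n = adj.size();
    std::vector<bool> numbered(n, false);
    std::vector<std::size_t> order;
    order.reserve(n);
    std::queue<std::size_t> q;
    q.push(r);
    numbered[r] = true;
    while (!q.empty()) {
        std::size_t x = q.front(); q.pop();
        order.push_back(x);
        // gather unnumbered neighbors of x
        std::vector<std::size_t> next;
        for (std::size_t u : adj[x])
            if (!numbered[u]) { numbered[u] = true; next.push_back(u); }
        // number them in increasing order of degree
        std::sort(next.begin(), next.end(),
                  [&](std::size_t a, std::size_t b)
                  { return adj[a].size() < adj[b].size(); });
        for (std::size_t u : next) q.push(u);
    }
    std::reverse(order.begin(), order.end());  // step 3: reverse the ordering
    return order;
}
```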
In the first step a good starting vertex needs to be determined. The
study by George and Liu&nbsp;[<A
HREF="bibliography.html#george81:__sparse_pos_def">14</A>] showed that
a pair of vertices which are at maximum or near-maximum "distance"
apart are good candidates. They also proposed an algorithm to find such a
starting vertex&nbsp;[<A
HREF="bibliography.html#george81:__sparse_pos_def">14</A>], which we
show in <A HREF="#fig:ggcl_find_starting_vertex">Figure 1</A>,
implemented using the BGL interface.

<P>
<P></P>
<DIV ALIGN="CENTER"><A NAME="fig:ggcl_find_starting_vertex"></A>
<table border><tr><td>
<pre>
namespace boost {
  template &lt;class Graph, class Vertex, class Color, class Degree&gt;
  Vertex
  pseudo_peripheral_pair(Graph& G, const Vertex& u, int& ecc,
                         Color color, Degree degree)
  {
    typename property_traits&lt;Color&gt;::value_type c = get(color, u);

    rcm_queue&lt;Vertex, Degree&gt; Q(degree);

    typename boost::graph_traits&lt;Graph&gt;::vertex_iterator ui, ui_end;
    for (boost::tie(ui, ui_end) = vertices(G); ui != ui_end; ++ui)
      put(color, *ui, white(c));
    breadth_first_search(G, u, Q, bfs_visitor&lt;&gt;(), color);

    ecc = Q.eccentricity();
    return Q.spouse();
  }

  template &lt;class Graph, class Vertex, class Color, class Degree&gt;
  Vertex find_starting_node(Graph& G, Vertex r, Color c, Degree d) {
    int eccen_r, eccen_x;
    Vertex x = pseudo_peripheral_pair(G, r, eccen_r, c, d);
    Vertex y = pseudo_peripheral_pair(G, x, eccen_x, c, d);

    while (eccen_x &gt; eccen_r) {
      r = x;
      eccen_r = eccen_x;
      x = y;
      y = pseudo_peripheral_pair(G, x, eccen_x, c, d);
    }
    return x;
  }
} // namespace boost
</pre>
</td></tr></table>
<CAPTION ALIGN="BOTTOM">
<table><tr><td>
<STRONG>Figure 1:</STRONG>
The BGL implementation of <TT>find_starting_node</TT>. The key
part, <TT>pseudo_peripheral_pair</TT>, is essentially a BFS with a
custom queue type.
</td></tr></table></CAPTION>
</DIV>
<P></P>

The key part of step one is the custom queue type used with BFS, as shown in
<A HREF="#fig:ggcl_find_starting_vertex">Figure 1</A>. The main
Cuthill-McKee algorithm, shown in
<A HREF="#fig:cuthill-mckee">Figure 2</A>,
uses a fenced priority queue with a generalized BFS. If
<TT>inverse_permutation</TT> is a normal iterator, the result stored
is the inverse permutation of the original Cuthill-McKee ordering. On the
other hand, if <TT>inverse_permutation</TT> is a reverse iterator, the
result stored is the inverse permutation of the reverse Cuthill-McKee
ordering. Our implementation is quite concise because many components
from the BGL can be reused.

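<P>
The normal-versus-reverse iterator distinction relies only on standard
iterator behavior: writing a sequence through a reverse iterator stores it
back to front, which is how passing a reverse iterator as
<TT>inverse_permutation</TT> yields the reverse ordering with no extra pass.
A minimal standalone illustration (the helper name is ours, not the BGL's):

```cpp
#include <cstddef>
#include <vector>

// Writing through a reverse output iterator stores the sequence back to
// front -- the same writes a visitor would make through
// inverse_permutation, but landing in reversed positions.
std::vector<std::size_t> store_reversed(const std::vector<std::size_t>& seq) {
    std::vector<std::size_t> out(seq.size());
    auto it = out.rbegin();               // reverse output iterator
    for (std::size_t v : seq) *it++ = v;  // forward writes, reversed storage
    return out;
}
```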

<P></P>
<DIV ALIGN="CENTER"><A NAME="fig:cuthill-mckee"></A><A NAME="6261"></A>
<table border><tr><td>
<pre>
  template &lt;class Graph, class Vertex, class OutputIterator,
            class Color, class Degree&gt;
  inline void
  cuthill_mckee_ordering(Graph& G,
                         Vertex s,
                         OutputIterator inverse_permutation,
                         Color color, Degree degree)
  {
    typedef typename property_traits&lt;Degree&gt;::value_type DS;
    typename property_traits&lt;Color&gt;::value_type c = get(color, s);
    typedef indirect_cmp&lt;Degree, std::greater&lt;DS&gt; &gt; Compare;
    Compare comp(degree);
    fenced_priority_queue&lt;Vertex, Compare&gt; Q(comp);

    typedef cuthill_mckee_visitor&lt;OutputIterator&gt; CMVisitor;
    CMVisitor cm_visitor(inverse_permutation);

    typename boost::graph_traits&lt;Graph&gt;::vertex_iterator ui, ui_end;
    for (boost::tie(ui, ui_end) = vertices(G); ui != ui_end; ++ui)
      put(color, *ui, white(c));
    breadth_first_search(G, s, Q, cm_visitor, color);
  }
</pre>
</td></tr></table>
<CAPTION ALIGN="BOTTOM">
<TABLE>
<TR><TD> <STRONG>Figure 2:</STRONG> The BGL implementation of the
Cuthill-McKee algorithm. </TD></TR> </TABLE></CAPTION> </DIV><P></P>
<P>


<P>

<H3>Minimum Degree Ordering Algorithm</H3>

<P>
Another category of high-quality ordering algorithms in wide use is
based on a greedy approach: the ordering is chosen to minimize some
quantity at each step of a simulated symmetric Gaussian elimination
process. Algorithms using this approach are typically distinguished
by their greedy minimization
criteria&nbsp;[<a href="bibliography.html#ng-raghavan">34</a>].

<P>
In graph terms, the basic ordering process used by most greedy
algorithms is as follows:

<OL>
<LI><EM>Start:</EM> Construct the undirected graph <i>G<sup>0</sup></i>
 corresponding to matrix <i>A</i>.
</LI>
<LI><EM>Iterate:</EM> For <i>k = 1, 2, ... </i> until
 <i>G<sup>k</sup> = { }</i> do:
<UL>
<LI>Choose a vertex <i>v<sup>k</sup></i> from <i>G<sup>k</sup></i>
 according to some criterion.
</LI>
<LI>Eliminate <i>v<sup>k</sup></i> from <i>G<sup>k</sup></i> to form
 <i>G<sup>k+1</sup></i>.
</LI>
</UL>
</LI>
</OL>
The resulting ordering is the sequence of vertices <i>{v<sup>0</sup>,
v<sup>1</sup>,...}</i> selected by the algorithm.

<P>
One of the most important examples of such an algorithm is the
<I>Minimum Degree</I> algorithm. At each step the minimum degree
algorithm chooses the vertex with minimum degree in the corresponding
graph as <i>v<sup>k</sup></i>. A number of enhancements to the basic
minimum degree algorithm have been developed, such as the use of a
quotient graph representation, mass elimination, incomplete degree
update, multiple elimination, and external degree. See&nbsp;[<A
href="bibliography.html#George:evolution">35</a>] for a historical
survey of the minimum degree algorithm.

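<P>
A naive sketch of the basic, unenhanced minimum degree loop (none of the
enhancements above; illustrative code, not the BGL implementation) makes
the greedy process concrete: eliminating a vertex turns its remaining
neighbors into a clique, exactly as in symbolic elimination.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Basic minimum degree ordering on an adjacency-set elimination graph.
// `adj` is taken by value because elimination rewrites the graph.
std::vector<std::size_t>
minimum_degree_order(std::size_t n,
                     std::vector<std::set<std::size_t> > adj) {
    std::vector<bool> done(n, false);
    std::vector<std::size_t> order;
    for (std::size_t step = 0; step < n; ++step) {
        // choose the vertex of minimum degree in the current graph
        std::size_t v = n;
        for (std::size_t u = 0; u < n; ++u)
            if (!done[u] && (v == n || adj[u].size() < adj[v].size()))
                v = u;
        // eliminate v: pairwise-connect its neighbors, then remove v
        for (std::size_t a : adj[v])
            for (std::size_t b : adj[v])
                if (a != b) adj[a].insert(b);
        for (std::size_t a : adj[v]) adj[a].erase(v);
        adj[v].clear();
        done[v] = true;
        order.push_back(v);
    }
    return order;
}
```

On the 4-vertex star graph, the high-degree center is deferred until no
leaves remain, which is the behavior that keeps fill-in low.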
<P>
The BGL implementation of the Minimum Degree algorithm closely follows
the algorithmic descriptions in&nbsp;[<A
HREF="bibliography.html#LIU:MMD">21</A>]. The implementation presently
includes the enhancements for mass elimination, incomplete degree
update, multiple elimination, and external degree.

<P>
In particular, we create a graph representation to improve the
performance of the algorithm. It is based on a templated "vector of
vectors." The vector container used is an adaptor class built on top of
the STL <TT>std::vector</TT> class. Particular characteristics of this
adaptor class include the following:

<UL>
<LI>Erasing elements does not shrink the associated memory.
 Adding new elements after erasing will not need to allocate
 additional memory.
</LI>
<LI>Additional memory is allocated efficiently on demand when new
 elements are added (doubling the capacity every time it is
 increased). This property comes from the STL vector.
</LI>
</UL>
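<P>
The first property can be sketched as follows (the class name and layout are
illustrative, not the actual BGL internals): a thin wrapper that tracks a
logical size and never releases the underlying <TT>std::vector</TT> storage
on erase, so capacity acquired so far is reused on later insertions.

```cpp
#include <cstddef>
#include <vector>

// A vector adaptor whose clear() only resets a logical end marker:
// the underlying storage is kept, so re-adding elements after erasing
// needs no new allocation. Growth past capacity still uses std::vector's
// amortized doubling.
template <class T>
class no_shrink_vector {
    std::vector<T> data_;
    std::size_t size_ = 0;  // logical size, always <= data_.size()
public:
    void push_back(const T& x) {
        if (size_ < data_.size()) data_[size_] = x;  // reuse an old slot
        else data_.push_back(x);                     // grow the backing store
        ++size_;
    }
    void clear() { size_ = 0; }  // "erase" without releasing memory
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return data_.capacity(); }
    T& operator[](std::size_t i) { return data_[i]; }
};
```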

<P>
Note that this representation is similar to that used in Liu's
implementation, with some important differences due to dynamic memory
allocation. With dynamic memory allocation we do not need to
over-write portions of the graph that have been eliminated,
allowing for a more efficient graph traversal. More importantly,
information about the elimination graph is preserved, allowing for
trivial symbolic factorization. Since symbolic factorization can be
an expensive part of the entire solution process, improving its
performance can result in significant computational savings.

<P>
The overhead of dynamic memory allocation could conceivably compromise
performance in some cases. However, in practice, memory allocation
overhead does not contribute significantly to the run-time of our
implementation, as shown in&nbsp;[], because allocation is not
done very often and its cost is amortized.

<P>

<H2>Proper Self-Avoiding Walk</H2>

<P>
The finite element method (FEM) is a flexible and attractive numerical
approach for solving partial differential equations, since it handles
geometrically complicated domains in a straightforward way. However,
the unstructured meshes generated by FEM do not provide an obvious
labeling (numbering) of the unknowns, while such a labeling is vital for
the matrix-vector notation of the underlying algebraic equations. Special
numbering techniques have been developed to optimize memory usage and
locality of such algorithms. One novel technique is the self-avoiding
walk&nbsp;[].

<P>
If we view a mesh as a graph, a self-avoiding walk (SAW) over an
arbitrary unstructured two-dimensional mesh is an enumeration of all
the triangles of the mesh such that two successive triangles share an
edge or a vertex. A proper SAW (PSAW) is a SAW in which jumping twice over
the same vertex is forbidden for three consecutive triangles in the walk.
A PSAW can be used to improve the parallel efficiency of several irregular
algorithms, in particular by reducing runtime memory
access (improving locality) and interprocess communication. The
reference&nbsp;[] proves the existence
of a PSAW for any arbitrary triangular mesh by extending an existing
PSAW over a small piece of mesh to a larger part. The proof
effectively provides a set of rules to construct a PSAW.

<P>
The family of algorithms for constructing a PSAW on a mesh starts from
a random triangle of the mesh, chooses new triangles sharing an edge
with the current sub-mesh, and extends the existing partial PSAW over
the new triangles.

<P>
Let us define the dual graph of a mesh: each triangle in the mesh becomes
a vertex, and two triangles sharing an edge in the mesh give rise to an
edge between the corresponding vertices in the dual graph. Using the dual
graph of a mesh, one way to implement this family of algorithms
is to reuse BGL algorithms such as BFS and DFS with a customized
visitor that provides operations during
traversal. The customized visitor has a function <TT>tree_edge()</TT>,
which extends the partial PSAW case by case, and a function
<TT>start_vertex()</TT>, which sets up the PSAW for the starting vertex.
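<P>
The dual-graph construction just described can be sketched directly
(assuming triangles are given as vertex-index triples; this is not the BGL
visitor itself, only the graph it would traverse):

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Dual graph of a triangular mesh: each triangle becomes a vertex, and two
// triangles that share a mesh edge are connected in the dual graph.
std::vector<std::vector<std::size_t> >
mesh_dual_graph(const std::vector<std::array<std::size_t, 3> >& tris) {
    // map each undirected mesh edge (min, max) to the triangles containing it
    std::map<std::pair<std::size_t, std::size_t>,
             std::vector<std::size_t> > edge_tris;
    for (std::size_t t = 0; t < tris.size(); ++t)
        for (int i = 0; i < 3; ++i) {
            std::size_t a = tris[t][i], b = tris[t][(i + 1) % 3];
            if (a > b) std::swap(a, b);
            edge_tris[{a, b}].push_back(t);
        }
    std::vector<std::vector<std::size_t> > dual(tris.size());
    for (const auto& et : edge_tris)
        if (et.second.size() == 2) {  // interior edge shared by two triangles
            dual[et.second[0]].push_back(et.second[1]);
            dual[et.second[1]].push_back(et.second[0]);
        }
    return dual;
}
```

A BFS or DFS over the returned adjacency lists then enumerates triangles in
exactly the order a PSAW-building visitor would process them.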

<br>
<HR>
<TABLE>
<TR valign=top>
<TD nowrap>Copyright &copy; 2000-2001</TD><TD>
<A HREF="http://www.boost.org/people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>)
</TD></TR></TABLE>

</BODY>
</HTML>