ceph/src/boost/libs/graph/doc/sparse_matrix_ordering.html

   1 <HTML>
   2 <!--
   3      Copyright (c) Jeremy Siek 2000
   4
   5      Distributed under the Boost Software License, Version 1.0.
   6      (See accompanying file LICENSE_1_0.txt or copy at
   7      http://www.boost.org/LICENSE_1_0.txt)
   8   -->
   9 <Head>
  10 <Title>Sparse Matrix Ordering Example</Title>
  11 <BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b"
  12         ALINK="#ff0000">
  13 <IMG SRC="../../../boost.png"
  14      ALT="C++ Boost" width="277" height="86">
  15
  16 <BR Clear>
  17
  18 <H1><A NAME="sec:sparse-matrix-ordering"></A>
  19 Sparse Matrix Ordering
  20 </H1>
  21
  22 <P>
  23 Graph theory was identified as a powerful tool for sparse matrix
  24 computation when Seymour Parter used undirected graphs to model
  25 symmetric Gaussian elimination more than 30 years ago&nbsp;[<A HREF="bibliography.html#parter61:_gauss">28</A>].
  26 Graphs can be used to model symmetric matrices, factorizations and
  27 algorithms on non-symmetric matrices, such as fill paths in Gaussian
  28 elimination, strongly connected components in matrix irreducibility,
  29 bipartite matching, and alternating paths in linear dependence and
  30 structural singularity. Not only do graphs make it easier to
  31 understand and analyze sparse matrix algorithms, but they broaden the
  32 area of manipulating sparse matrices using existing graph algorithms
  33 and techniques&nbsp;[<A
  34  HREF="bibliography.html#george93:graphtheory">13</A>].  In this section, we are
  35 going to illustrate how to use BGL in sparse matrix computation such
  36 as ordering algorithms. Before we go further into the sparse matrix
  37 algorithms, let us take a step back and review a few things.
  38
  39 <P>
  40
  41 <H2>Graphs and Sparse Matrices</H2>
  42
  43 <P>
  44 A graph is fundamentally a way to represent a binary relation between
  45 objects. The nonzero pattern of a sparse matrix also describes a
  46 binary relation between unknowns. Therefore the nonzero pattern of a
  47 sparse matrix of a linear system can be modeled with a graph
  48 <I>G(V,E)</I>, whose <I>n</I> vertices in <I>V</I> represent the
  49 <I>n</I> unknowns, and where there is an edge from vertex <I>i</I> to
  50 vertex <I>j</I> when <i>A<sub>ij</sub></i> is nonzero.  Thus, when a
  51 matrix has a symmetric nonzero pattern, the corresponding graph is
  52 undirected.
  53
  54 <P>
  55
  56 <H2>Sparse Matrix Ordering Algorithms</H2>
  57
  58 <P>
  59 The process for solving a sparse symmetric positive definite linear
  60 system, <i>Ax=b</i>, can be divided into four stages as follows:
  61 <DL>
  62 <DT><STRONG>Ordering:</STRONG></DT>
  63 <DD>Find a permutation <i>P</i> of matrix <i>A</i>,
  64 </DD>
  65 <DT><STRONG>Symbolic factorization:</STRONG></DT>
  66 <DD>Set up a data structure for the Cholesky
  67  factor <i>L</i> of <i>PAP<sup>T</sup></i>,
  68 </DD>
  69 <DT><STRONG>Numerical factorization:</STRONG></DT>
  70 <DD>Decompose <i>PAP<sup>T</sup></i> into <i>LL<sup>T</sup></i>,
  71 </DD>
  72 <DT><STRONG>Triangular system solution:</STRONG></DT>
  73 <DD>Solve <i>LL<sup>T</sup>Px=Pb</i> for <i>x</i>.
  74 </DD>
  75 </DL>
  76 Because the choice of permutation <i>P</i> will directly determine the
  77 number of fill-in elements (elements present in the non-zero structure
  78 of <i>L</i> that are not present in the non-zero structure of
  79 <i>A</i>), the ordering has a significant impact on the memory and
  80 computational requirements for the latter stages.  However, finding
  81 the optimal ordering for <i>A</i> (in the sense of minimizing fill-in)
  82 has been proven to be NP-complete&nbsp;[<a
  83 href="bibliography.html#garey79:computers-and-intractability">30</a>]
  84 requiring that heuristics be used for all but simple (or specially
  85 structured) cases.
  86
  87 <P>
  88 A widely used but rather simple ordering algorithm is a variant of the
  89 Cuthill-McKee orderings, the reverse Cuthill-McKee ordering algorithm.
  90 This algorithm can be used as a preordering method to improve ordering
  91 in more sophisticated methods such as minimum degree
  92 algorithms&nbsp;[<a href="bibliography.html#LIU:MMD">21</a>].
  93
  94 <P>
  95
  96 <H3><A NAME="SECTION001032100000000000000">
  97 Reverse Cuthill-McKee Ordering Algorithm</A>
  98 </H3>
  99 The original Cuthill-McKee ordering algorithm is primarily designed to
 100 reduce the profile of a matrix&nbsp;[<A
 101  HREF="bibliography.html#george81:__sparse_pos_def">14</A>].
 102 George discovered that the reverse ordering often turned out to be
 103 superior to the original ordering in 1971. Here we describe the
 104 Reverse Cuthill-McKee algorithm in terms of a graph:
 105
 106 <OL>
 107 <LI><I>Finding a starting vertex</I>: Determine a starting vertex
 108  <I>r</I> and assign <i>r</i> to <I>x<sub>1</sub></I>.
 109 </LI>
 110 <LI><I>Main part</I>: For <I>i=1,...,N</I>, find all
 111  the unnumbered neighbors of the vertex <I>x<sub>i</sub></I> and number them
 112  in increasing order of degree.
 113 </LI>
 114 <LI><I>Reversing ordering</I>: The reverse Cuthill-McKee ordering
 115  is given by <I>y<sub>1</sub>,...,y<sub>N</sub></I> where
 116  <I>y<sub>i</sub></I> is <I>x<sub>N-i+1</sub></I> for <I>i = 1,...,N</I>.
 117 </LI>
 118 </OL>
 119 In the first step a good starting vertex needs to be determined. The
 120 study by George and Liu&nbsp;[<A
 121 HREF="bibliography.html#george81:__sparse_pos_def">14</A>] showed that
 122 a pair of vertices which are at maximum or near maximum ``distance''
 123 apart are good ones. They also proposed an algorithm to find such a
 124 starting vertex in &nbsp;[<A
 125 HREF="bibliography.html#george81:__sparse_pos_def">14</A>], which we
 126 show in <A
 127 HREF="#fig:ggcl_find_starting_vertex">Figure
 128 1</A> implemented using the BGL interface.
 129
 130 <P>
 131 <P></P>
 132 <DIV ALIGN="CENTER"><A NAME="fig:ggcl_find_starting_vertex"></A>
 133 <table border><tr><td>
 134 <pre>
 135 namespace boost {
 136   template &lt;class Graph, class Vertex, class Color, class Degree&gt;
 137   Vertex
 138   pseudo_peripheral_pair(Graph& G, const Vertex& u, int& ecc,
 139                          Color color, Degree degree)
 140   {
 141     typename property_traits&lt;Color&gt;::value_type c = get(color, u);
 142
 143     rcm_queue&lt;Vertex, Degree&gt; Q(degree);
 144
 145     typename boost::graph_traits&lt;Graph&gt;::vertex_iterator ui, ui_end;
 146     for (boost::tie(ui, ui_end) = vertices(G); ui != ui_end; ++ui)
 147       put(color, *ui, white(c));
 148     breadth_first_search(G, u, Q, bfs_visitor&lt;&gt;(), color);
 149
 150     ecc = Q.eccentricity();
 151     return Q.spouse();
 152   }
 153
 154   template &lt;class Graph, class Vertex, class Color, class Degree&gt;
 155   Vertex find_starting_node(Graph& G, Vertex r, Color c, Degree d) {
 156     int eccen_r, eccen_x;
 157     Vertex x = pseudo_peripheral_pair(G, r, eccen_r, c, d);
 158     Vertex y = pseudo_peripheral_pair(G, x, eccen_x, c, d);
 159
 160     while (eccen_x &gt; eccen_r) {
 161       r = x;
 162       eccen_r = eccen_x;
 163       x = y;
 164       y = pseudo_peripheral_pair(G, x, eccen_x, c, d);
 165     }
 166     return x;
 167   }
 168 } // namespace boost
 169 </pre>
 170 </td></tr></table>
 171 <CAPTION ALIGN="BOTTOM">
 172 <table><tr><td>
 173 <STRONG>Figure 1:</STRONG>
 174 The BGL implementation of <TT>find_starting_node</TT>. The key
 175 part <TT>pseudo_peripheral_pair</TT> is BFS with a custom queue
 176  type virtually.
 177 </td></tr></table></CAPTION>
 178 </DIV>
 179 <P></P>
 180
 181 The key part of step one is a custom queue type with BFS as shown in
 182 <A
 183 HREF="#fig:ggcl_find_starting_vertex">Figure
 184 1</A>.  The main Cuthill-McKee algorithm shown in Figure&nbsp;<A
 185 HREF="#fig:cuthill-mckee">Figure 2</A>
 186 is a fenced priority queue with a generalized BFS. If
 187 <TT>inverse_permutation</TT> is a normal iterator, the result stored
 188 is the inverse permutation of original Cuthill-McKee ordering. On the
 189 other hand, if <TT>inverse_permutation</TT> is a reverse iterator, the
 190 result stored is the inverse permutation of reverse Cuthill-McKee
 191 ordering.  Our implementation is quite concise because many components
 192 from BGL can be reused.
 193
 194
 195 <P></P>
 196 <DIV ALIGN="CENTER"><A NAME="fig:cuthill-mckee"></A><A NAME="6261"></A>
 197 <table border><tr><td>
 198 <pre>
 199   template &lt; class Graph, class Vertex, class OutputIterator,
 200              class Color, class Degree &gt;
 201   inline void
 202   cuthill_mckee_ordering(Graph& G,
 203                          Vertex s,
 204                          OutputIterator inverse_permutation,
 205                          Color color, Degree degree)
 206   {
 207     typedef typename property_traits&lt;Degree&gt;::value_type DS;
 208     typename property_traits&lt;Color&gt;::value_type c = get(color, s);
 209     typedef indirect_cmp&lt;Degree, std::greater&lt;DS&gt; &gt; Compare;
 210     Compare comp(degree);
 211     fenced_priority_queue&lt;Vertex, Compare &gt; Q(comp);
 212
 213     typedef cuthill_mckee_visitor&lt;OutputIterator&gt;  CMVisitor;
 214     CMVisitor cm_visitor(inverse_permutation);
 215
 216     typename boost::graph_traits&lt;Graph&gt;::vertex_iterator ui, ui_end;
 217     for (boost::tie(ui, ui_end) = vertices(G); ui != ui_end; ++ui)
 218       put(color, *ui, white(c));
 219     breadth_first_search(G, s, Q, cm_visitor, color);
 220   }
 221 </pre>
 222 </td></tr></table>
 223 <CAPTION ALIGN="BOTTOM">
 224 <TABLE>
 225 <TR><TD> <STRONG>Figure 2:</STRONG> The BGL implementation of
 226 Cuthill-McKee algoithm.  </TD></TR> </TABLE></CAPTION> </DIV><P></P>
 227 <P>
 228
 229
 230 <P>
 231
 232 <H3>Minimum Degree Ordering Algorithm</H3>
 233
 234 <P>
 235 The pattern of another category of high-quality ordering algorithms in
 236 wide use is based on a greedy approach such that the ordering is
 237 chosen to minimize some quantity at each step of a simulated -step
 238 symmetric Gaussian elimination process.  The algorithms using such an
 239 approach are typically distinguished by their greedy minimization
 240 criteria&nbsp;[<a href="bibliography.html#ng-raghavan">34</a>].
 241
 242 <P>
 243 In graph terms, the basic ordering process used by most greedy
 244 algorithms is as follows:
 245
 246 <OL>
 247 <LI><EM>Start:</EM> Construct undirected graph <i>G<sup>0</sup></i>
 248  corresponding to matrix <i>A</i>
 249
 250 </LI>
 251 <LI><EM>Iterate:</EM> For <i>k = 1, 2, ... </i> until
 252   <i>G<sup>k</sup> = { }</i> do:
 253
 254 <UL>
 255 <LI>Choose a vertex <i>v<sup>k</sup></i> from <i>G<sup>k</sup></i>
 256       according to some criterion
 257 </LI>
 258 <LI>Eliminate <i>v<sup>k</sup></i> from <i>G<sup>k</sup></i> to form
 259   <i>G<sup>k+1</sup></i>
 260
 261 </LI>
 262 </UL>
 263 </LI>
 264 </OL>
 265 The resulting ordering is the sequence of vertices <i>{v<sup>0</sup>,
 266 v<sup>1</sup>,...}</i> selected by the algorithm.
 267
 268 <P>
 269 One of the most important examples of such an algorithm is the
 270 <I>Minimum Degree</I> algorithm.  At each step the minimum degree
 271 algorithm chooses the vertex with minimum degree in the corresponding
 272 graph as <i>v<sup>k</sup></i>.  A number of enhancements to the basic
 273 minimum degree algorithm have been developed, such as the use of a
 274 quotient graph representation, mass elimination, incomplete degree
 275 update, multiple elimination, and external degree.  See&nbsp;[<A
 276 href="bibliography.html#George:evolution">35</a>] for a historical
 277 survey of the minimum degree algorithm.
 278
 279 <P>
 280 The BGL implementation of the Minimum Degree algorithm closely follows
 281 the algorithmic descriptions of the one in &nbsp;[<A
 282 HREF="bibliography.html#LIU:MMD">21</A>]. The implementation presently
 283 includes the enhancements for mass elimination, incomplete degree
 284 update, multiple elimination, and external degree.
 285
 286 <P>
 287 In particular, we create a graph representation to improve the
 288 performance of the algorithm. It is based on a templated ``vector of
 289 vectors.''  The vector container used is an adaptor class built on top
 290 the STL <TT>std::vector</TT> class.  Particular characteristics of this
 291 adaptor class include the following:
 292
 293 <UL>
 294 <LI>Erasing elements does not shrink the associated memory.
 295  Adding new elements after erasing will not need to allocate
 296   additional memory.
 297 </LI>
 298 <LI>Additional memory is allocated efficiently on demand when new
 299   elements are added (doubling the capacity every time it is
 300   increased).  This property comes from STL vector.
 301 </LI>
 302 </UL>
 303
 304 <P>
 305 Note that this representation is similar to that used in Liu's
 306 implementation, with some important differences due to dynamic memory
 307 allocation.  With the dynamic memory allocation we do not need to
 308 over-write portions of the graph that have been eliminated,
 309 allowing for a more efficient graph traversal.  More importantly,
 310 information about the elimination graph is preserved allowing for
 311 trivial symbolic factorization.  Since symbolic factorization can be
 312 an expensive part of the entire solution process, improving its
 313 performance can result in significant computational savings.
 314
 315 <P>
 316 The overhead of dynamic memory allocation could conceivably compromise
 317 performance in some cases. However, in practice, memory allocation
 318 overhead does not contribute significantly to run-time for our
 319 implementation as shown in &nbsp;[] because it is not
 320 done very often and the cost gets amortized.
 321
 322 <P>
 323
 324 <H2>Proper Self-Avoiding Walk</H2>
 325
 326 <P>
 327 The finite element method (FEM) is a flexible and attractive numerical
 328 approach for solving partial differential equations since it is
 329 straightforward to handle geometrically complicated domains. However,
 330 unstructured meshes generated by FEM does not provide an obvious
 331 labeling (numbering) of the unknowns while it is vital to have it for
 332 matrix-vector notation of the underlying algebraic equations. Special
 333 numbering techniques have been developed to optimize memory usage and
 334 locality of such algorithms. One novel technique is the self-avoiding
 335 walk&nbsp;[].
 336
 337 <P>
 338 If we think a mesh is a graph, a self-avoiding walk(SAW) over an
 339 arbitrary unstructured two-dimensional mesh is an enumeration of all
 340 the triangles of the mesh such that two successive triangles shares an
 341 edge or a vertex. A proper SAW is a SAW where jumping twice over the
 342 same vertex is forbidden for three consecutive triangles in the walk.
 343 it can be used to improve parallel efficiency of several irregular
 344 algorithms, in particular issues related to reducing runtime memory
 345 access (improving locality) and interprocess communications. The
 346 reference&nbsp;[] has proved the existence
 347 of A PSAW for any arbitrary triangular mesh by extending an existing
 348 PSAW over a small piece of mesh to a larger part. The proof
 349 effectively provides a set of rules to construct a PSAW.
 350
 351 <P>
 352 The algorithm family of constructing a PSAW on a mesh is to start from
 353 any random triangle of the mesh, choose new triangles sharing an edge
 354 with the current sub-mesh and extend the existing partial PSAW over
 355 the new triangles.
 356
 357 <P>
 358 Let us define a dual graph of a mesh.  Let a triangle in the mesh be a
 359 vertex and two triangles sharing an edge in the mesh means there is an
 360 edge between two vertices in the dual graph. By using a dual graph of
 361 a mesh, one way to implement the algorithm family of constructing a
 362 PSAW is to reuse  BGL algorithms such as BFS and DFS with a customized
 363 visitor to provide operations during
 364 traversal. The customized visitor has the function <TT>tree_edge()</TT>
 365 which is to extend partial PSAW in case by case and function
 366 <TT>start_vertex()</TT> which is to set up the PSAW for the starting vertex.
 367
 368 <br>
 369 <HR>
 370 <TABLE>
 371 <TR valign=top>
 372 <TD nowrap>Copyright &copy; 2000-2001</TD><TD>
 373 <A HREF="http://www.boost.org/people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>)
 374 </TD></TR></TABLE>
 375
 376 </BODY>
 377 </HTML>