<HTML>
<!--
  Copyright (c) Jeremy Siek 2000

  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt)
  -->
<Head>
<Title>Sparse Matrix Ordering Example</Title>
</Head>
<BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b"
 ALINK="#ff0000">
<IMG SRC="../../../boost.png"
     ALT="C++ Boost" width="277" height="86">

<BR Clear>

<H1><A NAME="sec:sparse-matrix-ordering"></A>
Sparse Matrix Ordering
</H1>
<P>
Graph theory was identified as a powerful tool for sparse matrix
computation when Seymour Parter used undirected graphs to model
symmetric Gaussian elimination more than 30 years ago [<A HREF="bibliography.html#parter61:_gauss">28</A>].
Graphs can be used to model symmetric matrices, factorizations, and
algorithms on non-symmetric matrices, such as fill paths in Gaussian
elimination, strongly connected components in matrix irreducibility,
bipartite matching, and alternating paths in linear dependence and
structural singularity. Not only do graphs make it easier to
understand and analyze sparse matrix algorithms, but they also open
sparse matrices to manipulation with existing graph algorithms
and techniques [<A
HREF="bibliography.html#george93:graphtheory">13</A>]. In this section, we
illustrate how to use the BGL in sparse matrix computations such
as ordering algorithms. Before we go further into the sparse matrix
algorithms, let us take a step back and review a few things.

<P>
<H2>Graphs and Sparse Matrices</H2>

<P>
A graph is fundamentally a way to represent a binary relation between
objects. The nonzero pattern of a sparse matrix also describes a
binary relation between unknowns. Therefore the nonzero pattern of a
sparse matrix of a linear system can be modeled with a graph
<I>G(V,E)</I>, whose <I>n</I> vertices in <I>V</I> represent the
<I>n</I> unknowns, and where there is an edge from vertex <I>i</I> to
vertex <I>j</I> when <i>A<sub>ij</sub></i> is nonzero. Thus, when a
matrix has a symmetric nonzero pattern, the corresponding graph is
undirected.

<P>

<H2>Sparse Matrix Ordering Algorithms</H2>

<P>
The process for solving a sparse symmetric positive definite linear
system, <i>Ax=b</i>, can be divided into four stages as follows:
<DL>
<DT><STRONG>Ordering:</STRONG></DT>
<DD>Find a permutation <i>P</i> of matrix <i>A</i>,
</DD>
<DT><STRONG>Symbolic factorization:</STRONG></DT>
<DD>Set up a data structure for the Cholesky
factor <i>L</i> of <i>PAP<sup>T</sup></i>,
</DD>
<DT><STRONG>Numerical factorization:</STRONG></DT>
<DD>Decompose <i>PAP<sup>T</sup></i> into <i>LL<sup>T</sup></i>,
</DD>
<DT><STRONG>Triangular system solution:</STRONG></DT>
<DD>Solve <i>LL<sup>T</sup>Px=Pb</i> for <i>x</i>.
</DD>
</DL>
Because the choice of permutation <i>P</i> will directly determine the
number of fill-in elements (elements present in the non-zero structure
of <i>L</i> that are not present in the non-zero structure of
<i>A</i>), the ordering has a significant impact on the memory and
computational requirements of the later stages. However, finding
the optimal ordering for <i>A</i> (in the sense of minimizing fill-in)
has been proven to be NP-complete [<a
href="bibliography.html#garey79:computers-and-intractability">30</a>],
requiring that heuristics be used for all but simple (or specially
structured) cases.

<P>
A widely used but rather simple ordering algorithm is a variant of the
Cuthill-McKee ordering, the reverse Cuthill-McKee ordering algorithm.
This algorithm can be used as a preordering method to improve ordering
in more sophisticated methods such as minimum degree
algorithms [<a href="bibliography.html#LIU:MMD">21</a>].

<P>

<H3><A NAME="SECTION001032100000000000000">
Reverse Cuthill-McKee Ordering Algorithm</A>
</H3>
The original Cuthill-McKee ordering algorithm is primarily designed to
reduce the profile of a matrix [<A
HREF="bibliography.html#george81:__sparse_pos_def">14</A>].
In 1971, George discovered that the reverse ordering often turned out
to be superior to the original ordering. Here we describe the
reverse Cuthill-McKee algorithm in terms of a graph:

<OL>
<LI><I>Finding a starting vertex</I>: Determine a starting vertex
<I>r</I> and assign <i>r</i> to <I>x<sub>1</sub></I>.
</LI>
<LI><I>Main part</I>: For <I>i=1,...,N</I>, find all
the unnumbered neighbors of the vertex <I>x<sub>i</sub></I> and number them
in increasing order of degree.
</LI>
<LI><I>Reversing ordering</I>: The reverse Cuthill-McKee ordering
is given by <I>y<sub>1</sub>,...,y<sub>N</sub></I> where
<I>y<sub>i</sub></I> is <I>x<sub>N-i+1</sub></I> for <I>i = 1,...,N</I>.
</LI>
</OL>
In the first step a good starting vertex needs to be determined. The
study by George and Liu [<A
HREF="bibliography.html#george81:__sparse_pos_def">14</A>] showed that
a pair of vertices which are at maximum or near-maximum "distance"
apart are good ones. They also proposed an algorithm to find such a
starting vertex in [<A
HREF="bibliography.html#george81:__sparse_pos_def">14</A>], which we
show in <A
HREF="#fig:ggcl_find_starting_vertex">Figure
1</A> implemented using the BGL interface.

<P>
<P></P>
<DIV ALIGN="CENTER"><A NAME="fig:ggcl_find_starting_vertex"></A>
<table border><tr><td>
<pre>
namespace boost {
  template &lt;class Graph, class Vertex, class Color, class Degree&gt;
  Vertex
  pseudo_peripheral_pair(Graph&amp; G, const Vertex&amp; u, int&amp; ecc,
                         Color color, Degree degree)
  {
    typename property_traits&lt;Color&gt;::value_type c = get(color, u);

    rcm_queue&lt;Vertex, Degree&gt; Q(degree);

    typename boost::graph_traits&lt;Graph&gt;::vertex_iterator ui, ui_end;
    for (boost::tie(ui, ui_end) = vertices(G); ui != ui_end; ++ui)
      put(color, *ui, white(c));
    breadth_first_search(G, u, Q, bfs_visitor&lt;&gt;(), color);

    ecc = Q.eccentricity();
    return Q.spouse();
  }

  template &lt;class Graph, class Vertex, class Color, class Degree&gt;
  Vertex find_starting_node(Graph&amp; G, Vertex r, Color c, Degree d) {
    int eccen_r, eccen_x;
    Vertex x = pseudo_peripheral_pair(G, r, eccen_r, c, d);
    Vertex y = pseudo_peripheral_pair(G, x, eccen_x, c, d);

    while (eccen_x &gt; eccen_r) {
      r = x;
      eccen_r = eccen_x;
      x = y;
      y = pseudo_peripheral_pair(G, x, eccen_x, c, d);
    }
    return x;
  }
} // namespace boost
</pre>
</td></tr></table>
<CAPTION ALIGN="BOTTOM">
<table><tr><td>
<STRONG>Figure 1:</STRONG>
The BGL implementation of <TT>find_starting_node</TT>. The key
part, <TT>pseudo_peripheral_pair</TT>, is essentially a BFS with a
custom queue type.
</td></tr></table></CAPTION>
</DIV>
<P></P>

The key part of step one is a custom queue type used with BFS, as shown in
<A
HREF="#fig:ggcl_find_starting_vertex">Figure
1</A>. The main Cuthill-McKee algorithm, shown in <A
HREF="#fig:cuthill-mckee">Figure 2</A>,
uses a fenced priority queue with a generalized BFS. If
<TT>inverse_permutation</TT> is a normal iterator, the result stored
is the inverse permutation of the original Cuthill-McKee ordering. On the
other hand, if <TT>inverse_permutation</TT> is a reverse iterator, the
result stored is the inverse permutation of the reverse Cuthill-McKee
ordering. Our implementation is quite concise because many components
from the BGL can be reused.


<P></P>
<DIV ALIGN="CENTER"><A NAME="fig:cuthill-mckee"></A><A NAME="6261"></A>
<table border><tr><td>
<pre>
  template &lt;class Graph, class Vertex, class OutputIterator,
            class Color, class Degree&gt;
  inline void
  cuthill_mckee_ordering(Graph&amp; G,
                         Vertex s,
                         OutputIterator inverse_permutation,
                         Color color, Degree degree)
  {
    typedef typename property_traits&lt;Degree&gt;::value_type DS;
    typename property_traits&lt;Color&gt;::value_type c = get(color, s);
    typedef indirect_cmp&lt;Degree, std::greater&lt;DS&gt; &gt; Compare;
    Compare comp(degree);
    fenced_priority_queue&lt;Vertex, Compare&gt; Q(comp);

    typedef cuthill_mckee_visitor&lt;OutputIterator&gt; CMVisitor;
    CMVisitor cm_visitor(inverse_permutation);

    typename boost::graph_traits&lt;Graph&gt;::vertex_iterator ui, ui_end;
    for (boost::tie(ui, ui_end) = vertices(G); ui != ui_end; ++ui)
      put(color, *ui, white(c));
    breadth_first_search(G, s, Q, cm_visitor, color);
  }
</pre>
</td></tr></table>
<CAPTION ALIGN="BOTTOM">
<TABLE>
<TR><TD> <STRONG>Figure 2:</STRONG> The BGL implementation of the
Cuthill-McKee algorithm. </TD></TR> </TABLE></CAPTION> </DIV><P></P>
<P>


<P>

<H3>Minimum Degree Ordering Algorithm</H3>

<P>
Another category of high-quality ordering algorithms in wide use is
based on a greedy approach in which the ordering is
chosen to minimize some quantity at each step of a simulated
<i>n</i>-step symmetric Gaussian elimination process. The algorithms
using such an approach are typically distinguished by their greedy
minimization criteria [<a href="bibliography.html#ng-raghavan">34</a>].

<P>
In graph terms, the basic ordering process used by most greedy
algorithms is as follows:

<OL>
<LI><EM>Start:</EM> Construct undirected graph <i>G<sup>0</sup></i>
corresponding to matrix <i>A</i>

</LI>
<LI><EM>Iterate:</EM> For <i>k = 1, 2, ... </i> until
<i>G<sup>k</sup> = { }</i> do:

<UL>
<LI>Choose a vertex <i>v<sup>k</sup></i> from <i>G<sup>k</sup></i>
according to some criterion
</LI>
<LI>Eliminate <i>v<sup>k</sup></i> from <i>G<sup>k</sup></i> to form
<i>G<sup>k+1</sup></i>

</LI>
</UL>
</LI>
</OL>
The resulting ordering is the sequence of vertices <i>{v<sup>0</sup>,
v<sup>1</sup>,...}</i> selected by the algorithm.

<P>
One of the most important examples of such an algorithm is the
<I>Minimum Degree</I> algorithm. At each step the minimum degree
algorithm chooses the vertex with minimum degree in the corresponding
graph as <i>v<sup>k</sup></i>. A number of enhancements to the basic
minimum degree algorithm have been developed, such as the use of a
quotient graph representation, mass elimination, incomplete degree
update, multiple elimination, and external degree. See [<A
href="bibliography.html#George:evolution">35</a>] for a historical
survey of the minimum degree algorithm.

<P>
The BGL implementation of the Minimum Degree algorithm closely follows
the algorithmic descriptions in [<A
HREF="bibliography.html#LIU:MMD">21</A>]. The implementation presently
includes the enhancements for mass elimination, incomplete degree
update, multiple elimination, and external degree.

<P>
In particular, we create a graph representation to improve the
performance of the algorithm. It is based on a templated "vector of
vectors." The vector container used is an adaptor class built on top of
the STL <TT>std::vector</TT> class. Particular characteristics of this
adaptor class include the following:

<UL>
<LI>Erasing elements does not shrink the associated memory.
Adding new elements after erasing will not need to allocate
additional memory.
</LI>
<LI>Additional memory is allocated efficiently on demand when new
elements are added (doubling the capacity every time it is
increased). This property comes from the STL vector.
</LI>
</UL>

<P>
Note that this representation is similar to that used in Liu's
implementation, with some important differences due to dynamic memory
allocation. With dynamic memory allocation we do not need to
overwrite portions of the graph that have been eliminated,
allowing for a more efficient graph traversal. More importantly,
information about the elimination graph is preserved, allowing for
trivial symbolic factorization. Since symbolic factorization can be
an expensive part of the entire solution process, improving its
performance can result in significant computational savings.

<P>
The overhead of dynamic memory allocation could conceivably compromise
performance in some cases. However, in practice, memory allocation
overhead does not contribute significantly to the run-time of our
implementation as shown in [] because it is not
done very often and the cost gets amortized.

<P>

<H2>Proper Self-Avoiding Walk</H2>

<P>
The finite element method (FEM) is a flexible and attractive numerical
approach for solving partial differential equations, since it is
straightforward to handle geometrically complicated domains. However,
the unstructured meshes generated by FEM do not provide an obvious
labeling (numbering) of the unknowns, even though such a labeling is
vital for the matrix-vector notation of the underlying algebraic
equations. Special numbering techniques have been developed to
optimize memory usage and locality of such algorithms. One novel
technique is the self-avoiding walk [].

<P>
If we think of a mesh as a graph, a self-avoiding walk (SAW) over an
arbitrary unstructured two-dimensional mesh is an enumeration of all
the triangles of the mesh such that two successive triangles share an
edge or a vertex. A proper SAW (PSAW) is a SAW in which jumping twice over
the same vertex is forbidden for three consecutive triangles in the walk.
It can be used to improve the parallel efficiency of several irregular
algorithms, in particular for issues related to reducing runtime memory
access (improving locality) and interprocess communications. The
reference [] proved the existence
of a PSAW for any arbitrary triangular mesh by extending an existing
PSAW over a small piece of mesh to a larger part. The proof
effectively provides a set of rules to construct a PSAW.

<P>
The family of algorithms for constructing a PSAW on a mesh starts from
an arbitrary triangle of the mesh, chooses new triangles sharing an edge
with the current sub-mesh, and extends the existing partial PSAW over
the new triangles.

<P>
Let us define the dual graph of a mesh: each triangle in the mesh
becomes a vertex, and two triangles sharing an edge in the mesh give
rise to an edge between the corresponding vertices in the dual graph.
Using the dual graph of a mesh, one way to implement this family of
PSAW-construction algorithms is to reuse BGL algorithms such as BFS and
DFS with a customized visitor that provides operations during
traversal. The customized visitor has a function <TT>tree_edge()</TT>,
which extends the partial PSAW case by case, and a function
<TT>start_vertex()</TT>, which sets up the PSAW for the starting vertex.

<br>
<HR>
<TABLE>
<TR valign=top>
<TD nowrap>Copyright &copy; 2000-2001</TD><TD>
<A HREF="http://www.boost.org/people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>)
</TD></TR></TABLE>

</BODY>
</HTML>