]> git.proxmox.com Git - ceph.git/blob - ceph/src/boost/libs/graph/doc/file_dependency_example.html
bump version to 12.2.2-pve1
[ceph.git] / ceph / src / boost / libs / graph / doc / file_dependency_example.html
1 <HTML>
2 <!--
3 Copyright (c) Jeremy Siek 2000
4
5 Distributed under the Boost Software License, Version 1.0.
6 (See accompanying file LICENSE_1_0.txt or copy at
7 http://www.boost.org/LICENSE_1_0.txt)
8 -->
9 <Head>
10 <Title>File Dependency Example</Title>
11 <BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b"
12 ALINK="#ff0000">
13 <IMG SRC="../../../boost.png"
14 ALT="C++ Boost" width="277" height="86">
15
16 <BR Clear>
17
18
19 <H1><A NAME="sec:file-depend-eg"></A>
20 File Dependency Example
21 </H1>
22
23 <P>
24 One of the most common uses of the graph abstraction in computer
25 science is to track dependencies. An example of dependency tracking
26 that we deal with on a day to day basis is the compilation
27 dependencies for files in programs that we write. These dependencies
28 are used inside programs such as <TT>make</TT> or in an IDE such as
29 Visual C++ to minimize the number of files that must be recompiled
30 after some changes have been made.
31
32 <P>
33 <A HREF="#fig:file-dep">Figure 1</A> shows a graph that has a vertex
34 for each source file, object file, and library that is used in the
35 <TT>killerapp</TT> program. The edges in the graph show which files
36 are used in creating other files. The choice of which direction to
37 point the arrows is somewhat arbitrary. As long as we are consistent
38 in remembering that the arrows mean ``used by'' then things will work
39 out. The opposite direction would mean ``depends on''.
40
41 <P>
42
43 <P></P>
44 <DIV ALIGN="CENTER"><A NAME="fig:file-dep"></A>
45 <TABLE>
46 <CAPTION ALIGN="BOTTOM"><STRONG>Figure 1:</STRONG>
47 A graph representing file dependencies.</CAPTION>
48 <TR><TD><IMG SRC="./figs/file_dep.gif" width="331" height="351"></TD></TR>
49 </TABLE>
50 </DIV><P></P>
51
52 <P>
53 A compilation system such as <TT>make</TT> has to be able to answer a
54 number of questions:
55
56 <P>
57
58 <OL>
59 <LI>If we need to compile (or recompile) all of the files, what order
60 should that be done it?
61 </LI>
62 <LI>What files can be compiled in parallel?
63 </LI>
64 <LI>If a file is changed, which files must be recompiled?
65 </LI>
66 <LI>Are there any cycles in the dependencies? (which means the user
67 has made a mistake and an error should be emitted)
68 </LI>
69 </OL>
70
71 <P>
72 In the following examples we will formulate each of these questions in
73 terms of the dependency graph, and then find a graph algorithm to
74 provide the solution. The graph in <A HREF="#fig:file-dep">Figure
75 1</A> will be used in all of the following examples. The source code
76 for this example can be found in the file <a
77 href="../example/file_dependencies.cpp"><TT>examples/file_dependencies.cpp</TT></a>.
78
79 <P>
80
81 <H2>Graph Setup</H2>
82
83 <P>
84 Here we show the construction of the graph. First, these are the required
85 header files:
86
87 <P>
88 <PRE>
89 #include &lt;iostream&gt; // std::cout
90 #include &lt;utility&gt; // std::pair
91 #include &lt;boost/graph/graph_traits.hpp&gt;
92 #include &lt;boost/graph/adjacency_list.hpp&gt;
93 #include &lt;boost/graph/topological_sort.hpp&gt;
94 </PRE>
95
96 <P>
97 For simplicity we have
98 constructed the graph &quot;by-hand&quot;. A compilation system such
99 as <TT>make</TT> would instead parse a <TT>Makefile</TT> to get the
100 list of files and to set-up the dependencies. We use the
101 <TT>adjacency_list</TT> class to represent the graph. The
102 <TT>vecS</TT> selector means that a <TT>std::vector</TT> will be used
103 to represent each edge-list, which provides efficient traversal. The
104 <TT>bidirectionalS</TT> selector means we want a directed graph with access to both the edges outgoing from each vertex and the edges incoming to each vertex, and the
105 <TT>color_property</TT> attaches a color property to each vertex of the
106 graph. The color property will be used in several of the algorithms in
107 the following sections.
108
109 <P>
110 <PRE>
111 enum files_e { dax_h, yow_h, boz_h, zow_h, foo_cpp,
112 foo_o, bar_cpp, bar_o, libfoobar_a,
113 zig_cpp, zig_o, zag_cpp, zag_o,
114 libzigzag_a, killerapp, N };
115 const char* name[] = { "dax.h", "yow.h", "boz.h", "zow.h", "foo.cpp",
116 "foo.o", "bar.cpp", "bar.o", "libfoobar.a",
117 "zig.cpp", "zig.o", "zag.cpp", "zag.o",
118 "libzigzag.a", "killerapp" };
119
120 typedef std::pair&lt;int, int&gt; Edge;
121 Edge used_by[] = {
122 Edge(dax_h, foo_cpp), Edge(dax_h, bar_cpp), Edge(dax_h, yow_h),
123 Edge(yow_h, bar_cpp), Edge(yow_h, zag_cpp),
124 Edge(boz_h, bar_cpp), Edge(boz_h, zig_cpp), Edge(boz_h, zag_cpp),
125 Edge(zow_h, foo_cpp),
126 Edge(foo_cpp, foo_o),
127 Edge(foo_o, libfoobar_a),
128 Edge(bar_cpp, bar_o),
129 Edge(bar_o, libfoobar_a),
130 Edge(libfoobar_a, libzigzag_a),
131 Edge(zig_cpp, zig_o),
132 Edge(zig_o, libzigzag_a),
133 Edge(zag_cpp, zag_o),
134 Edge(zag_o, libzigzag_a),
135 Edge(libzigzag_a, killerapp)
136 };
137
138 using namespace boost;
139 typedef adjacency_list&lt;vecS, vecS, bidirectionalS,
140 property&lt;vertex_color_t, default_color_type&gt;
141 &gt; Graph;
142 Graph g(used_by, used_by + sizeof(used_by) / sizeof(Edge), N);
143 typedef graph_traits&lt;Graph&gt;::vertex_descriptor Vertex;
144 </PRE>
145
146 <P>
147
148 <H2>Compilation Order (All Files)</H2>
149
150 <P>
151 On the first invocation of <TT>make</TT> for a particular project, all
152 of the files must be compiled. Given the dependencies between the
153 various files, what is the correct order in which to compile and link
154 them? First we need to formulate this in terms of a graph. Finding a
155 compilation order is the same as ordering the vertices in the graph.
156 The constraint on the ordering is the file dependencies which we have
157 represented as edges. So if there is an edge <i>(u,v)</i> in the graph
158 then <i>v</i> better not come before <i>u</i> in the ordering. It
159 turns out that this kind of constrained ordering is called a
160 <I>topological sort</I>. Therefore, answering the question of
161 compilation order is as easy as calling the BGL algorithm <a
162 href="./topological_sort.html"><TT>topological_sort()</TT></a>. The
163 traditional form of the output for topological sort is a linked-list
164 of the sorted vertices. The BGL algorithm instead puts the sorted
165 vertices into any <a
166 href="http://www.sgi.com/tech/stl/OutputIterator.html">OutputIterator</a>,
167 which allows for much more flexibility. Here we use the
168 <TT>std::front_insert_iterator</TT> to create an output iterator that
169 inserts the vertices on the front of a linked list. Other possible
170 options are writing the output to a file or inserting into a different
171 STL or custom-made container.
172
173 <P>
174 <PRE>
175 typedef std::list&lt;Vertex&gt; MakeOrder;
176 MakeOrder make_order;
177 boost::topological_sort(g, std::front_inserter(make_order));
178
179 std::cout &lt;&lt; "make ordering: ";
180 for (MakeOrder::iterator i = make_order.begin();
181 i != make_order.end(); ++i)
182 std::cout &lt;&lt; name[*i] &lt;&lt; " ";
183 std::cout &lt;&lt; std::endl;
184 </PRE>
185 The output of this is:
186 <PRE>
187 make ordering: zow.h boz.h zig.cpp zig.o dax.h yow.h zag.cpp \
188 zag.o bar.cpp bar.o foo.cpp foo.o libfoobar.a libzigzag.a killerapp
189 </PRE>
190
191 <P>
192
193 <H2><A NAME="sec:parallel-compilation"></A>
194 Parallel Compilation
195 </H2>
196
197 <P>
198 Another question the compilation system might need to answer is: what
199 files can be compiled simultaneously? This would allow the system to
200 spawn threads and utilize multiple processors to speed up the build.
201 This question can also be put in a slightly different way: what is the
202 earliest time that a file can be built assuming that an unlimited
203 number of files can be built at the same time? The main criteria for
204 when a file can be built is that all of the files it depends on must
205 already be built. To simplify things for this example, we'll assume
206 that each file takes 1 time unit to build (even header files). For
207 parallel compilation, we can build all of the files corresponding to
208 vertices with no dependencies, e.g., those that have
209 an <i>in-degree</i> of 0, in the first step. For all other files, the
210 main observation for determining the ``time slot'' for a file is that
211 the time slot must be one more than the maximum time-slot of the files
212 it depends on.
213
214 <P>We start by creating a vector <code>time</code> that will store the
215 time step at which each file can be built. We initialize every value
216 with time step zero.</p>
217
218 <P>
219 <PRE>
220 std::vector&lt;int&gt; time(N, 0);
221 </PRE>
222
223 <p>Now, we want to visit the vertices against in topological order,
224 from those files that need to be built first until those that need
225 to be built last. However, instead of printing out the order
226 immediately, we will compute the time step in which each file should
227 be built based on the time steps of the files it depends on. We
228 only need to consider those files whose in-degree is greater than
229 zero.</p>
230
231 <pre>
232 for (i = make_order.begin(); i != make_order.end(); ++i) {
233 if (in_degree (*i, g) &gt; 0) {
234 Graph::in_edge_iterator j, j_end;
235 int maxdist = 0;
236 for (boost::tie(j, j_end) = in_edges(*i, g); j != j_end; ++j)
237 maxdist = std::max(time[source(*j, g)], maxdist);
238 time[*i]=maxdist+1;
239 }
240 }
241 </pre>
242
243 <P>
244 Last, we output the time-slot that we've calculated for each vertex.
245
246 <P>
247 <PRE>
248 std::cout &lt;&lt; "parallel make ordering, " &lt;&lt; std::endl
249 &lt;&lt; " vertices with same group number" &lt;&lt; std::endl
250 &lt;&lt; " can be made in parallel" &lt;&lt; std::endl &lt;&lt; std::endl;
251 for (boost::tie(i, iend) = vertices(g); i != iend; ++i)
252 std::cout &lt;&lt; "time_slot[" &lt;&lt; name[*i] &lt;&lt; "] = " &lt;&lt; time[*i] &lt;&lt; std::endl;
253 </PRE>
254 The output is:
255 <PRE>
256 parallel make ordering,
257 vertices with same group number
258 can be made in parallel
259 time_slot[dax.h] = 0
260 time_slot[yow.h] = 1
261 time_slot[boz.h] = 0
262 time_slot[zow.h] = 0
263 time_slot[foo.cpp] = 1
264 time_slot[foo.o] = 2
265 time_slot[bar.cpp] = 2
266 time_slot[bar.o] = 3
267 time_slot[libfoobar.a] = 4
268 time_slot[zig.cpp] = 1
269 time_slot[zig.o] = 2
270 time_slot[zag.cpp] = 2
271 time_slot[zag.o] = 3
272 time_slot[libzigzag.a] = 5
273 time_slot[killerapp] = 6
274 </PRE>
275
276 <P>
277
278 <H2><A NAME="SECTION001014000000000000000"></A>
279 <A NAME="sec:cycles"></A>
280 <BR>
281 Cyclic Dependencies
282 </H2>
283
284 <P>
285 Another question the compilation system needs to be able to answer is
286 whether there are any cycles in the dependencies. If there are cycles,
287 the system will need to report an error to the user so that the cycles
288 can be removed. One easy way to detect a cycle is to run a <a
289 href="./depth_first_search.html">depth-first search</a>, and if the
290 search runs into an already discovered vertex (of the current search
291 tree), then there is a cycle. The BGL graph search algorithms (which
292 includes
293 <TT>depth_first_search()</TT>) are all extensible via the
294 <I>visitor</I> mechanism. A visitor is similar to a function object,
295 but it has several methods instead of just the one
296 <TT>operator()</TT>. The visitor's methods are called at certain
297 points within the algorithm, thereby giving the user a way to extend
298 the functionality of the graph search algorithms. See Section <A
299 HREF="visitor_concepts.html">Visitor Concepts</A>
300 for a detailed description of visitors.
301
302 <P>
303 We will create a visitor class and fill in the <TT>back_edge()</TT>
304 method, which is the <a href="./DFSVisitor.html">DFSVisitor</a> method
305 that is called when DFS explores an edge to an already discovered
306 vertex. A call to this method indicates the existence of a
307 cycle. Inheriting from <TT>dfs_visitor&lt;&gt;</TT>
308 provides the visitor with empty versions of the other visitor methods.
309 Once our visitor is created, we can construct and object and pass it
310 to the BGL algorithm. Visitor objects are passed by value inside of
311 the BGL algorithms, so the <TT>has_cycle</TT> flag is stored by
312 reference in this visitor.
313
314 <P>
315 <PRE>
316 struct cycle_detector : public dfs_visitor&lt;&gt;
317 {
318 cycle_detector( bool&amp; has_cycle)
319 : _has_cycle(has_cycle) { }
320
321 template &lt;class Edge, class Graph&gt;
322 void back_edge(Edge, Graph&amp;) {
323 _has_cycle = true;
324 }
325 protected:
326 bool&amp; _has_cycle;
327 };
328 </PRE>
329
330 <P>
331 We can now invoke the BGL <TT>depth_first_search()</TT>
332 algorithm and pass in the cycle detector visitor.
333
334 <P>
335 <PRE>
336 bool has_cycle = false;
337 cycle_detector vis(has_cycle);
338 boost::depth_first_search(g, visitor(vis));
339 std::cout &lt;&lt; "The graph has a cycle? " &lt;&lt; has_cycle &lt;&lt; std::endl;
340 </PRE>
341
342 <P>
343 The graph in <A HREF="#fig:file-dep">Figure 1</A> (ignoring the dotted
344 line) did not have any cycles, but if we add a dependency between
345 <TT>bar.cpp</TT> and <TT>dax.h</TT> there there will be. Such a
346 dependency would be flagged as a user error.
347
348 <P>
349 <PRE>
350 add_edge(bar_cpp, dax_h, g);
351 </PRE>
352
353
354 <br>
355 <HR>
356 <TABLE>
357 <TR valign=top>
358 <TD nowrap>Copyright &copy; 2000-2001</TD><TD>
359 <A HREF="http://www.boost.org/people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>)
360 </TD></TR></TABLE>
361
362 </BODY>
363 </HTML>