]>
Commit | Line | Data |
---|---|---|
223e47cc LB |
1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
2 | "http://www.w3.org/TR/html4/strict.dtd"> | |
3 | ||
4 | <html> | |
5 | <head> | |
6 | <title>Kaleidoscope: Adding JIT and Optimizer Support</title> | |
7 | <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> | |
8 | <meta name="author" content="Chris Lattner"> | |
9 | <meta name="author" content="Erick Tryzelaar"> | |
10 | <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> | |
11 | </head> | |
12 | ||
13 | <body> | |
14 | ||
15 | <h1>Kaleidoscope: Adding JIT and Optimizer Support</h1> | |
16 | ||
17 | <ul> | |
18 | <li><a href="index.html">Up to Tutorial Index</a></li> | |
19 | <li>Chapter 4 | |
20 | <ol> | |
21 | <li><a href="#intro">Chapter 4 Introduction</a></li> | |
22 | <li><a href="#trivialconstfold">Trivial Constant Folding</a></li> | |
23 | <li><a href="#optimizerpasses">LLVM Optimization Passes</a></li> | |
24 | <li><a href="#jit">Adding a JIT Compiler</a></li> | |
25 | <li><a href="#code">Full Code Listing</a></li> | |
26 | </ol> | |
27 | </li> | |
28 | <li><a href="OCamlLangImpl5.html">Chapter 5</a>: Extending the Language: Control | |
29 | Flow</li> | |
30 | </ul> | |
31 | ||
32 | <div class="doc_author"> | |
33 | <p> | |
34 | Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> | |
35 | and <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a> | |
36 | </p> | |
37 | </div> | |
38 | ||
39 | <!-- *********************************************************************** --> | |
40 | <h2><a name="intro">Chapter 4 Introduction</a></h2> | |
41 | <!-- *********************************************************************** --> | |
42 | ||
43 | <div> | |
44 | ||
45 | <p>Welcome to Chapter 4 of the "<a href="index.html">Implementing a language | |
46 | with LLVM</a>" tutorial. Chapters 1-3 described the implementation of a simple | |
47 | language and added support for generating LLVM IR. This chapter describes | |
48 | two new techniques: adding optimizer support to your language, and adding JIT | |
49 | compiler support. These additions will demonstrate how to get nice, efficient code | |
50 | for the Kaleidoscope language.</p> | |
51 | ||
52 | </div> | |
53 | ||
54 | <!-- *********************************************************************** --> | |
55 | <h2><a name="trivialconstfold">Trivial Constant Folding</a></h2> | |
56 | <!-- *********************************************************************** --> | |
57 | ||
58 | <div> | |
59 | ||
60 | <p><b>Note:</b> the default <tt>IRBuilder</tt> now always includes the constant | |
61 | folding optimisations below.<p> | |
62 | ||
63 | <p> | |
64 | Our demonstration for Chapter 3 is elegant and easy to extend. Unfortunately, | |
65 | it does not produce wonderful code. For example, when compiling simple code, | |
66 | we don't get obvious optimizations:</p> | |
67 | ||
68 | <div class="doc_code"> | |
69 | <pre> | |
70 | ready> <b>def test(x) 1+2+x;</b> | |
71 | Read function definition: | |
72 | define double @test(double %x) { | |
73 | entry: | |
74 | %addtmp = fadd double 1.000000e+00, 2.000000e+00 | |
75 | %addtmp1 = fadd double %addtmp, %x | |
76 | ret double %addtmp1 | |
77 | } | |
78 | </pre> | |
79 | </div> | |
80 | ||
81 | <p>This code is a very, very literal transcription of the AST built by parsing | |
82 | the input. As such, this transcription lacks optimizations like constant folding | |
83 | (we'd like to get "<tt>add x, 3.0</tt>" in the example above) as well as other | |
84 | more important optimizations. Constant folding, in particular, is a very common | |
85 | and very important optimization: so much so that many language implementors | |
86 | implement constant folding support in their AST representation.</p> | |
87 | ||
88 | <p>With LLVM, you don't need this support in the AST. Since all calls to build | |
89 | LLVM IR go through the LLVM builder, it would be nice if the builder itself | |
90 | checked to see if there was a constant folding opportunity when you call it. | |
91 | If so, it could just do the constant fold and return the constant instead of | |
92 | creating an instruction. This is exactly what the <tt>LLVMFoldingBuilder</tt> | |
93 | class does. | |
94 | ||
95 | <p>All we did was switch from <tt>LLVMBuilder</tt> to | |
96 | <tt>LLVMFoldingBuilder</tt>. Though we change no other code, we now have all of our | |
97 | instructions implicitly constant folded without us having to do anything | |
98 | about it. For example, the input above now compiles to:</p> | |
99 | ||
100 | <div class="doc_code"> | |
101 | <pre> | |
102 | ready> <b>def test(x) 1+2+x;</b> | |
103 | Read function definition: | |
104 | define double @test(double %x) { | |
105 | entry: | |
106 | %addtmp = fadd double 3.000000e+00, %x | |
107 | ret double %addtmp | |
108 | } | |
109 | </pre> | |
110 | </div> | |
111 | ||
112 | <p>Well, that was easy :). In practice, we recommend always using | |
113 | <tt>LLVMFoldingBuilder</tt> when generating code like this. It has no | |
114 | "syntactic overhead" for its use (you don't have to uglify your compiler with | |
115 | constant checks everywhere) and it can dramatically reduce the amount of | |
116 | LLVM IR that is generated in some cases (particular for languages with a macro | |
117 | preprocessor or that use a lot of constants).</p> | |
118 | ||
119 | <p>On the other hand, the <tt>LLVMFoldingBuilder</tt> is limited by the fact | |
120 | that it does all of its analysis inline with the code as it is built. If you | |
121 | take a slightly more complex example:</p> | |
122 | ||
123 | <div class="doc_code"> | |
124 | <pre> | |
125 | ready> <b>def test(x) (1+2+x)*(x+(1+2));</b> | |
126 | ready> Read function definition: | |
127 | define double @test(double %x) { | |
128 | entry: | |
129 | %addtmp = fadd double 3.000000e+00, %x | |
130 | %addtmp1 = fadd double %x, 3.000000e+00 | |
131 | %multmp = fmul double %addtmp, %addtmp1 | |
132 | ret double %multmp | |
133 | } | |
134 | </pre> | |
135 | </div> | |
136 | ||
137 | <p>In this case, the LHS and RHS of the multiplication are the same value. We'd | |
138 | really like to see this generate "<tt>tmp = x+3; result = tmp*tmp;</tt>" instead | |
139 | of computing "<tt>x*3</tt>" twice.</p> | |
140 | ||
141 | <p>Unfortunately, no amount of local analysis will be able to detect and correct | |
142 | this. This requires two transformations: reassociation of expressions (to | |
143 | make the add's lexically identical) and Common Subexpression Elimination (CSE) | |
144 | to delete the redundant add instruction. Fortunately, LLVM provides a broad | |
145 | range of optimizations that you can use, in the form of "passes".</p> | |
146 | ||
147 | </div> | |
148 | ||
149 | <!-- *********************************************************************** --> | |
150 | <h2><a name="optimizerpasses">LLVM Optimization Passes</a></h2> | |
151 | <!-- *********************************************************************** --> | |
152 | ||
153 | <div> | |
154 | ||
155 | <p>LLVM provides many optimization passes, which do many different sorts of | |
156 | things and have different tradeoffs. Unlike other systems, LLVM doesn't hold | |
157 | to the mistaken notion that one set of optimizations is right for all languages | |
158 | and for all situations. LLVM allows a compiler implementor to make complete | |
159 | decisions about what optimizations to use, in which order, and in what | |
160 | situation.</p> | |
161 | ||
162 | <p>As a concrete example, LLVM supports both "whole module" passes, which look | |
163 | across as large of body of code as they can (often a whole file, but if run | |
164 | at link time, this can be a substantial portion of the whole program). It also | |
165 | supports and includes "per-function" passes which just operate on a single | |
166 | function at a time, without looking at other functions. For more information | |
167 | on passes and how they are run, see the <a href="../WritingAnLLVMPass.html">How | |
168 | to Write a Pass</a> document and the <a href="../Passes.html">List of LLVM | |
169 | Passes</a>.</p> | |
170 | ||
171 | <p>For Kaleidoscope, we are currently generating functions on the fly, one at | |
172 | a time, as the user types them in. We aren't shooting for the ultimate | |
173 | optimization experience in this setting, but we also want to catch the easy and | |
174 | quick stuff where possible. As such, we will choose to run a few per-function | |
175 | optimizations as the user types the function in. If we wanted to make a "static | |
176 | Kaleidoscope compiler", we would use exactly the code we have now, except that | |
177 | we would defer running the optimizer until the entire file has been parsed.</p> | |
178 | ||
179 | <p>In order to get per-function optimizations going, we need to set up a | |
180 | <a href="../WritingAnLLVMPass.html#passmanager">Llvm.PassManager</a> to hold and | |
181 | organize the LLVM optimizations that we want to run. Once we have that, we can | |
182 | add a set of optimizations to run. The code looks like this:</p> | |
183 | ||
184 | <div class="doc_code"> | |
185 | <pre> | |
186 | (* Create the JIT. *) | |
187 | let the_execution_engine = ExecutionEngine.create Codegen.the_module in | |
188 | let the_fpm = PassManager.create_function Codegen.the_module in | |
189 | ||
190 | (* Set up the optimizer pipeline. Start with registering info about how the | |
191 | * target lays out data structures. *) | |
192 | TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm; | |
193 | ||
194 | (* Do simple "peephole" optimizations and bit-twiddling optzn. *) | |
195 | add_instruction_combining the_fpm; | |
196 | ||
197 | (* reassociate expressions. *) | |
198 | add_reassociation the_fpm; | |
199 | ||
200 | (* Eliminate Common SubExpressions. *) | |
201 | add_gvn the_fpm; | |
202 | ||
203 | (* Simplify the control flow graph (deleting unreachable blocks, etc). *) | |
204 | add_cfg_simplification the_fpm; | |
205 | ||
206 | ignore (PassManager.initialize the_fpm); | |
207 | ||
208 | (* Run the main "interpreter loop" now. *) | |
209 | Toplevel.main_loop the_fpm the_execution_engine stream; | |
210 | </pre> | |
211 | </div> | |
212 | ||
213 | <p>The meat of the matter here, is the definition of "<tt>the_fpm</tt>". It | |
214 | requires a pointer to the <tt>the_module</tt> to construct itself. Once it is | |
215 | set up, we use a series of "add" calls to add a bunch of LLVM passes. The | |
216 | first pass is basically boilerplate, it adds a pass so that later optimizations | |
217 | know how the data structures in the program are laid out. The | |
218 | "<tt>the_execution_engine</tt>" variable is related to the JIT, which we will | |
219 | get to in the next section.</p> | |
220 | ||
221 | <p>In this case, we choose to add 4 optimization passes. The passes we chose | |
222 | here are a pretty standard set of "cleanup" optimizations that are useful for | |
223 | a wide variety of code. I won't delve into what they do but, believe me, | |
224 | they are a good starting place :).</p> | |
225 | ||
226 | <p>Once the <tt>Llvm.PassManager.</tt> is set up, we need to make use of it. | |
227 | We do this by running it after our newly created function is constructed (in | |
228 | <tt>Codegen.codegen_func</tt>), but before it is returned to the client:</p> | |
229 | ||
230 | <div class="doc_code"> | |
231 | <pre> | |
232 | let codegen_func the_fpm = function | |
233 | ... | |
234 | try | |
235 | let ret_val = codegen_expr body in | |
236 | ||
237 | (* Finish off the function. *) | |
238 | let _ = build_ret ret_val builder in | |
239 | ||
240 | (* Validate the generated code, checking for consistency. *) | |
241 | Llvm_analysis.assert_valid_function the_function; | |
242 | ||
243 | (* Optimize the function. *) | |
244 | let _ = PassManager.run_function the_function the_fpm in | |
245 | ||
246 | the_function | |
247 | </pre> | |
248 | </div> | |
249 | ||
250 | <p>As you can see, this is pretty straightforward. The <tt>the_fpm</tt> | |
251 | optimizes and updates the LLVM Function* in place, improving (hopefully) its | |
252 | body. With this in place, we can try our test above again:</p> | |
253 | ||
254 | <div class="doc_code"> | |
255 | <pre> | |
256 | ready> <b>def test(x) (1+2+x)*(x+(1+2));</b> | |
257 | ready> Read function definition: | |
258 | define double @test(double %x) { | |
259 | entry: | |
260 | %addtmp = fadd double %x, 3.000000e+00 | |
261 | %multmp = fmul double %addtmp, %addtmp | |
262 | ret double %multmp | |
263 | } | |
264 | </pre> | |
265 | </div> | |
266 | ||
267 | <p>As expected, we now get our nicely optimized code, saving a floating point | |
268 | add instruction from every execution of this function.</p> | |
269 | ||
270 | <p>LLVM provides a wide variety of optimizations that can be used in certain | |
271 | circumstances. Some <a href="../Passes.html">documentation about the various | |
272 | passes</a> is available, but it isn't very complete. Another good source of | |
273 | ideas can come from looking at the passes that <tt>Clang</tt> runs to get | |
274 | started. The "<tt>opt</tt>" tool allows you to experiment with passes from the | |
275 | command line, so you can see if they do anything.</p> | |
276 | ||
277 | <p>Now that we have reasonable code coming out of our front-end, lets talk about | |
278 | executing it!</p> | |
279 | ||
280 | </div> | |
281 | ||
282 | <!-- *********************************************************************** --> | |
283 | <h2><a name="jit">Adding a JIT Compiler</a></h2> | |
284 | <!-- *********************************************************************** --> | |
285 | ||
286 | <div> | |
287 | ||
288 | <p>Code that is available in LLVM IR can have a wide variety of tools | |
289 | applied to it. For example, you can run optimizations on it (as we did above), | |
290 | you can dump it out in textual or binary forms, you can compile the code to an | |
291 | assembly file (.s) for some target, or you can JIT compile it. The nice thing | |
292 | about the LLVM IR representation is that it is the "common currency" between | |
293 | many different parts of the compiler. | |
294 | </p> | |
295 | ||
296 | <p>In this section, we'll add JIT compiler support to our interpreter. The | |
297 | basic idea that we want for Kaleidoscope is to have the user enter function | |
298 | bodies as they do now, but immediately evaluate the top-level expressions they | |
299 | type in. For example, if they type in "1 + 2;", we should evaluate and print | |
300 | out 3. If they define a function, they should be able to call it from the | |
301 | command line.</p> | |
302 | ||
303 | <p>In order to do this, we first declare and initialize the JIT. This is done | |
304 | by adding a global variable and a call in <tt>main</tt>:</p> | |
305 | ||
306 | <div class="doc_code"> | |
307 | <pre> | |
308 | ... | |
309 | let main () = | |
310 | ... | |
311 | <b>(* Create the JIT. *) | |
312 | let the_execution_engine = ExecutionEngine.create Codegen.the_module in</b> | |
313 | ... | |
314 | </pre> | |
315 | </div> | |
316 | ||
317 | <p>This creates an abstract "Execution Engine" which can be either a JIT | |
318 | compiler or the LLVM interpreter. LLVM will automatically pick a JIT compiler | |
319 | for you if one is available for your platform, otherwise it will fall back to | |
320 | the interpreter.</p> | |
321 | ||
322 | <p>Once the <tt>Llvm_executionengine.ExecutionEngine.t</tt> is created, the JIT | |
323 | is ready to be used. There are a variety of APIs that are useful, but the | |
324 | simplest one is the "<tt>Llvm_executionengine.ExecutionEngine.run_function</tt>" | |
325 | function. This method JIT compiles the specified LLVM Function and returns a | |
326 | function pointer to the generated machine code. In our case, this means that we | |
327 | can change the code that parses a top-level expression to look like this:</p> | |
328 | ||
329 | <div class="doc_code"> | |
330 | <pre> | |
331 | (* Evaluate a top-level expression into an anonymous function. *) | |
332 | let e = Parser.parse_toplevel stream in | |
333 | print_endline "parsed a top-level expr"; | |
334 | let the_function = Codegen.codegen_func the_fpm e in | |
335 | dump_value the_function; | |
336 | ||
337 | (* JIT the function, returning a function pointer. *) | |
338 | let result = ExecutionEngine.run_function the_function [||] | |
339 | the_execution_engine in | |
340 | ||
341 | print_string "Evaluated to "; | |
342 | print_float (GenericValue.as_float Codegen.double_type result); | |
343 | print_newline (); | |
344 | </pre> | |
345 | </div> | |
346 | ||
347 | <p>Recall that we compile top-level expressions into a self-contained LLVM | |
348 | function that takes no arguments and returns the computed double. Because the | |
349 | LLVM JIT compiler matches the native platform ABI, this means that you can just | |
350 | cast the result pointer to a function pointer of that type and call it directly. | |
351 | This means, there is no difference between JIT compiled code and native machine | |
352 | code that is statically linked into your application.</p> | |
353 | ||
354 | <p>With just these two changes, lets see how Kaleidoscope works now!</p> | |
355 | ||
356 | <div class="doc_code"> | |
357 | <pre> | |
358 | ready> <b>4+5;</b> | |
359 | define double @""() { | |
360 | entry: | |
361 | ret double 9.000000e+00 | |
362 | } | |
363 | ||
364 | <em>Evaluated to 9.000000</em> | |
365 | </pre> | |
366 | </div> | |
367 | ||
368 | <p>Well this looks like it is basically working. The dump of the function | |
369 | shows the "no argument function that always returns double" that we synthesize | |
370 | for each top level expression that is typed in. This demonstrates very basic | |
371 | functionality, but can we do more?</p> | |
372 | ||
373 | <div class="doc_code"> | |
374 | <pre> | |
375 | ready> <b>def testfunc(x y) x + y*2; </b> | |
376 | Read function definition: | |
377 | define double @testfunc(double %x, double %y) { | |
378 | entry: | |
379 | %multmp = fmul double %y, 2.000000e+00 | |
380 | %addtmp = fadd double %multmp, %x | |
381 | ret double %addtmp | |
382 | } | |
383 | ||
384 | ready> <b>testfunc(4, 10);</b> | |
385 | define double @""() { | |
386 | entry: | |
387 | %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01) | |
388 | ret double %calltmp | |
389 | } | |
390 | ||
391 | <em>Evaluated to 24.000000</em> | |
392 | </pre> | |
393 | </div> | |
394 | ||
395 | <p>This illustrates that we can now call user code, but there is something a bit | |
396 | subtle going on here. Note that we only invoke the JIT on the anonymous | |
397 | functions that <em>call testfunc</em>, but we never invoked it | |
398 | on <em>testfunc</em> itself. What actually happened here is that the JIT | |
399 | scanned for all non-JIT'd functions transitively called from the anonymous | |
400 | function and compiled all of them before returning | |
401 | from <tt>run_function</tt>.</p> | |
402 | ||
403 | <p>The JIT provides a number of other more advanced interfaces for things like | |
404 | freeing allocated machine code, rejit'ing functions to update them, etc. | |
405 | However, even with this simple code, we get some surprisingly powerful | |
406 | capabilities - check this out (I removed the dump of the anonymous functions, | |
407 | you should get the idea by now :) :</p> | |
408 | ||
409 | <div class="doc_code"> | |
410 | <pre> | |
411 | ready> <b>extern sin(x);</b> | |
412 | Read extern: | |
413 | declare double @sin(double) | |
414 | ||
415 | ready> <b>extern cos(x);</b> | |
416 | Read extern: | |
417 | declare double @cos(double) | |
418 | ||
419 | ready> <b>sin(1.0);</b> | |
420 | <em>Evaluated to 0.841471</em> | |
421 | ||
422 | ready> <b>def foo(x) sin(x)*sin(x) + cos(x)*cos(x);</b> | |
423 | Read function definition: | |
424 | define double @foo(double %x) { | |
425 | entry: | |
426 | %calltmp = call double @sin(double %x) | |
427 | %multmp = fmul double %calltmp, %calltmp | |
428 | %calltmp2 = call double @cos(double %x) | |
429 | %multmp4 = fmul double %calltmp2, %calltmp2 | |
430 | %addtmp = fadd double %multmp, %multmp4 | |
431 | ret double %addtmp | |
432 | } | |
433 | ||
434 | ready> <b>foo(4.0);</b> | |
435 | <em>Evaluated to 1.000000</em> | |
436 | </pre> | |
437 | </div> | |
438 | ||
439 | <p>Whoa, how does the JIT know about sin and cos? The answer is surprisingly | |
440 | simple: in this example, the JIT started execution of a function and got to a | |
441 | function call. It realized that the function was not yet JIT compiled and | |
442 | invoked the standard set of routines to resolve the function. In this case, | |
443 | there is no body defined for the function, so the JIT ended up calling | |
444 | "<tt>dlsym("sin")</tt>" on the Kaleidoscope process itself. Since | |
445 | "<tt>sin</tt>" is defined within the JIT's address space, it simply patches up | |
446 | calls in the module to call the libm version of <tt>sin</tt> directly.</p> | |
447 | ||
448 | <p>The LLVM JIT provides a number of interfaces (look in the | |
449 | <tt>llvm_executionengine.mli</tt> file) for controlling how unknown functions | |
450 | get resolved. It allows you to establish explicit mappings between IR objects | |
451 | and addresses (useful for LLVM global variables that you want to map to static | |
452 | tables, for example), allows you to dynamically decide on the fly based on the | |
453 | function name, and even allows you to have the JIT compile functions lazily the | |
454 | first time they're called.</p> | |
455 | ||
456 | <p>One interesting application of this is that we can now extend the language | |
457 | by writing arbitrary C code to implement operations. For example, if we add: | |
458 | </p> | |
459 | ||
460 | <div class="doc_code"> | |
461 | <pre> | |
462 | /* putchard - putchar that takes a double and returns 0. */ | |
463 | extern "C" | |
464 | double putchard(double X) { | |
465 | putchar((char)X); | |
466 | return 0; | |
467 | } | |
468 | </pre> | |
469 | </div> | |
470 | ||
471 | <p>Now we can produce simple output to the console by using things like: | |
472 | "<tt>extern putchard(x); putchard(120);</tt>", which prints a lowercase 'x' on | |
473 | the console (120 is the ASCII code for 'x'). Similar code could be used to | |
474 | implement file I/O, console input, and many other capabilities in | |
475 | Kaleidoscope.</p> | |
476 | ||
477 | <p>This completes the JIT and optimizer chapter of the Kaleidoscope tutorial. At | |
478 | this point, we can compile a non-Turing-complete programming language, optimize | |
479 | and JIT compile it in a user-driven way. Next up we'll look into <a | |
480 | href="OCamlLangImpl5.html">extending the language with control flow | |
481 | constructs</a>, tackling some interesting LLVM IR issues along the way.</p> | |
482 | ||
483 | </div> | |
484 | ||
485 | <!-- *********************************************************************** --> | |
486 | <h2><a name="code">Full Code Listing</a></h2> | |
487 | <!-- *********************************************************************** --> | |
488 | ||
489 | <div> | |
490 | ||
491 | <p> | |
492 | Here is the complete code listing for our running example, enhanced with the | |
493 | LLVM JIT and optimizer. To build this example, use: | |
494 | </p> | |
495 | ||
496 | <div class="doc_code"> | |
497 | <pre> | |
498 | # Compile | |
499 | ocamlbuild toy.byte | |
500 | # Run | |
501 | ./toy.byte | |
502 | </pre> | |
503 | </div> | |
504 | ||
505 | <p>Here is the code:</p> | |
506 | ||
507 | <dl> | |
508 | <dt>_tags:</dt> | |
509 | <dd class="doc_code"> | |
510 | <pre> | |
511 | <{lexer,parser}.ml>: use_camlp4, pp(camlp4of) | |
512 | <*.{byte,native}>: g++, use_llvm, use_llvm_analysis | |
513 | <*.{byte,native}>: use_llvm_executionengine, use_llvm_target | |
514 | <*.{byte,native}>: use_llvm_scalar_opts, use_bindings | |
515 | </pre> | |
516 | </dd> | |
517 | ||
518 | <dt>myocamlbuild.ml:</dt> | |
519 | <dd class="doc_code"> | |
520 | <pre> | |
521 | open Ocamlbuild_plugin;; | |
522 | ||
523 | ocaml_lib ~extern:true "llvm";; | |
524 | ocaml_lib ~extern:true "llvm_analysis";; | |
525 | ocaml_lib ~extern:true "llvm_executionengine";; | |
526 | ocaml_lib ~extern:true "llvm_target";; | |
527 | ocaml_lib ~extern:true "llvm_scalar_opts";; | |
528 | ||
529 | flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);; | |
530 | dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];; | |
531 | </pre> | |
532 | </dd> | |
533 | ||
534 | <dt>token.ml:</dt> | |
535 | <dd class="doc_code"> | |
536 | <pre> | |
537 | (*===----------------------------------------------------------------------=== | |
538 | * Lexer Tokens | |
539 | *===----------------------------------------------------------------------===*) | |
540 | ||
541 | (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of | |
542 | * these others for known things. *) | |
543 | type token = | |
544 | (* commands *) | |
545 | | Def | Extern | |
546 | ||
547 | (* primary *) | |
548 | | Ident of string | Number of float | |
549 | ||
550 | (* unknown *) | |
551 | | Kwd of char | |
552 | </pre> | |
553 | </dd> | |
554 | ||
555 | <dt>lexer.ml:</dt> | |
556 | <dd class="doc_code"> | |
557 | <pre> | |
558 | (*===----------------------------------------------------------------------=== | |
559 | * Lexer | |
560 | *===----------------------------------------------------------------------===*) | |
561 | ||
562 | let rec lex = parser | |
563 | (* Skip any whitespace. *) | |
564 | | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream | |
565 | ||
566 | (* identifier: [a-zA-Z][a-zA-Z0-9] *) | |
567 | | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] -> | |
568 | let buffer = Buffer.create 1 in | |
569 | Buffer.add_char buffer c; | |
570 | lex_ident buffer stream | |
571 | ||
572 | (* number: [0-9.]+ *) | |
573 | | [< ' ('0' .. '9' as c); stream >] -> | |
574 | let buffer = Buffer.create 1 in | |
575 | Buffer.add_char buffer c; | |
576 | lex_number buffer stream | |
577 | ||
578 | (* Comment until end of line. *) | |
579 | | [< ' ('#'); stream >] -> | |
580 | lex_comment stream | |
581 | ||
582 | (* Otherwise, just return the character as its ascii value. *) | |
583 | | [< 'c; stream >] -> | |
584 | [< 'Token.Kwd c; lex stream >] | |
585 | ||
586 | (* end of stream. *) | |
587 | | [< >] -> [< >] | |
588 | ||
589 | and lex_number buffer = parser | |
590 | | [< ' ('0' .. '9' | '.' as c); stream >] -> | |
591 | Buffer.add_char buffer c; | |
592 | lex_number buffer stream | |
593 | | [< stream=lex >] -> | |
594 | [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >] | |
595 | ||
596 | and lex_ident buffer = parser | |
597 | | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] -> | |
598 | Buffer.add_char buffer c; | |
599 | lex_ident buffer stream | |
600 | | [< stream=lex >] -> | |
601 | match Buffer.contents buffer with | |
602 | | "def" -> [< 'Token.Def; stream >] | |
603 | | "extern" -> [< 'Token.Extern; stream >] | |
604 | | id -> [< 'Token.Ident id; stream >] | |
605 | ||
606 | and lex_comment = parser | |
607 | | [< ' ('\n'); stream=lex >] -> stream | |
608 | | [< 'c; e=lex_comment >] -> e | |
609 | | [< >] -> [< >] | |
610 | </pre> | |
611 | </dd> | |
612 | ||
613 | <dt>ast.ml:</dt> | |
614 | <dd class="doc_code"> | |
615 | <pre> | |
616 | (*===----------------------------------------------------------------------=== | |
617 | * Abstract Syntax Tree (aka Parse Tree) | |
618 | *===----------------------------------------------------------------------===*) | |
619 | ||
620 | (* expr - Base type for all expression nodes. *) | |
621 | type expr = | |
622 | (* variant for numeric literals like "1.0". *) | |
623 | | Number of float | |
624 | ||
625 | (* variant for referencing a variable, like "a". *) | |
626 | | Variable of string | |
627 | ||
628 | (* variant for a binary operator. *) | |
629 | | Binary of char * expr * expr | |
630 | ||
631 | (* variant for function calls. *) | |
632 | | Call of string * expr array | |
633 | ||
634 | (* proto - This type represents the "prototype" for a function, which captures | |
635 | * its name, and its argument names (thus implicitly the number of arguments the | |
636 | * function takes). *) | |
637 | type proto = Prototype of string * string array | |
638 | ||
639 | (* func - This type represents a function definition itself. *) | |
640 | type func = Function of proto * expr | |
641 | </pre> | |
642 | </dd> | |
643 | ||
644 | <dt>parser.ml:</dt> | |
645 | <dd class="doc_code"> | |
646 | <pre> | |
647 | (*===---------------------------------------------------------------------=== | |
648 | * Parser | |
649 | *===---------------------------------------------------------------------===*) | |
650 | ||
651 | (* binop_precedence - This holds the precedence for each binary operator that is | |
652 | * defined *) | |
653 | let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10 | |
654 | ||
655 | (* precedence - Get the precedence of the pending binary operator token. *) | |
656 | let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1 | |
657 | ||
658 | (* primary | |
659 | * ::= identifier | |
660 | * ::= numberexpr | |
661 | * ::= parenexpr *) | |
662 | let rec parse_primary = parser | |
663 | (* numberexpr ::= number *) | |
664 | | [< 'Token.Number n >] -> Ast.Number n | |
665 | ||
666 | (* parenexpr ::= '(' expression ')' *) | |
667 | | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e | |
668 | ||
669 | (* identifierexpr | |
670 | * ::= identifier | |
671 | * ::= identifier '(' argumentexpr ')' *) | |
672 | | [< 'Token.Ident id; stream >] -> | |
673 | let rec parse_args accumulator = parser | |
674 | | [< e=parse_expr; stream >] -> | |
675 | begin parser | |
676 | | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e | |
677 | | [< >] -> e :: accumulator | |
678 | end stream | |
679 | | [< >] -> accumulator | |
680 | in | |
681 | let rec parse_ident id = parser | |
682 | (* Call. *) | |
683 | | [< 'Token.Kwd '('; | |
684 | args=parse_args []; | |
685 | 'Token.Kwd ')' ?? "expected ')'">] -> | |
686 | Ast.Call (id, Array.of_list (List.rev args)) | |
687 | ||
688 | (* Simple variable ref. *) | |
689 | | [< >] -> Ast.Variable id | |
690 | in | |
691 | parse_ident id stream | |
692 | ||
693 | | [< >] -> raise (Stream.Error "unknown token when expecting an expression.") | |
694 | ||
695 | (* binoprhs | |
696 | * ::= ('+' primary)* *) | |
697 | and parse_bin_rhs expr_prec lhs stream = | |
698 | match Stream.peek stream with | |
699 | (* If this is a binop, find its precedence. *) | |
700 | | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c -> | |
701 | let token_prec = precedence c in | |
702 | ||
703 | (* If this is a binop that binds at least as tightly as the current binop, | |
704 | * consume it, otherwise we are done. *) | |
705 | if token_prec < expr_prec then lhs else begin | |
706 | (* Eat the binop. *) | |
707 | Stream.junk stream; | |
708 | ||
709 | (* Parse the primary expression after the binary operator. *) | |
710 | let rhs = parse_primary stream in | |
711 | ||
712 | (* Okay, we know this is a binop. *) | |
713 | let rhs = | |
714 | match Stream.peek stream with | |
715 | | Some (Token.Kwd c2) -> | |
716 | (* If BinOp binds less tightly with rhs than the operator after | |
717 | * rhs, let the pending operator take rhs as its lhs. *) | |
718 | let next_prec = precedence c2 in | |
719 | if token_prec < next_prec | |
720 | then parse_bin_rhs (token_prec + 1) rhs stream | |
721 | else rhs | |
722 | | _ -> rhs | |
723 | in | |
724 | ||
725 | (* Merge lhs/rhs. *) | |
726 | let lhs = Ast.Binary (c, lhs, rhs) in | |
727 | parse_bin_rhs expr_prec lhs stream | |
728 | end | |
729 | | _ -> lhs | |
730 | ||
731 | (* expression | |
732 | * ::= primary binoprhs *) | |
733 | and parse_expr = parser | |
734 | | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream | |
735 | ||
736 | (* prototype | |
737 | * ::= id '(' id* ')' *) | |
738 | let parse_prototype = | |
739 | let rec parse_args accumulator = parser | |
740 | | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e | |
741 | | [< >] -> accumulator | |
742 | in | |
743 | ||
744 | parser | |
745 | | [< 'Token.Ident id; | |
746 | 'Token.Kwd '(' ?? "expected '(' in prototype"; | |
747 | args=parse_args []; | |
748 | 'Token.Kwd ')' ?? "expected ')' in prototype" >] -> | |
749 | (* success. *) | |
750 | Ast.Prototype (id, Array.of_list (List.rev args)) | |
751 | ||
752 | | [< >] -> | |
753 | raise (Stream.Error "expected function name in prototype") | |
754 | ||
755 | (* definition ::= 'def' prototype expression *) | |
756 | let parse_definition = parser | |
757 | | [< 'Token.Def; p=parse_prototype; e=parse_expr >] -> | |
758 | Ast.Function (p, e) | |
759 | ||
760 | (* toplevelexpr ::= expression *) | |
761 | let parse_toplevel = parser | |
762 | | [< e=parse_expr >] -> | |
763 | (* Make an anonymous proto. *) | |
764 | Ast.Function (Ast.Prototype ("", [||]), e) | |
765 | ||
766 | (* external ::= 'extern' prototype *) | |
767 | let parse_extern = parser | |
768 | | [< 'Token.Extern; e=parse_prototype >] -> e | |
769 | </pre> | |
770 | </dd> | |
771 | ||
772 | <dt>codegen.ml:</dt> | |
773 | <dd class="doc_code"> | |
774 | <pre> | |
775 | (*===----------------------------------------------------------------------=== | |
776 | * Code Generation | |
777 | *===----------------------------------------------------------------------===*) | |
778 | ||
779 | open Llvm | |
780 | ||
781 | exception Error of string | |
782 | ||
783 | let context = global_context () | |
784 | let the_module = create_module context "my cool jit" | |
785 | let builder = builder context | |
786 | let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10 | |
787 | let double_type = double_type context | |
788 | ||
789 | let rec codegen_expr = function | |
790 | | Ast.Number n -> const_float double_type n | |
791 | | Ast.Variable name -> | |
792 | (try Hashtbl.find named_values name with | |
793 | | Not_found -> raise (Error "unknown variable name")) | |
794 | | Ast.Binary (op, lhs, rhs) -> | |
795 | let lhs_val = codegen_expr lhs in | |
796 | let rhs_val = codegen_expr rhs in | |
797 | begin | |
798 | match op with | |
799 | | '+' -> build_add lhs_val rhs_val "addtmp" builder | |
800 | | '-' -> build_sub lhs_val rhs_val "subtmp" builder | |
801 | | '*' -> build_mul lhs_val rhs_val "multmp" builder | |
802 | | '<' -> | |
803 | (* Convert bool 0/1 to double 0.0 or 1.0 *) | |
804 | let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in | |
805 | build_uitofp i double_type "booltmp" builder | |
806 | | _ -> raise (Error "invalid binary operator") | |
807 | end | |
808 | | Ast.Call (callee, args) -> | |
809 | (* Look up the name in the module table. *) | |
810 | let callee = | |
811 | match lookup_function callee the_module with | |
812 | | Some callee -> callee | |
813 | | None -> raise (Error "unknown function referenced") | |
814 | in | |
815 | let params = params callee in | |
816 | ||
817 | (* If argument mismatch error. *) | |
818 | if Array.length params == Array.length args then () else | |
819 | raise (Error "incorrect # arguments passed"); | |
820 | let args = Array.map codegen_expr args in | |
821 | build_call callee args "calltmp" builder | |
822 | ||
823 | let codegen_proto = function | |
824 | | Ast.Prototype (name, args) -> | |
825 | (* Make the function type: double(double,double) etc. *) | |
826 | let doubles = Array.make (Array.length args) double_type in | |
827 | let ft = function_type double_type doubles in | |
828 | let f = | |
829 | match lookup_function name the_module with | |
830 | | None -> declare_function name ft the_module | |
831 | ||
832 | (* If 'f' conflicted, there was already something named 'name'. If it | |
833 | * has a body, don't allow redefinition or reextern. *) | |
834 | | Some f -> | |
835 | (* If 'f' already has a body, reject this. *) | |
836 | if block_begin f <> At_end f then | |
837 | raise (Error "redefinition of function"); | |
838 | ||
839 | (* If 'f' took a different number of arguments, reject. *) | |
840 | if element_type (type_of f) <> ft then | |
841 | raise (Error "redefinition of function with different # args"); | |
842 | f | |
843 | in | |
844 | ||
845 | (* Set names for all arguments. *) | |
846 | Array.iteri (fun i a -> | |
847 | let n = args.(i) in | |
848 | set_value_name n a; | |
849 | Hashtbl.add named_values n a; | |
850 | ) (params f); | |
851 | f | |
852 | ||
853 | let codegen_func the_fpm = function | |
854 | | Ast.Function (proto, body) -> | |
855 | Hashtbl.clear named_values; | |
856 | let the_function = codegen_proto proto in | |
857 | ||
858 | (* Create a new basic block to start insertion into. *) | |
859 | let bb = append_block context "entry" the_function in | |
860 | position_at_end bb builder; | |
861 | ||
862 | try | |
863 | let ret_val = codegen_expr body in | |
864 | ||
865 | (* Finish off the function. *) | |
866 | let _ = build_ret ret_val builder in | |
867 | ||
868 | (* Validate the generated code, checking for consistency. *) | |
869 | Llvm_analysis.assert_valid_function the_function; | |
870 | ||
871 | (* Optimize the function. *) | |
872 | let _ = PassManager.run_function the_function the_fpm in | |
873 | ||
874 | the_function | |
875 | with e -> | |
876 | delete_function the_function; | |
877 | raise e | |
878 | </pre> | |
879 | </dd> | |
880 | ||
881 | <dt>toplevel.ml:</dt> | |
882 | <dd class="doc_code"> | |
883 | <pre> | |
884 | (*===----------------------------------------------------------------------=== | |
885 | * Top-Level parsing and JIT Driver | |
886 | *===----------------------------------------------------------------------===*) | |
887 | ||
888 | open Llvm | |
889 | open Llvm_executionengine | |
890 | ||
891 | (* top ::= definition | external | expression | ';' *) | |
892 | let rec main_loop the_fpm the_execution_engine stream = | |
893 | match Stream.peek stream with | |
894 | | None -> () | |
895 | ||
896 | (* ignore top-level semicolons. *) | |
897 | | Some (Token.Kwd ';') -> | |
898 | Stream.junk stream; | |
899 | main_loop the_fpm the_execution_engine stream | |
900 | ||
901 | | Some token -> | |
902 | begin | |
903 | try match token with | |
904 | | Token.Def -> | |
905 | let e = Parser.parse_definition stream in | |
906 | print_endline "parsed a function definition."; | |
907 | dump_value (Codegen.codegen_func the_fpm e); | |
908 | | Token.Extern -> | |
909 | let e = Parser.parse_extern stream in | |
910 | print_endline "parsed an extern."; | |
911 | dump_value (Codegen.codegen_proto e); | |
912 | | _ -> | |
913 | (* Evaluate a top-level expression into an anonymous function. *) | |
914 | let e = Parser.parse_toplevel stream in | |
915 | print_endline "parsed a top-level expr"; | |
916 | let the_function = Codegen.codegen_func the_fpm e in | |
917 | dump_value the_function; | |
918 | ||
919 | (* JIT the function, returning a function pointer. *) | |
920 | let result = ExecutionEngine.run_function the_function [||] | |
921 | the_execution_engine in | |
922 | ||
923 | print_string "Evaluated to "; | |
924 | print_float (GenericValue.as_float Codegen.double_type result); | |
925 | print_newline (); | |
926 | with Stream.Error s | Codegen.Error s -> | |
927 | (* Skip token for error recovery. *) | |
928 | Stream.junk stream; | |
929 | print_endline s; | |
930 | end; | |
931 | print_string "ready> "; flush stdout; | |
932 | main_loop the_fpm the_execution_engine stream | |
933 | </pre> | |
934 | </dd> | |
935 | ||
936 | <dt>toy.ml:</dt> | |
937 | <dd class="doc_code"> | |
938 | <pre> | |
939 | (*===----------------------------------------------------------------------=== | |
940 | * Main driver code. | |
941 | *===----------------------------------------------------------------------===*) | |
942 | ||
943 | open Llvm | |
944 | open Llvm_executionengine | |
945 | open Llvm_target | |
946 | open Llvm_scalar_opts | |
947 | ||
948 | let main () = | |
949 | ignore (initialize_native_target ()); | |
950 | ||
951 | (* Install standard binary operators. | |
952 | * 1 is the lowest precedence. *) | |
953 | Hashtbl.add Parser.binop_precedence '<' 10; | |
954 | Hashtbl.add Parser.binop_precedence '+' 20; | |
955 | Hashtbl.add Parser.binop_precedence '-' 20; | |
956 | Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *) | |
957 | ||
958 | (* Prime the first token. *) | |
959 | print_string "ready> "; flush stdout; | |
960 | let stream = Lexer.lex (Stream.of_channel stdin) in | |
961 | ||
962 | (* Create the JIT. *) | |
963 | let the_execution_engine = ExecutionEngine.create Codegen.the_module in | |
964 | let the_fpm = PassManager.create_function Codegen.the_module in | |
965 | ||
966 | (* Set up the optimizer pipeline. Start with registering info about how the | |
967 | * target lays out data structures. *) | |
968 | TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm; | |
969 | ||
970 | (* Do simple "peephole" optimizations and bit-twiddling optzn. *) | |
971 | add_instruction_combination the_fpm; | |
972 | ||
973 | (* reassociate expressions. *) | |
974 | add_reassociation the_fpm; | |
975 | ||
976 | (* Eliminate Common SubExpressions. *) | |
977 | add_gvn the_fpm; | |
978 | ||
979 | (* Simplify the control flow graph (deleting unreachable blocks, etc). *) | |
980 | add_cfg_simplification the_fpm; | |
981 | ||
982 | ignore (PassManager.initialize the_fpm); | |
983 | ||
984 | (* Run the main "interpreter loop" now. *) | |
985 | Toplevel.main_loop the_fpm the_execution_engine stream; | |
986 | ||
987 | (* Print out all the generated code. *) | |
988 | dump_module Codegen.the_module | |
989 | ;; | |
990 | ||
991 | main () | |
992 | </pre> | |
993 | </dd> | |
994 | ||
995 | <dt>bindings.c</dt> | |
996 | <dd class="doc_code"> | |
997 | <pre> | |
998 | #include <stdio.h> | |
999 | ||
1000 | /* putchard - putchar that takes a double and returns 0. */ | |
1001 | extern double putchard(double X) { | |
1002 | putchar((char)X); | |
1003 | return 0; | |
1004 | } | |
1005 | </pre> | |
1006 | </dd> | |
1007 | </dl> | |
1008 | ||
1009 | <a href="OCamlLangImpl5.html">Next: Extending the language: control flow</a> | |
1010 | </div> | |
1011 | ||
1012 | <!-- *********************************************************************** --> | |
1013 | <hr> | |
1014 | <address> | |
1015 | <a href="http://jigsaw.w3.org/css-validator/check/referer"><img | |
1016 | src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a> | |
1017 | <a href="http://validator.w3.org/check/referer"><img | |
1018 | src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a> | |
1019 | ||
1020 | <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> | |
1021 | <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br> | |
1022 | <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> | |
1023 | Last modified: $Date$ | |
1024 | </address> | |
1025 | </body> | |
1026 | </html> |