]> git.proxmox.com Git - rustc.git/blob - src/llvm/docs/tutorial/LangImpl6.html
Imported Upstream version 0.6
[rustc.git] / src / llvm / docs / tutorial / LangImpl6.html
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4 <html>
5 <head>
6 <title>Kaleidoscope: Extending the Language: User-defined Operators</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../_static/llvm.css" type="text/css">
10 </head>
11
12 <body>
13
14 <h1>Kaleidoscope: Extending the Language: User-defined Operators</h1>
15
16 <ul>
17 <li><a href="index.html">Up to Tutorial Index</a></li>
18 <li>Chapter 6
19 <ol>
20 <li><a href="#intro">Chapter 6 Introduction</a></li>
21 <li><a href="#idea">User-defined Operators: the Idea</a></li>
22 <li><a href="#binary">User-defined Binary Operators</a></li>
23 <li><a href="#unary">User-defined Unary Operators</a></li>
24 <li><a href="#example">Kicking the Tires</a></li>
25 <li><a href="#code">Full Code Listing</a></li>
26 </ol>
27 </li>
28 <li><a href="LangImpl7.html">Chapter 7</a>: Extending the Language: Mutable
29 Variables / SSA Construction</li>
30 </ul>
31
32 <div class="doc_author">
33 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
34 </div>
35
36 <!-- *********************************************************************** -->
37 <h2><a name="intro">Chapter 6 Introduction</a></h2>
38 <!-- *********************************************************************** -->
39
40 <div>
41
42 <p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language
43 with LLVM</a>" tutorial. At this point in our tutorial, we now have a fully
44 functional language that is fairly minimal, but also useful. There
45 is still one big problem with it, however. Our language doesn't have many
46 useful operators (like division, logical negation, or even any comparisons
47 besides less-than).</p>
48
49 <p>This chapter of the tutorial takes a wild digression into adding user-defined
50 operators to the simple and beautiful Kaleidoscope language. This digression now gives
51 us a simple and ugly language in some ways, but also a powerful one at the same time.
52 One of the great things about creating your own language is that you get to
53 decide what is good or bad. In this tutorial we'll assume that it is okay to
54 use this as a way to show some interesting parsing techniques.</p>
55
56 <p>At the end of this tutorial, we'll run through an example Kaleidoscope
57 application that <a href="#example">renders the Mandelbrot set</a>. This gives
58 an example of what you can build with Kaleidoscope and its feature set.</p>
59
60 </div>
61
62 <!-- *********************************************************************** -->
63 <h2><a name="idea">User-defined Operators: the Idea</a></h2>
64 <!-- *********************************************************************** -->
65
66 <div>
67
68 <p>
69 The "operator overloading" that we will add to Kaleidoscope is more general than
70 languages like C++. In C++, you are only allowed to redefine existing
71 operators: you can't programatically change the grammar, introduce new
72 operators, change precedence levels, etc. In this chapter, we will add this
73 capability to Kaleidoscope, which will let the user round out the set of
74 operators that are supported.</p>
75
76 <p>The point of going into user-defined operators in a tutorial like this is to
77 show the power and flexibility of using a hand-written parser. Thus far, the parser
78 we have been implementing uses recursive descent for most parts of the grammar and
79 operator precedence parsing for the expressions. See <a
80 href="LangImpl2.html">Chapter 2</a> for details. Without using operator
81 precedence parsing, it would be very difficult to allow the programmer to
82 introduce new operators into the grammar: the grammar is dynamically extensible
83 as the JIT runs.</p>
84
85 <p>The two specific features we'll add are programmable unary operators (right
86 now, Kaleidoscope has no unary operators at all) as well as binary operators.
87 An example of this is:</p>
88
89 <div class="doc_code">
90 <pre>
91 # Logical unary not.
92 def unary!(v)
93 if v then
94 0
95 else
96 1;
97
98 # Define &gt; with the same precedence as &lt;.
99 def binary&gt; 10 (LHS RHS)
100 RHS &lt; LHS;
101
102 # Binary "logical or", (note that it does not "short circuit")
103 def binary| 5 (LHS RHS)
104 if LHS then
105 1
106 else if RHS then
107 1
108 else
109 0;
110
111 # Define = with slightly lower precedence than relationals.
112 def binary= 9 (LHS RHS)
113 !(LHS &lt; RHS | LHS &gt; RHS);
114 </pre>
115 </div>
116
117 <p>Many languages aspire to being able to implement their standard runtime
118 library in the language itself. In Kaleidoscope, we can implement significant
119 parts of the language in the library!</p>
120
121 <p>We will break down implementation of these features into two parts:
122 implementing support for user-defined binary operators and adding unary
123 operators.</p>
124
125 </div>
126
127 <!-- *********************************************************************** -->
128 <h2><a name="binary">User-defined Binary Operators</a></h2>
129 <!-- *********************************************************************** -->
130
131 <div>
132
133 <p>Adding support for user-defined binary operators is pretty simple with our
134 current framework. We'll first add support for the unary/binary keywords:</p>
135
136 <div class="doc_code">
137 <pre>
138 enum Token {
139 ...
140 <b>// operators
141 tok_binary = -11, tok_unary = -12</b>
142 };
143 ...
144 static int gettok() {
145 ...
146 if (IdentifierStr == "for") return tok_for;
147 if (IdentifierStr == "in") return tok_in;
148 <b>if (IdentifierStr == "binary") return tok_binary;
149 if (IdentifierStr == "unary") return tok_unary;</b>
150 return tok_identifier;
151 </pre>
152 </div>
153
154 <p>This just adds lexer support for the unary and binary keywords, like we
155 did in <a href="LangImpl5.html#iflexer">previous chapters</a>. One nice thing
156 about our current AST, is that we represent binary operators with full generalisation
157 by using their ASCII code as the opcode. For our extended operators, we'll use this
158 same representation, so we don't need any new AST or parser support.</p>
159
160 <p>On the other hand, we have to be able to represent the definitions of these
161 new operators, in the "def binary| 5" part of the function definition. In our
162 grammar so far, the "name" for the function definition is parsed as the
163 "prototype" production and into the <tt>PrototypeAST</tt> AST node. To
164 represent our new user-defined operators as prototypes, we have to extend
165 the <tt>PrototypeAST</tt> AST node like this:</p>
166
167 <div class="doc_code">
168 <pre>
169 /// PrototypeAST - This class represents the "prototype" for a function,
170 /// which captures its argument names as well as if it is an operator.
171 class PrototypeAST {
172 std::string Name;
173 std::vector&lt;std::string&gt; Args;
174 <b>bool isOperator;
175 unsigned Precedence; // Precedence if a binary op.</b>
176 public:
177 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
178 <b>bool isoperator = false, unsigned prec = 0</b>)
179 : Name(name), Args(args), <b>isOperator(isoperator), Precedence(prec)</b> {}
180
181 <b>bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
182 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
183
184 char getOperatorName() const {
185 assert(isUnaryOp() || isBinaryOp());
186 return Name[Name.size()-1];
187 }
188
189 unsigned getBinaryPrecedence() const { return Precedence; }</b>
190
191 Function *Codegen();
192 };
193 </pre>
194 </div>
195
196 <p>Basically, in addition to knowing a name for the prototype, we now keep track
197 of whether it was an operator, and if it was, what precedence level the operator
198 is at. The precedence is only used for binary operators (as you'll see below,
199 it just doesn't apply for unary operators). Now that we have a way to represent
200 the prototype for a user-defined operator, we need to parse it:</p>
201
202 <div class="doc_code">
203 <pre>
204 /// prototype
205 /// ::= id '(' id* ')'
206 <b>/// ::= binary LETTER number? (id, id)</b>
207 static PrototypeAST *ParsePrototype() {
208 std::string FnName;
209
210 <b>unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
211 unsigned BinaryPrecedence = 30;</b>
212
213 switch (CurTok) {
214 default:
215 return ErrorP("Expected function name in prototype");
216 case tok_identifier:
217 FnName = IdentifierStr;
218 Kind = 0;
219 getNextToken();
220 break;
221 <b>case tok_binary:
222 getNextToken();
223 if (!isascii(CurTok))
224 return ErrorP("Expected binary operator");
225 FnName = "binary";
226 FnName += (char)CurTok;
227 Kind = 2;
228 getNextToken();
229
230 // Read the precedence if present.
231 if (CurTok == tok_number) {
232 if (NumVal &lt; 1 || NumVal &gt; 100)
233 return ErrorP("Invalid precedecnce: must be 1..100");
234 BinaryPrecedence = (unsigned)NumVal;
235 getNextToken();
236 }
237 break;</b>
238 }
239
240 if (CurTok != '(')
241 return ErrorP("Expected '(' in prototype");
242
243 std::vector&lt;std::string&gt; ArgNames;
244 while (getNextToken() == tok_identifier)
245 ArgNames.push_back(IdentifierStr);
246 if (CurTok != ')')
247 return ErrorP("Expected ')' in prototype");
248
249 // success.
250 getNextToken(); // eat ')'.
251
252 <b>// Verify right number of names for operator.
253 if (Kind &amp;&amp; ArgNames.size() != Kind)
254 return ErrorP("Invalid number of operands for operator");
255
256 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);</b>
257 }
258 </pre>
259 </div>
260
261 <p>This is all fairly straightforward parsing code, and we have already seen
262 a lot of similar code in the past. One interesting part about the code above is
263 the couple lines that set up <tt>FnName</tt> for binary operators. This builds names
264 like "binary@" for a newly defined "@" operator. This then takes advantage of the
265 fact that symbol names in the LLVM symbol table are allowed to have any character in
266 them, including embedded nul characters.</p>
267
268 <p>The next interesting thing to add, is codegen support for these binary operators.
269 Given our current structure, this is a simple addition of a default case for our
270 existing binary operator node:</p>
271
272 <div class="doc_code">
273 <pre>
274 Value *BinaryExprAST::Codegen() {
275 Value *L = LHS-&gt;Codegen();
276 Value *R = RHS-&gt;Codegen();
277 if (L == 0 || R == 0) return 0;
278
279 switch (Op) {
280 case '+': return Builder.CreateFAdd(L, R, "addtmp");
281 case '-': return Builder.CreateFSub(L, R, "subtmp");
282 case '*': return Builder.CreateFMul(L, R, "multmp");
283 case '&lt;':
284 L = Builder.CreateFCmpULT(L, R, "cmptmp");
285 // Convert bool 0/1 to double 0.0 or 1.0
286 return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
287 "booltmp");
288 <b>default: break;</b>
289 }
290
291 <b>// If it wasn't a builtin binary operator, it must be a user defined one. Emit
292 // a call to it.
293 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
294 assert(F &amp;&amp; "binary operator not found!");
295
296 Value *Ops[2] = { L, R };
297 return Builder.CreateCall(F, Ops, "binop");</b>
298 }
299
300 </pre>
301 </div>
302
303 <p>As you can see above, the new code is actually really simple. It just does
304 a lookup for the appropriate operator in the symbol table and generates a
305 function call to it. Since user-defined operators are just built as normal
306 functions (because the "prototype" boils down to a function with the right
307 name) everything falls into place.</p>
308
309 <p>The final piece of code we are missing, is a bit of top-level magic:</p>
310
311 <div class="doc_code">
312 <pre>
313 Function *FunctionAST::Codegen() {
314 NamedValues.clear();
315
316 Function *TheFunction = Proto->Codegen();
317 if (TheFunction == 0)
318 return 0;
319
320 <b>// If this is an operator, install it.
321 if (Proto-&gt;isBinaryOp())
322 BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();</b>
323
324 // Create a new basic block to start insertion into.
325 BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
326 Builder.SetInsertPoint(BB);
327
328 if (Value *RetVal = Body-&gt;Codegen()) {
329 ...
330 </pre>
331 </div>
332
333 <p>Basically, before codegening a function, if it is a user-defined operator, we
334 register it in the precedence table. This allows the binary operator parsing
335 logic we already have in place to handle it. Since we are working on a fully-general operator precedence parser, this is all we need to do to "extend the grammar".</p>
336
337 <p>Now we have useful user-defined binary operators. This builds a lot
338 on the previous framework we built for other operators. Adding unary operators
339 is a bit more challenging, because we don't have any framework for it yet - lets
340 see what it takes.</p>
341
342 </div>
343
344 <!-- *********************************************************************** -->
345 <h2><a name="unary">User-defined Unary Operators</a></h2>
346 <!-- *********************************************************************** -->
347
348 <div>
349
350 <p>Since we don't currently support unary operators in the Kaleidoscope
351 language, we'll need to add everything to support them. Above, we added simple
352 support for the 'unary' keyword to the lexer. In addition to that, we need an
353 AST node:</p>
354
355 <div class="doc_code">
356 <pre>
357 /// UnaryExprAST - Expression class for a unary operator.
358 class UnaryExprAST : public ExprAST {
359 char Opcode;
360 ExprAST *Operand;
361 public:
362 UnaryExprAST(char opcode, ExprAST *operand)
363 : Opcode(opcode), Operand(operand) {}
364 virtual Value *Codegen();
365 };
366 </pre>
367 </div>
368
369 <p>This AST node is very simple and obvious by now. It directly mirrors the
370 binary operator AST node, except that it only has one child. With this, we
371 need to add the parsing logic. Parsing a unary operator is pretty simple: we'll
372 add a new function to do it:</p>
373
374 <div class="doc_code">
375 <pre>
376 /// unary
377 /// ::= primary
378 /// ::= '!' unary
379 static ExprAST *ParseUnary() {
380 // If the current token is not an operator, it must be a primary expr.
381 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
382 return ParsePrimary();
383
384 // If this is a unary operator, read it.
385 int Opc = CurTok;
386 getNextToken();
387 if (ExprAST *Operand = ParseUnary())
388 return new UnaryExprAST(Opc, Operand);
389 return 0;
390 }
391 </pre>
392 </div>
393
394 <p>The grammar we add is pretty straightforward here. If we see a unary
395 operator when parsing a primary operator, we eat the operator as a prefix and
396 parse the remaining piece as another unary operator. This allows us to handle
397 multiple unary operators (e.g. "!!x"). Note that unary operators can't have
398 ambiguous parses like binary operators can, so there is no need for precedence
399 information.</p>
400
401 <p>The problem with this function, is that we need to call ParseUnary from somewhere.
402 To do this, we change previous callers of ParsePrimary to call ParseUnary
403 instead:</p>
404
405 <div class="doc_code">
406 <pre>
407 /// binoprhs
408 /// ::= ('+' unary)*
409 static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
410 ...
411 <b>// Parse the unary expression after the binary operator.
412 ExprAST *RHS = ParseUnary();
413 if (!RHS) return 0;</b>
414 ...
415 }
416 /// expression
417 /// ::= unary binoprhs
418 ///
419 static ExprAST *ParseExpression() {
420 <b>ExprAST *LHS = ParseUnary();</b>
421 if (!LHS) return 0;
422
423 return ParseBinOpRHS(0, LHS);
424 }
425 </pre>
426 </div>
427
428 <p>With these two simple changes, we are now able to parse unary operators and build the
429 AST for them. Next up, we need to add parser support for prototypes, to parse
430 the unary operator prototype. We extend the binary operator code above
431 with:</p>
432
433 <div class="doc_code">
434 <pre>
435 /// prototype
436 /// ::= id '(' id* ')'
437 /// ::= binary LETTER number? (id, id)
438 <b>/// ::= unary LETTER (id)</b>
439 static PrototypeAST *ParsePrototype() {
440 std::string FnName;
441
442 unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
443 unsigned BinaryPrecedence = 30;
444
445 switch (CurTok) {
446 default:
447 return ErrorP("Expected function name in prototype");
448 case tok_identifier:
449 FnName = IdentifierStr;
450 Kind = 0;
451 getNextToken();
452 break;
453 <b>case tok_unary:
454 getNextToken();
455 if (!isascii(CurTok))
456 return ErrorP("Expected unary operator");
457 FnName = "unary";
458 FnName += (char)CurTok;
459 Kind = 1;
460 getNextToken();
461 break;</b>
462 case tok_binary:
463 ...
464 </pre>
465 </div>
466
467 <p>As with binary operators, we name unary operators with a name that includes
468 the operator character. This assists us at code generation time. Speaking of,
469 the final piece we need to add is codegen support for unary operators. It looks
470 like this:</p>
471
472 <div class="doc_code">
473 <pre>
474 Value *UnaryExprAST::Codegen() {
475 Value *OperandV = Operand->Codegen();
476 if (OperandV == 0) return 0;
477
478 Function *F = TheModule->getFunction(std::string("unary")+Opcode);
479 if (F == 0)
480 return ErrorV("Unknown unary operator");
481
482 return Builder.CreateCall(F, OperandV, "unop");
483 }
484 </pre>
485 </div>
486
487 <p>This code is similar to, but simpler than, the code for binary operators. It
488 is simpler primarily because it doesn't need to handle any predefined operators.
489 </p>
490
491 </div>
492
493 <!-- *********************************************************************** -->
494 <h2><a name="example">Kicking the Tires</a></h2>
495 <!-- *********************************************************************** -->
496
497 <div>
498
499 <p>It is somewhat hard to believe, but with a few simple extensions we've
500 covered in the last chapters, we have grown a real-ish language. With this, we
501 can do a lot of interesting things, including I/O, math, and a bunch of other
502 things. For example, we can now add a nice sequencing operator (printd is
503 defined to print out the specified value and a newline):</p>
504
505 <div class="doc_code">
506 <pre>
507 ready&gt; <b>extern printd(x);</b>
508 Read extern:
509 declare double @printd(double)
510
511 ready&gt; <b>def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.</b>
512 ..
513 ready&gt; <b>printd(123) : printd(456) : printd(789);</b>
514 123.000000
515 456.000000
516 789.000000
517 Evaluated to 0.000000
518 </pre>
519 </div>
520
521 <p>We can also define a bunch of other "primitive" operations, such as:</p>
522
523 <div class="doc_code">
524 <pre>
525 # Logical unary not.
526 def unary!(v)
527 if v then
528 0
529 else
530 1;
531
532 # Unary negate.
533 def unary-(v)
534 0-v;
535
536 # Define &gt; with the same precedence as &lt;.
537 def binary&gt; 10 (LHS RHS)
538 RHS &lt; LHS;
539
540 # Binary logical or, which does not short circuit.
541 def binary| 5 (LHS RHS)
542 if LHS then
543 1
544 else if RHS then
545 1
546 else
547 0;
548
549 # Binary logical and, which does not short circuit.
550 def binary&amp; 6 (LHS RHS)
551 if !LHS then
552 0
553 else
554 !!RHS;
555
556 # Define = with slightly lower precedence than relationals.
557 def binary = 9 (LHS RHS)
558 !(LHS &lt; RHS | LHS &gt; RHS);
559
560 # Define ':' for sequencing: as a low-precedence operator that ignores operands
561 # and just returns the RHS.
562 def binary : 1 (x y) y;
563 </pre>
564 </div>
565
566
567 <p>Given the previous if/then/else support, we can also define interesting
568 functions for I/O. For example, the following prints out a character whose
569 "density" reflects the value passed in: the lower the value, the denser the
570 character:</p>
571
572 <div class="doc_code">
573 <pre>
574 ready&gt;
575 <b>
576 extern putchard(char)
577 def printdensity(d)
578 if d &gt; 8 then
579 putchard(32) # ' '
580 else if d &gt; 4 then
581 putchard(46) # '.'
582 else if d &gt; 2 then
583 putchard(43) # '+'
584 else
585 putchard(42); # '*'</b>
586 ...
587 ready&gt; <b>printdensity(1): printdensity(2): printdensity(3):
588 printdensity(4): printdensity(5): printdensity(9):
589 putchard(10);</b>
590 **++.
591 Evaluated to 0.000000
592 </pre>
593 </div>
594
595 <p>Based on these simple primitive operations, we can start to define more
596 interesting things. For example, here's a little function that solves for the
597 number of iterations it takes a function in the complex plane to
598 converge:</p>
599
600 <div class="doc_code">
601 <pre>
602 # Determine whether the specific location diverges.
603 # Solve for z = z^2 + c in the complex plane.
604 def mandleconverger(real imag iters creal cimag)
605 if iters &gt; 255 | (real*real + imag*imag &gt; 4) then
606 iters
607 else
608 mandleconverger(real*real - imag*imag + creal,
609 2*real*imag + cimag,
610 iters+1, creal, cimag);
611
612 # Return the number of iterations required for the iteration to escape
613 def mandleconverge(real imag)
614 mandleconverger(real, imag, 0, real, imag);
615 </pre>
616 </div>
617
618 <p>This "<code>z = z<sup>2</sup> + c</code>" function is a beautiful little
619 creature that is the basis for computation of
620 the <a href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>.
621 Our <tt>mandelconverge</tt> function returns the number of iterations that it
622 takes for a complex orbit to escape, saturating to 255. This is not a very
623 useful function by itself, but if you plot its value over a two-dimensional
624 plane, you can see the Mandelbrot set. Given that we are limited to using
625 putchard here, our amazing graphical output is limited, but we can whip together
626 something using the density plotter above:</p>
627
628 <div class="doc_code">
629 <pre>
630 # Compute and plot the mandlebrot set with the specified 2 dimensional range
631 # info.
632 def mandelhelp(xmin xmax xstep ymin ymax ystep)
633 for y = ymin, y &lt; ymax, ystep in (
634 (for x = xmin, x &lt; xmax, xstep in
635 printdensity(mandleconverge(x,y)))
636 : putchard(10)
637 )
638
639 # mandel - This is a convenient helper function for plotting the mandelbrot set
640 # from the specified position with the specified Magnification.
641 def mandel(realstart imagstart realmag imagmag)
642 mandelhelp(realstart, realstart+realmag*78, realmag,
643 imagstart, imagstart+imagmag*40, imagmag);
644 </pre>
645 </div>
646
647 <p>Given this, we can try plotting out the mandlebrot set! Lets try it out:</p>
648
649 <div class="doc_code">
650 <pre>
651 ready&gt; <b>mandel(-2.3, -1.3, 0.05, 0.07);</b>
652 *******************************+++++++++++*************************************
653 *************************+++++++++++++++++++++++*******************************
654 **********************+++++++++++++++++++++++++++++****************************
655 *******************+++++++++++++++++++++.. ...++++++++*************************
656 *****************++++++++++++++++++++++.... ...+++++++++***********************
657 ***************+++++++++++++++++++++++..... ...+++++++++*********************
658 **************+++++++++++++++++++++++.... ....+++++++++********************
659 *************++++++++++++++++++++++...... .....++++++++*******************
660 ************+++++++++++++++++++++....... .......+++++++******************
661 ***********+++++++++++++++++++.... ... .+++++++*****************
662 **********+++++++++++++++++....... .+++++++****************
663 *********++++++++++++++........... ...+++++++***************
664 ********++++++++++++............ ...++++++++**************
665 ********++++++++++... .......... .++++++++**************
666 *******+++++++++..... .+++++++++*************
667 *******++++++++...... ..+++++++++*************
668 *******++++++....... ..+++++++++*************
669 *******+++++...... ..+++++++++*************
670 *******.... .... ...+++++++++*************
671 *******.... . ...+++++++++*************
672 *******+++++...... ...+++++++++*************
673 *******++++++....... ..+++++++++*************
674 *******++++++++...... .+++++++++*************
675 *******+++++++++..... ..+++++++++*************
676 ********++++++++++... .......... .++++++++**************
677 ********++++++++++++............ ...++++++++**************
678 *********++++++++++++++.......... ...+++++++***************
679 **********++++++++++++++++........ .+++++++****************
680 **********++++++++++++++++++++.... ... ..+++++++****************
681 ***********++++++++++++++++++++++....... .......++++++++*****************
682 ************+++++++++++++++++++++++...... ......++++++++******************
683 **************+++++++++++++++++++++++.... ....++++++++********************
684 ***************+++++++++++++++++++++++..... ...+++++++++*********************
685 *****************++++++++++++++++++++++.... ...++++++++***********************
686 *******************+++++++++++++++++++++......++++++++*************************
687 *********************++++++++++++++++++++++.++++++++***************************
688 *************************+++++++++++++++++++++++*******************************
689 ******************************+++++++++++++************************************
690 *******************************************************************************
691 *******************************************************************************
692 *******************************************************************************
693 Evaluated to 0.000000
694 ready&gt; <b>mandel(-2, -1, 0.02, 0.04);</b>
695 **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
696 ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
697 *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
698 *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
699 *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
700 ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
701 **************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
702 ************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
703 ***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
704 **********++++++++++++++++++++++++++++++++++++++++++++++.............
705 ********+++++++++++++++++++++++++++++++++++++++++++..................
706 *******+++++++++++++++++++++++++++++++++++++++.......................
707 ******+++++++++++++++++++++++++++++++++++...........................
708 *****++++++++++++++++++++++++++++++++............................
709 *****++++++++++++++++++++++++++++...............................
710 ****++++++++++++++++++++++++++...... .........................
711 ***++++++++++++++++++++++++......... ...... ...........
712 ***++++++++++++++++++++++............
713 **+++++++++++++++++++++..............
714 **+++++++++++++++++++................
715 *++++++++++++++++++.................
716 *++++++++++++++++............ ...
717 *++++++++++++++..............
718 *+++....++++................
719 *.......... ...........
720 *
721 *.......... ...........
722 *+++....++++................
723 *++++++++++++++..............
724 *++++++++++++++++............ ...
725 *++++++++++++++++++.................
726 **+++++++++++++++++++................
727 **+++++++++++++++++++++..............
728 ***++++++++++++++++++++++............
729 ***++++++++++++++++++++++++......... ...... ...........
730 ****++++++++++++++++++++++++++...... .........................
731 *****++++++++++++++++++++++++++++...............................
732 *****++++++++++++++++++++++++++++++++............................
733 ******+++++++++++++++++++++++++++++++++++...........................
734 *******+++++++++++++++++++++++++++++++++++++++.......................
735 ********+++++++++++++++++++++++++++++++++++++++++++..................
736 Evaluated to 0.000000
737 ready&gt; <b>mandel(-0.9, -1.4, 0.02, 0.03);</b>
738 *******************************************************************************
739 *******************************************************************************
740 *******************************************************************************
741 **********+++++++++++++++++++++************************************************
742 *+++++++++++++++++++++++++++++++++++++++***************************************
743 +++++++++++++++++++++++++++++++++++++++++++++**********************************
744 ++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
745 ++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
746 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
747 +++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
748 +++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
749 +++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
750 ++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
751 +++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
752 ++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
753 ++++++++++++++++++++++++............. .......++++++++++++++++++++++******
754 +++++++++++++++++++++++............. ........+++++++++++++++++++++++****
755 ++++++++++++++++++++++........... ..........++++++++++++++++++++++***
756 ++++++++++++++++++++........... .........++++++++++++++++++++++*
757 ++++++++++++++++++............ ...........++++++++++++++++++++
758 ++++++++++++++++............... .............++++++++++++++++++
759 ++++++++++++++................. ...............++++++++++++++++
760 ++++++++++++.................. .................++++++++++++++
761 +++++++++.................. .................+++++++++++++
762 ++++++........ . ......... ..++++++++++++
763 ++............ ...... ....++++++++++
764 .............. ...++++++++++
765 .............. ....+++++++++
766 .............. .....++++++++
767 ............. ......++++++++
768 ........... .......++++++++
769 ......... ........+++++++
770 ......... ........+++++++
771 ......... ....+++++++
772 ........ ...+++++++
773 ....... ...+++++++
774 ....+++++++
775 .....+++++++
776 ....+++++++
777 ....+++++++
778 ....+++++++
779 Evaluated to 0.000000
780 ready&gt; <b>^D</b>
781 </pre>
782 </div>
783
784 <p>At this point, you may be starting to realize that Kaleidoscope is a real
785 and powerful language. It may not be self-similar :), but it can be used to
786 plot things that are!</p>
787
788 <p>With this, we conclude the "adding user-defined operators" chapter of the
789 tutorial. We have successfully augmented our language, adding the ability to extend the
790 language in the library, and we have shown how this can be used to build a simple but
791 interesting end-user application in Kaleidoscope. At this point, Kaleidoscope
792 can build a variety of applications that are functional and can call functions
793 with side-effects, but it can't actually define and mutate a variable itself.
794 </p>
795
796 <p>Strikingly, variable mutation is an important feature of some
797 languages, and it is not at all obvious how to <a href="LangImpl7.html">add
798 support for mutable variables</a> without having to add an "SSA construction"
799 phase to your front-end. In the next chapter, we will describe how you can
800 add variable mutation without building SSA in your front-end.</p>
801
802 </div>
803
804 <!-- *********************************************************************** -->
805 <h2><a name="code">Full Code Listing</a></h2>
806 <!-- *********************************************************************** -->
807
808 <div>
809
810 <p>
811 Here is the complete code listing for our running example, enhanced with the
812 if/then/else and for expressions.. To build this example, use:
813 </p>
814
815 <div class="doc_code">
816 <pre>
817 # Compile
818 clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
819 # Run
820 ./toy
821 </pre>
822 </div>
823
824 <p>On some platforms, you will need to specify -rdynamic or -Wl,--export-dynamic
825 when linking. This ensures that symbols defined in the main executable are
826 exported to the dynamic linker and so are available for symbol resolution at
827 run time. This is not needed if you compile your support code into a shared
828 library, although doing that will cause problems on Windows.</p>
829
830 <p>Here is the code:</p>
831
832 <div class="doc_code">
833 <pre>
834 #include "llvm/DerivedTypes.h"
835 #include "llvm/ExecutionEngine/ExecutionEngine.h"
836 #include "llvm/ExecutionEngine/JIT.h"
837 #include "llvm/IRBuilder.h"
838 #include "llvm/LLVMContext.h"
839 #include "llvm/Module.h"
840 #include "llvm/PassManager.h"
841 #include "llvm/Analysis/Verifier.h"
842 #include "llvm/Analysis/Passes.h"
843 #include "llvm/Target/TargetData.h"
844 #include "llvm/Transforms/Scalar.h"
845 #include "llvm/Support/TargetSelect.h"
846 #include &lt;cstdio&gt;
847 #include &lt;string&gt;
848 #include &lt;map&gt;
849 #include &lt;vector&gt;
850 using namespace llvm;
851
852 //===----------------------------------------------------------------------===//
853 // Lexer
854 //===----------------------------------------------------------------------===//
855
856 // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
857 // of these for known things.
858 enum Token {
859 tok_eof = -1,
860
861 // commands
862 tok_def = -2, tok_extern = -3,
863
864 // primary
865 tok_identifier = -4, tok_number = -5,
866
867 // control
868 tok_if = -6, tok_then = -7, tok_else = -8,
869 tok_for = -9, tok_in = -10,
870
871 // operators
872 tok_binary = -11, tok_unary = -12
873 };
874
875 static std::string IdentifierStr; // Filled in if tok_identifier
876 static double NumVal; // Filled in if tok_number
877
878 /// gettok - Return the next token from standard input.
879 static int gettok() {
880 static int LastChar = ' ';
881
882 // Skip any whitespace.
883 while (isspace(LastChar))
884 LastChar = getchar();
885
886 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
887 IdentifierStr = LastChar;
888 while (isalnum((LastChar = getchar())))
889 IdentifierStr += LastChar;
890
891 if (IdentifierStr == "def") return tok_def;
892 if (IdentifierStr == "extern") return tok_extern;
893 if (IdentifierStr == "if") return tok_if;
894 if (IdentifierStr == "then") return tok_then;
895 if (IdentifierStr == "else") return tok_else;
896 if (IdentifierStr == "for") return tok_for;
897 if (IdentifierStr == "in") return tok_in;
898 if (IdentifierStr == "binary") return tok_binary;
899 if (IdentifierStr == "unary") return tok_unary;
900 return tok_identifier;
901 }
902
903 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
904 std::string NumStr;
905 do {
906 NumStr += LastChar;
907 LastChar = getchar();
908 } while (isdigit(LastChar) || LastChar == '.');
909
910 NumVal = strtod(NumStr.c_str(), 0);
911 return tok_number;
912 }
913
914 if (LastChar == '#') {
915 // Comment until end of line.
916 do LastChar = getchar();
917 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
918
919 if (LastChar != EOF)
920 return gettok();
921 }
922
923 // Check for end of file. Don't eat the EOF.
924 if (LastChar == EOF)
925 return tok_eof;
926
927 // Otherwise, just return the character as its ascii value.
928 int ThisChar = LastChar;
929 LastChar = getchar();
930 return ThisChar;
931 }
932
933 //===----------------------------------------------------------------------===//
934 // Abstract Syntax Tree (aka Parse Tree)
935 //===----------------------------------------------------------------------===//
936
937 /// ExprAST - Base class for all expression nodes.
938 class ExprAST {
939 public:
940 virtual ~ExprAST() {}
941 virtual Value *Codegen() = 0;
942 };
943
944 /// NumberExprAST - Expression class for numeric literals like "1.0".
945 class NumberExprAST : public ExprAST {
946 double Val;
947 public:
948 NumberExprAST(double val) : Val(val) {}
949 virtual Value *Codegen();
950 };
951
952 /// VariableExprAST - Expression class for referencing a variable, like "a".
953 class VariableExprAST : public ExprAST {
954 std::string Name;
955 public:
956 VariableExprAST(const std::string &amp;name) : Name(name) {}
957 virtual Value *Codegen();
958 };
959
960 /// UnaryExprAST - Expression class for a unary operator.
961 class UnaryExprAST : public ExprAST {
962 char Opcode;
963 ExprAST *Operand;
964 public:
965 UnaryExprAST(char opcode, ExprAST *operand)
966 : Opcode(opcode), Operand(operand) {}
967 virtual Value *Codegen();
968 };
969
970 /// BinaryExprAST - Expression class for a binary operator.
971 class BinaryExprAST : public ExprAST {
972 char Op;
973 ExprAST *LHS, *RHS;
974 public:
975 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
976 : Op(op), LHS(lhs), RHS(rhs) {}
977 virtual Value *Codegen();
978 };
979
980 /// CallExprAST - Expression class for function calls.
981 class CallExprAST : public ExprAST {
982 std::string Callee;
983 std::vector&lt;ExprAST*&gt; Args;
984 public:
985 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
986 : Callee(callee), Args(args) {}
987 virtual Value *Codegen();
988 };
989
990 /// IfExprAST - Expression class for if/then/else.
991 class IfExprAST : public ExprAST {
992 ExprAST *Cond, *Then, *Else;
993 public:
994 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
995 : Cond(cond), Then(then), Else(_else) {}
996 virtual Value *Codegen();
997 };
998
999 /// ForExprAST - Expression class for for/in.
1000 class ForExprAST : public ExprAST {
1001 std::string VarName;
1002 ExprAST *Start, *End, *Step, *Body;
1003 public:
1004 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1005 ExprAST *step, ExprAST *body)
1006 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1007 virtual Value *Codegen();
1008 };
1009
1010 /// PrototypeAST - This class represents the "prototype" for a function,
1011 /// which captures its name, and its argument names (thus implicitly the number
1012 /// of arguments the function takes), as well as if it is an operator.
1013 class PrototypeAST {
1014 std::string Name;
1015 std::vector&lt;std::string&gt; Args;
1016 bool isOperator;
1017 unsigned Precedence; // Precedence if a binary op.
1018 public:
1019 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1020 bool isoperator = false, unsigned prec = 0)
1021 : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1022
1023 bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1024 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1025
1026 char getOperatorName() const {
1027 assert(isUnaryOp() || isBinaryOp());
1028 return Name[Name.size()-1];
1029 }
1030
1031 unsigned getBinaryPrecedence() const { return Precedence; }
1032
1033 Function *Codegen();
1034 };
1035
1036 /// FunctionAST - This class represents a function definition itself.
1037 class FunctionAST {
1038 PrototypeAST *Proto;
1039 ExprAST *Body;
1040 public:
1041 FunctionAST(PrototypeAST *proto, ExprAST *body)
1042 : Proto(proto), Body(body) {}
1043
1044 Function *Codegen();
1045 };
1046
1047 //===----------------------------------------------------------------------===//
1048 // Parser
1049 //===----------------------------------------------------------------------===//
1050
1051 /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
1052 /// token the parser is looking at. getNextToken reads another token from the
1053 /// lexer and updates CurTok with its results.
1054 static int CurTok;
1055 static int getNextToken() {
1056 return CurTok = gettok();
1057 }
1058
1059 /// BinopPrecedence - This holds the precedence for each binary operator that is
1060 /// defined.
1061 static std::map&lt;char, int&gt; BinopPrecedence;
1062
1063 /// GetTokPrecedence - Get the precedence of the pending binary operator token.
1064 static int GetTokPrecedence() {
1065 if (!isascii(CurTok))
1066 return -1;
1067
1068 // Make sure it's a declared binop.
1069 int TokPrec = BinopPrecedence[CurTok];
1070 if (TokPrec &lt;= 0) return -1;
1071 return TokPrec;
1072 }
1073
1074 /// Error* - These are little helper functions for error handling.
1075 ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1076 PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1077 FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1078
1079 static ExprAST *ParseExpression();
1080
1081 /// identifierexpr
1082 /// ::= identifier
1083 /// ::= identifier '(' expression* ')'
1084 static ExprAST *ParseIdentifierExpr() {
1085 std::string IdName = IdentifierStr;
1086
1087 getNextToken(); // eat identifier.
1088
1089 if (CurTok != '(') // Simple variable ref.
1090 return new VariableExprAST(IdName);
1091
1092 // Call.
1093 getNextToken(); // eat (
1094 std::vector&lt;ExprAST*&gt; Args;
1095 if (CurTok != ')') {
1096 while (1) {
1097 ExprAST *Arg = ParseExpression();
1098 if (!Arg) return 0;
1099 Args.push_back(Arg);
1100
1101 if (CurTok == ')') break;
1102
1103 if (CurTok != ',')
1104 return Error("Expected ')' or ',' in argument list");
1105 getNextToken();
1106 }
1107 }
1108
1109 // Eat the ')'.
1110 getNextToken();
1111
1112 return new CallExprAST(IdName, Args);
1113 }
1114
1115 /// numberexpr ::= number
1116 static ExprAST *ParseNumberExpr() {
1117 ExprAST *Result = new NumberExprAST(NumVal);
1118 getNextToken(); // consume the number
1119 return Result;
1120 }
1121
1122 /// parenexpr ::= '(' expression ')'
1123 static ExprAST *ParseParenExpr() {
1124 getNextToken(); // eat (.
1125 ExprAST *V = ParseExpression();
1126 if (!V) return 0;
1127
1128 if (CurTok != ')')
1129 return Error("expected ')'");
1130 getNextToken(); // eat ).
1131 return V;
1132 }
1133
1134 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
1135 static ExprAST *ParseIfExpr() {
1136 getNextToken(); // eat the if.
1137
1138 // condition.
1139 ExprAST *Cond = ParseExpression();
1140 if (!Cond) return 0;
1141
1142 if (CurTok != tok_then)
1143 return Error("expected then");
1144 getNextToken(); // eat the then
1145
1146 ExprAST *Then = ParseExpression();
1147 if (Then == 0) return 0;
1148
1149 if (CurTok != tok_else)
1150 return Error("expected else");
1151
1152 getNextToken();
1153
1154 ExprAST *Else = ParseExpression();
1155 if (!Else) return 0;
1156
1157 return new IfExprAST(Cond, Then, Else);
1158 }
1159
1160 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1161 static ExprAST *ParseForExpr() {
1162 getNextToken(); // eat the for.
1163
1164 if (CurTok != tok_identifier)
1165 return Error("expected identifier after for");
1166
1167 std::string IdName = IdentifierStr;
1168 getNextToken(); // eat identifier.
1169
1170 if (CurTok != '=')
1171 return Error("expected '=' after for");
1172 getNextToken(); // eat '='.
1173
1174
1175 ExprAST *Start = ParseExpression();
1176 if (Start == 0) return 0;
1177 if (CurTok != ',')
1178 return Error("expected ',' after for start value");
1179 getNextToken();
1180
1181 ExprAST *End = ParseExpression();
1182 if (End == 0) return 0;
1183
1184 // The step value is optional.
1185 ExprAST *Step = 0;
1186 if (CurTok == ',') {
1187 getNextToken();
1188 Step = ParseExpression();
1189 if (Step == 0) return 0;
1190 }
1191
1192 if (CurTok != tok_in)
1193 return Error("expected 'in' after for");
1194 getNextToken(); // eat 'in'.
1195
1196 ExprAST *Body = ParseExpression();
1197 if (Body == 0) return 0;
1198
1199 return new ForExprAST(IdName, Start, End, Step, Body);
1200 }
1201
1202 /// primary
1203 /// ::= identifierexpr
1204 /// ::= numberexpr
1205 /// ::= parenexpr
1206 /// ::= ifexpr
1207 /// ::= forexpr
1208 static ExprAST *ParsePrimary() {
1209 switch (CurTok) {
1210 default: return Error("unknown token when expecting an expression");
1211 case tok_identifier: return ParseIdentifierExpr();
1212 case tok_number: return ParseNumberExpr();
1213 case '(': return ParseParenExpr();
1214 case tok_if: return ParseIfExpr();
1215 case tok_for: return ParseForExpr();
1216 }
1217 }
1218
1219 /// unary
1220 /// ::= primary
1221 /// ::= '!' unary
1222 static ExprAST *ParseUnary() {
1223 // If the current token is not an operator, it must be a primary expr.
1224 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1225 return ParsePrimary();
1226
1227 // If this is a unary operator, read it.
1228 int Opc = CurTok;
1229 getNextToken();
1230 if (ExprAST *Operand = ParseUnary())
1231 return new UnaryExprAST(Opc, Operand);
1232 return 0;
1233 }
1234
1235 /// binoprhs
1236 /// ::= ('+' unary)*
1237 static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1238 // If this is a binop, find its precedence.
1239 while (1) {
1240 int TokPrec = GetTokPrecedence();
1241
1242 // If this is a binop that binds at least as tightly as the current binop,
1243 // consume it, otherwise we are done.
1244 if (TokPrec &lt; ExprPrec)
1245 return LHS;
1246
1247 // Okay, we know this is a binop.
1248 int BinOp = CurTok;
1249 getNextToken(); // eat binop
1250
1251 // Parse the unary expression after the binary operator.
1252 ExprAST *RHS = ParseUnary();
1253 if (!RHS) return 0;
1254
1255 // If BinOp binds less tightly with RHS than the operator after RHS, let
1256 // the pending operator take RHS as its LHS.
1257 int NextPrec = GetTokPrecedence();
1258 if (TokPrec &lt; NextPrec) {
1259 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1260 if (RHS == 0) return 0;
1261 }
1262
1263 // Merge LHS/RHS.
1264 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1265 }
1266 }
1267
1268 /// expression
1269 /// ::= unary binoprhs
1270 ///
1271 static ExprAST *ParseExpression() {
1272 ExprAST *LHS = ParseUnary();
1273 if (!LHS) return 0;
1274
1275 return ParseBinOpRHS(0, LHS);
1276 }
1277
1278 /// prototype
1279 /// ::= id '(' id* ')'
1280 /// ::= binary LETTER number? (id, id)
1281 /// ::= unary LETTER (id)
1282 static PrototypeAST *ParsePrototype() {
1283 std::string FnName;
1284
1285 unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
1286 unsigned BinaryPrecedence = 30;
1287
1288 switch (CurTok) {
1289 default:
1290 return ErrorP("Expected function name in prototype");
1291 case tok_identifier:
1292 FnName = IdentifierStr;
1293 Kind = 0;
1294 getNextToken();
1295 break;
1296 case tok_unary:
1297 getNextToken();
1298 if (!isascii(CurTok))
1299 return ErrorP("Expected unary operator");
1300 FnName = "unary";
1301 FnName += (char)CurTok;
1302 Kind = 1;
1303 getNextToken();
1304 break;
1305 case tok_binary:
1306 getNextToken();
1307 if (!isascii(CurTok))
1308 return ErrorP("Expected binary operator");
1309 FnName = "binary";
1310 FnName += (char)CurTok;
1311 Kind = 2;
1312 getNextToken();
1313
1314 // Read the precedence if present.
1315 if (CurTok == tok_number) {
1316 if (NumVal &lt; 1 || NumVal &gt; 100)
1317 return ErrorP("Invalid precedecnce: must be 1..100");
1318 BinaryPrecedence = (unsigned)NumVal;
1319 getNextToken();
1320 }
1321 break;
1322 }
1323
1324 if (CurTok != '(')
1325 return ErrorP("Expected '(' in prototype");
1326
1327 std::vector&lt;std::string&gt; ArgNames;
1328 while (getNextToken() == tok_identifier)
1329 ArgNames.push_back(IdentifierStr);
1330 if (CurTok != ')')
1331 return ErrorP("Expected ')' in prototype");
1332
1333 // success.
1334 getNextToken(); // eat ')'.
1335
1336 // Verify right number of names for operator.
1337 if (Kind &amp;&amp; ArgNames.size() != Kind)
1338 return ErrorP("Invalid number of operands for operator");
1339
1340 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1341 }
1342
1343 /// definition ::= 'def' prototype expression
1344 static FunctionAST *ParseDefinition() {
1345 getNextToken(); // eat def.
1346 PrototypeAST *Proto = ParsePrototype();
1347 if (Proto == 0) return 0;
1348
1349 if (ExprAST *E = ParseExpression())
1350 return new FunctionAST(Proto, E);
1351 return 0;
1352 }
1353
1354 /// toplevelexpr ::= expression
1355 static FunctionAST *ParseTopLevelExpr() {
1356 if (ExprAST *E = ParseExpression()) {
1357 // Make an anonymous proto.
1358 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1359 return new FunctionAST(Proto, E);
1360 }
1361 return 0;
1362 }
1363
1364 /// external ::= 'extern' prototype
1365 static PrototypeAST *ParseExtern() {
1366 getNextToken(); // eat extern.
1367 return ParsePrototype();
1368 }
1369
1370 //===----------------------------------------------------------------------===//
1371 // Code Generation
1372 //===----------------------------------------------------------------------===//
1373
1374 static Module *TheModule;
1375 static IRBuilder&lt;&gt; Builder(getGlobalContext());
1376 static std::map&lt;std::string, Value*&gt; NamedValues;
1377 static FunctionPassManager *TheFPM;
1378
1379 Value *ErrorV(const char *Str) { Error(Str); return 0; }
1380
1381 Value *NumberExprAST::Codegen() {
1382 return ConstantFP::get(getGlobalContext(), APFloat(Val));
1383 }
1384
1385 Value *VariableExprAST::Codegen() {
1386 // Look this variable up in the function.
1387 Value *V = NamedValues[Name];
1388 return V ? V : ErrorV("Unknown variable name");
1389 }
1390
1391 Value *UnaryExprAST::Codegen() {
1392 Value *OperandV = Operand-&gt;Codegen();
1393 if (OperandV == 0) return 0;
1394
1395 Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1396 if (F == 0)
1397 return ErrorV("Unknown unary operator");
1398
1399 return Builder.CreateCall(F, OperandV, "unop");
1400 }
1401
1402 Value *BinaryExprAST::Codegen() {
1403 Value *L = LHS-&gt;Codegen();
1404 Value *R = RHS-&gt;Codegen();
1405 if (L == 0 || R == 0) return 0;
1406
1407 switch (Op) {
1408 case '+': return Builder.CreateFAdd(L, R, "addtmp");
1409 case '-': return Builder.CreateFSub(L, R, "subtmp");
1410 case '*': return Builder.CreateFMul(L, R, "multmp");
1411 case '&lt;':
1412 L = Builder.CreateFCmpULT(L, R, "cmptmp");
1413 // Convert bool 0/1 to double 0.0 or 1.0
1414 return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
1415 "booltmp");
1416 default: break;
1417 }
1418
1419 // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1420 // a call to it.
1421 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1422 assert(F &amp;&amp; "binary operator not found!");
1423
1424 Value *Ops[2] = { L, R };
1425 return Builder.CreateCall(F, Ops, "binop");
1426 }
1427
1428 Value *CallExprAST::Codegen() {
1429 // Look up the name in the global module table.
1430 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1431 if (CalleeF == 0)
1432 return ErrorV("Unknown function referenced");
1433
1434 // If argument mismatch error.
1435 if (CalleeF-&gt;arg_size() != Args.size())
1436 return ErrorV("Incorrect # arguments passed");
1437
1438 std::vector&lt;Value*&gt; ArgsV;
1439 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1440 ArgsV.push_back(Args[i]-&gt;Codegen());
1441 if (ArgsV.back() == 0) return 0;
1442 }
1443
1444 return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
1445 }
1446
1447 Value *IfExprAST::Codegen() {
1448 Value *CondV = Cond-&gt;Codegen();
1449 if (CondV == 0) return 0;
1450
1451 // Convert condition to a bool by comparing equal to 0.0.
1452 CondV = Builder.CreateFCmpONE(CondV,
1453 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1454 "ifcond");
1455
1456 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1457
1458 // Create blocks for the then and else cases. Insert the 'then' block at the
1459 // end of the function.
1460 BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
1461 BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
1462 BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
1463
1464 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1465
1466 // Emit then value.
1467 Builder.SetInsertPoint(ThenBB);
1468
1469 Value *ThenV = Then-&gt;Codegen();
1470 if (ThenV == 0) return 0;
1471
1472 Builder.CreateBr(MergeBB);
1473 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1474 ThenBB = Builder.GetInsertBlock();
1475
1476 // Emit else block.
1477 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1478 Builder.SetInsertPoint(ElseBB);
1479
1480 Value *ElseV = Else-&gt;Codegen();
1481 if (ElseV == 0) return 0;
1482
1483 Builder.CreateBr(MergeBB);
1484 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1485 ElseBB = Builder.GetInsertBlock();
1486
1487 // Emit merge block.
1488 TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1489 Builder.SetInsertPoint(MergeBB);
1490 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
1491 "iftmp");
1492
1493 PN-&gt;addIncoming(ThenV, ThenBB);
1494 PN-&gt;addIncoming(ElseV, ElseBB);
1495 return PN;
1496 }
1497
1498 Value *ForExprAST::Codegen() {
1499 // Output this as:
1500 // ...
1501 // start = startexpr
1502 // goto loop
1503 // loop:
1504 // variable = phi [start, loopheader], [nextvariable, loopend]
1505 // ...
1506 // bodyexpr
1507 // ...
1508 // loopend:
1509 // step = stepexpr
1510 // nextvariable = variable + step
1511 // endcond = endexpr
1512 // br endcond, loop, endloop
1513 // outloop:
1514
1515 // Emit the start code first, without 'variable' in scope.
1516 Value *StartVal = Start-&gt;Codegen();
1517 if (StartVal == 0) return 0;
1518
1519 // Make the new basic block for the loop header, inserting after current
1520 // block.
1521 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1522 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1523 BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
1524
1525 // Insert an explicit fall through from the current block to the LoopBB.
1526 Builder.CreateBr(LoopBB);
1527
1528 // Start insertion in LoopBB.
1529 Builder.SetInsertPoint(LoopBB);
1530
1531 // Start the PHI node with an entry for Start.
1532 PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
1533 Variable-&gt;addIncoming(StartVal, PreheaderBB);
1534
1535 // Within the loop, the variable is defined equal to the PHI node. If it
1536 // shadows an existing variable, we have to restore it, so save it now.
1537 Value *OldVal = NamedValues[VarName];
1538 NamedValues[VarName] = Variable;
1539
1540 // Emit the body of the loop. This, like any other expr, can change the
1541 // current BB. Note that we ignore the value computed by the body, but don't
1542 // allow an error.
1543 if (Body-&gt;Codegen() == 0)
1544 return 0;
1545
1546 // Emit the step value.
1547 Value *StepVal;
1548 if (Step) {
1549 StepVal = Step-&gt;Codegen();
1550 if (StepVal == 0) return 0;
1551 } else {
1552 // If not specified, use 1.0.
1553 StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
1554 }
1555
1556 Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
1557
1558 // Compute the end condition.
1559 Value *EndCond = End-&gt;Codegen();
1560 if (EndCond == 0) return EndCond;
1561
1562 // Convert condition to a bool by comparing equal to 0.0.
1563 EndCond = Builder.CreateFCmpONE(EndCond,
1564 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1565 "loopcond");
1566
1567 // Create the "after loop" block and insert it.
1568 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1569 BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
1570
1571 // Insert the conditional branch into the end of LoopEndBB.
1572 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1573
1574 // Any new code will be inserted in AfterBB.
1575 Builder.SetInsertPoint(AfterBB);
1576
1577 // Add a new entry to the PHI node for the backedge.
1578 Variable-&gt;addIncoming(NextVar, LoopEndBB);
1579
1580 // Restore the unshadowed variable.
1581 if (OldVal)
1582 NamedValues[VarName] = OldVal;
1583 else
1584 NamedValues.erase(VarName);
1585
1586
1587 // for expr always returns 0.0.
1588 return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
1589 }
1590
1591 Function *PrototypeAST::Codegen() {
1592 // Make the function type: double(double,double) etc.
1593 std::vector&lt;Type*&gt; Doubles(Args.size(),
1594 Type::getDoubleTy(getGlobalContext()));
1595 FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
1596 Doubles, false);
1597
1598 Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
1599
1600 // If F conflicted, there was already something named 'Name'. If it has a
1601 // body, don't allow redefinition or reextern.
1602 if (F-&gt;getName() != Name) {
1603 // Delete the one we just made and get the existing one.
1604 F-&gt;eraseFromParent();
1605 F = TheModule-&gt;getFunction(Name);
1606
1607 // If F already has a body, reject this.
1608 if (!F-&gt;empty()) {
1609 ErrorF("redefinition of function");
1610 return 0;
1611 }
1612
1613 // If F took a different number of args, reject.
1614 if (F-&gt;arg_size() != Args.size()) {
1615 ErrorF("redefinition of function with different # args");
1616 return 0;
1617 }
1618 }
1619
1620 // Set names for all arguments.
1621 unsigned Idx = 0;
1622 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1623 ++AI, ++Idx) {
1624 AI-&gt;setName(Args[Idx]);
1625
1626 // Add arguments to variable symbol table.
1627 NamedValues[Args[Idx]] = AI;
1628 }
1629
1630 return F;
1631 }
1632
1633 Function *FunctionAST::Codegen() {
1634 NamedValues.clear();
1635
1636 Function *TheFunction = Proto-&gt;Codegen();
1637 if (TheFunction == 0)
1638 return 0;
1639
1640 // If this is an operator, install it.
1641 if (Proto-&gt;isBinaryOp())
1642 BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1643
1644 // Create a new basic block to start insertion into.
1645 BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
1646 Builder.SetInsertPoint(BB);
1647
1648 if (Value *RetVal = Body-&gt;Codegen()) {
1649 // Finish off the function.
1650 Builder.CreateRet(RetVal);
1651
1652 // Validate the generated code, checking for consistency.
1653 verifyFunction(*TheFunction);
1654
1655 // Optimize the function.
1656 TheFPM-&gt;run(*TheFunction);
1657
1658 return TheFunction;
1659 }
1660
1661 // Error reading body, remove function.
1662 TheFunction-&gt;eraseFromParent();
1663
1664 if (Proto-&gt;isBinaryOp())
1665 BinopPrecedence.erase(Proto-&gt;getOperatorName());
1666 return 0;
1667 }
1668
1669 //===----------------------------------------------------------------------===//
1670 // Top-Level parsing and JIT Driver
1671 //===----------------------------------------------------------------------===//
1672
1673 static ExecutionEngine *TheExecutionEngine;
1674
1675 static void HandleDefinition() {
1676 if (FunctionAST *F = ParseDefinition()) {
1677 if (Function *LF = F-&gt;Codegen()) {
1678 fprintf(stderr, "Read function definition:");
1679 LF-&gt;dump();
1680 }
1681 } else {
1682 // Skip token for error recovery.
1683 getNextToken();
1684 }
1685 }
1686
1687 static void HandleExtern() {
1688 if (PrototypeAST *P = ParseExtern()) {
1689 if (Function *F = P-&gt;Codegen()) {
1690 fprintf(stderr, "Read extern: ");
1691 F-&gt;dump();
1692 }
1693 } else {
1694 // Skip token for error recovery.
1695 getNextToken();
1696 }
1697 }
1698
1699 static void HandleTopLevelExpression() {
1700 // Evaluate a top-level expression into an anonymous function.
1701 if (FunctionAST *F = ParseTopLevelExpr()) {
1702 if (Function *LF = F-&gt;Codegen()) {
1703 // JIT the function, returning a function pointer.
1704 void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1705
1706 // Cast it to the right type (takes no arguments, returns a double) so we
1707 // can call it as a native function.
1708 double (*FP)() = (double (*)())(intptr_t)FPtr;
1709 fprintf(stderr, "Evaluated to %f\n", FP());
1710 }
1711 } else {
1712 // Skip token for error recovery.
1713 getNextToken();
1714 }
1715 }
1716
1717 /// top ::= definition | external | expression | ';'
1718 static void MainLoop() {
1719 while (1) {
1720 fprintf(stderr, "ready&gt; ");
1721 switch (CurTok) {
1722 case tok_eof: return;
1723 case ';': getNextToken(); break; // ignore top-level semicolons.
1724 case tok_def: HandleDefinition(); break;
1725 case tok_extern: HandleExtern(); break;
1726 default: HandleTopLevelExpression(); break;
1727 }
1728 }
1729 }
1730
1731 //===----------------------------------------------------------------------===//
1732 // "Library" functions that can be "extern'd" from user code.
1733 //===----------------------------------------------------------------------===//
1734
1735 /// putchard - putchar that takes a double and returns 0.
1736 extern "C"
1737 double putchard(double X) {
1738 putchar((char)X);
1739 return 0;
1740 }
1741
1742 /// printd - printf that takes a double prints it as "%f\n", returning 0.
1743 extern "C"
1744 double printd(double X) {
1745 printf("%f\n", X);
1746 return 0;
1747 }
1748
1749 //===----------------------------------------------------------------------===//
1750 // Main driver code.
1751 //===----------------------------------------------------------------------===//
1752
1753 int main() {
1754 InitializeNativeTarget();
1755 LLVMContext &amp;Context = getGlobalContext();
1756
1757 // Install standard binary operators.
1758 // 1 is lowest precedence.
1759 BinopPrecedence['&lt;'] = 10;
1760 BinopPrecedence['+'] = 20;
1761 BinopPrecedence['-'] = 20;
1762 BinopPrecedence['*'] = 40; // highest.
1763
1764 // Prime the first token.
1765 fprintf(stderr, "ready&gt; ");
1766 getNextToken();
1767
1768 // Make the module, which holds all the code.
1769 TheModule = new Module("my cool jit", Context);
1770
1771 // Create the JIT. This takes ownership of the module.
1772 std::string ErrStr;
1773 TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&amp;ErrStr).create();
1774 if (!TheExecutionEngine) {
1775 fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
1776 exit(1);
1777 }
1778
1779 FunctionPassManager OurFPM(TheModule);
1780
1781 // Set up the optimizer pipeline. Start with registering info about how the
1782 // target lays out data structures.
1783 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1784 // Provide basic AliasAnalysis support for GVN.
1785 OurFPM.add(createBasicAliasAnalysisPass());
1786 // Do simple "peephole" optimizations and bit-twiddling optzns.
1787 OurFPM.add(createInstructionCombiningPass());
1788 // Reassociate expressions.
1789 OurFPM.add(createReassociatePass());
1790 // Eliminate Common SubExpressions.
1791 OurFPM.add(createGVNPass());
1792 // Simplify the control flow graph (deleting unreachable blocks, etc).
1793 OurFPM.add(createCFGSimplificationPass());
1794
1795 OurFPM.doInitialization();
1796
1797 // Set the global so the code gen can use this.
1798 TheFPM = &amp;OurFPM;
1799
1800 // Run the main "interpreter loop" now.
1801 MainLoop();
1802
1803 TheFPM = 0;
1804
1805 // Print out all of the generated code.
1806 TheModule-&gt;dump();
1807
1808 return 0;
1809 }
1810 </pre>
1811 </div>
1812
1813 <a href="LangImpl7.html">Next: Extending the language: mutable variables / SSA construction</a>
1814 </div>
1815
1816 <!-- *********************************************************************** -->
1817 <hr>
1818 <address>
1819 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1820 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1821 <a href="http://validator.w3.org/check/referer"><img
1822 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1823
1824 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1825 <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br>
1826 Last modified: $Date$
1827 </address>
1828 </body>
1829 </html>