Commit | Line | Data |
---|---|---|
970d7e83 LB |
1 | ====================================================== |
2 | How to set up LLVM-style RTTI for your class hierarchy | |
3 | ====================================================== | |
4 | ||
5 | .. contents:: | |
6 | ||
7 | Background | |
8 | ========== | |
9 | ||
10 | LLVM avoids using C++'s built-in RTTI. Instead, it pervasively uses its | |
11 | own hand-rolled form of RTTI which is much more efficient and flexible, | |
12 | although it requires a bit more work from you as a class author. | |
13 | ||
14 | A description of how to use LLVM-style RTTI from a client's perspective is | |
15 | given in the `Programmer's Manual <ProgrammersManual.html#isa>`_. This | |
16 | document, in contrast, discusses the steps you need to take as a class | |
17 | hierarchy author to make LLVM-style RTTI available to your clients. | |
18 | ||
19 | Before diving in, make sure that you are familiar with the object-oriented | |
20 | programming concept of "`is-a`_". | |
21 | ||
22 | .. _is-a: http://en.wikipedia.org/wiki/Is-a | |
23 | ||
24 | Basic Setup | |
25 | =========== | |
26 | ||
27 | This section describes how to set up the most basic form of LLVM-style RTTI | |
28 | (which is sufficient for 99.9% of the cases). We will set up LLVM-style | |
29 | RTTI for this class hierarchy: | |
30 | ||
31 | .. code-block:: c++ | |
32 | ||
33 | class Shape { | |
34 | public: | |
35 | Shape() {} | |
36 | virtual double computeArea() = 0; | |
37 | }; | |
38 | ||
39 | class Square : public Shape { | |
40 | double SideLength; | |
41 | public: | |
42 | Square(double S) : SideLength(S) {} | |
43 | double computeArea() /* override */; | |
44 | }; | |
45 | ||
46 | class Circle : public Shape { | |
47 | double Radius; | |
48 | public: | |
49 | Circle(double R) : Radius(R) {} | |
50 | double computeArea() /* override */; | |
51 | }; | |
52 | ||
53 | The most basic working setup for LLVM-style RTTI requires the following | |
54 | steps: | |
55 | ||
56 | #. In the header where you declare ``Shape``, you will want to ``#include | |
57 | "llvm/Support/Casting.h"``, which declares LLVM's RTTI templates. That | |
58 | way your clients don't even have to think about it. | |
59 | ||
60 | .. code-block:: c++ | |
61 | ||
62 | #include "llvm/Support/Casting.h" | |
63 | ||
64 | #. In the base class, introduce an enum which discriminates all of the | |
65 | different concrete classes in the hierarchy, and stash the enum value | |
66 | somewhere in the base class. | |
67 | ||
68 | Here is the code after introducing this change: | |
69 | ||
70 | .. code-block:: c++ | |
71 | ||
72 | class Shape { | |
73 | public: | |
74 | + /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | |
75 | + enum ShapeKind { | |
76 | + SK_Square, | |
77 | + SK_Circle | |
78 | + }; | |
79 | +private: | |
80 | + const ShapeKind Kind; | |
81 | +public: | |
82 | + ShapeKind getKind() const { return Kind; } | |
83 | + | |
84 | Shape() {} | |
85 | virtual double computeArea() = 0; | |
86 | }; | |
87 | ||
88 | You will usually want to keep the ``Kind`` member encapsulated and | |
89 | private, but let the enum ``ShapeKind`` be public along with providing a | |
90 | ``getKind()`` method. This is convenient for clients so that they can do | |
91 | a ``switch`` over the enum. | |
92 | ||
93 | A common naming convention is that these enums are "kind"s, to avoid | |
94 | ambiguity with the words "type" or "class" which have overloaded meanings | |
95 | in many contexts within LLVM. Sometimes there will be a natural name for | |
96 | it, like "opcode". Don't bikeshed over this; when in doubt use ``Kind``. | |
97 | ||
98 | You might wonder why the ``Kind`` enum doesn't have an entry for | |
99 | ``Shape``. The reason for this is that since ``Shape`` is abstract | |
100 | (``computeArea() = 0;``), you will never actually have non-derived | |
101 | instances of exactly that class (only subclasses). See `Concrete Bases | |
102 | and Deeper Hierarchies`_ for information on how to deal with | |
103 | non-abstract bases. It's worth mentioning here that unlike | |
104 | ``dynamic_cast<>``, LLVM-style RTTI can be used (and is often used) for | |
105 | classes that don't have v-tables. | |
106 | ||
107 | #. Next, you need to make sure that the ``Kind`` gets initialized to the | |
108 | value corresponding to the dynamic type of the class. Typically, you will | |
109 | want to have it be an argument to the constructor of the base class, and | |
110 | then pass in the respective ``XXXKind`` from subclass constructors. | |
111 | ||
112 | Here is the code after that change: | |
113 | ||
114 | .. code-block:: c++ | |
115 | ||
116 | class Shape { | |
117 | public: | |
118 | /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | |
119 | enum ShapeKind { | |
120 | SK_Square, | |
121 | SK_Circle | |
122 | }; | |
123 | private: | |
124 | const ShapeKind Kind; | |
125 | public: | |
126 | ShapeKind getKind() const { return Kind; } | |
127 | ||
128 | - Shape() {} | |
129 | + Shape(ShapeKind K) : Kind(K) {} | |
130 | virtual double computeArea() = 0; | |
131 | }; | |
132 | ||
133 | class Square : public Shape { | |
134 | double SideLength; | |
135 | public: | |
136 | - Square(double S) : SideLength(S) {} | |
137 | + Square(double S) : Shape(SK_Square), SideLength(S) {} | |
138 | double computeArea() /* override */; | |
139 | }; | |
140 | ||
141 | class Circle : public Shape { | |
142 | double Radius; | |
143 | public: | |
144 | - Circle(double R) : Radius(R) {} | |
145 | + Circle(double R) : Shape(SK_Circle), Radius(R) {} | |
146 | double computeArea() /* override */; | |
147 | }; | |
148 | ||
149 | #. Finally, you need to inform LLVM's RTTI templates how to dynamically | |
150 | determine the type of a class (i.e. whether the ``isa<>``/``dyn_cast<>`` | |
151 | should succeed). The default "99.9% of use cases" way to accomplish this | |
152 | is through a small static member function ``classof``. In order to have | |
153 | proper context for an explanation, we will display this code first, and | |
154 | then below describe each part: | |
155 | ||
156 | .. code-block:: c++ | |
157 | ||
158 | class Shape { | |
159 | public: | |
160 | /// Discriminator for LLVM-style RTTI (dyn_cast<> et al.) | |
161 | enum ShapeKind { | |
162 | SK_Square, | |
163 | SK_Circle | |
164 | }; | |
165 | private: | |
166 | const ShapeKind Kind; | |
167 | public: | |
168 | ShapeKind getKind() const { return Kind; } | |
169 | ||
170 | Shape(ShapeKind K) : Kind(K) {} | |
171 | virtual double computeArea() = 0; | |
172 | }; | |
173 | ||
174 | class Square : public Shape { | |
175 | double SideLength; | |
176 | public: | |
177 | Square(double S) : Shape(SK_Square), SideLength(S) {} | |
178 | double computeArea() /* override */; | |
179 | + | |
180 | + static bool classof(const Shape *S) { | |
181 | + return S->getKind() == SK_Square; | |
182 | + } | |
183 | }; | |
184 | ||
185 | class Circle : public Shape { | |
186 | double Radius; | |
187 | public: | |
188 | Circle(double R) : Shape(SK_Circle), Radius(R) {} | |
189 | double computeArea() /* override */; | |
190 | + | |
191 | + static bool classof(const Shape *S) { | |
192 | + return S->getKind() == SK_Circle; | |
193 | + } | |
194 | }; | |
195 | ||
196 | The job of ``classof`` is to dynamically determine whether an object of | |
197 | a base class is in fact of a particular derived class. In order to | |
198 | downcast a type ``Base`` to a type ``Derived``, there needs to be a | |
199 | ``classof`` in ``Derived`` which will accept an object of type ``Base``. | |
200 | ||
201 | To be concrete, consider the following code: | |
202 | ||
203 | .. code-block:: c++ | |
204 | ||
205 | Shape *S = ...; | |
206 | if (isa<Circle>(S)) { | |
207 | /* do something ... */ | |
208 | } | |
209 | ||
210 | The ``isa<>`` test in this code will eventually boil | |
211 | down---after template instantiation and some other machinery---to a | |
212 | check roughly like ``Circle::classof(S)``. For more information, see | |
213 | :ref:`classof-contract`. | |
214 | ||
215 | The argument to ``classof`` should always be an *ancestor* class because | |
216 | the implementation has logic to allow and optimize away | |
217 | upcasts/up-``isa<>``'s automatically. It is as though every class | |
218 | ``Foo`` automatically has a ``classof`` like: | |
219 | ||
220 | .. code-block:: c++ | |
221 | ||
222 | class Foo { | |
223 | [...] | |
224 | template <class T> | |
225 | static bool classof(const T *, | |
226 | typename ::std::enable_if< | |
227 | ::std::is_base_of<Foo, T>::value | |
228 | >::type* = 0) { return true; } | |
229 | [...] | |
230 | }; | |
231 | ||
232 | Note that this is the reason that we did not need to introduce a | |
233 | ``classof`` into ``Shape``: all relevant classes derive from ``Shape``, | |
234 | and ``Shape`` itself is abstract (has no entry in the ``Kind`` enum), | |
235 | so this notional inferred ``classof`` is all we need. See `Concrete | |
236 | Bases and Deeper Hierarchies`_ for more information about how to extend | |
237 | this example to more general hierarchies. | |
238 | ||
239 | Although for this small example setting up LLVM-style RTTI seems like a lot | |
240 | of "boilerplate", if your classes are doing anything interesting then this | |
241 | will end up being a tiny fraction of the code. | |
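The four steps above can be put together in one self-contained sketch. To keep it compilable without LLVM, it does not include ``llvm/Support/Casting.h``; the ``sketch::isa`` and ``sketch::dyn_cast`` templates are simplified, hypothetical stand-ins that only forward to ``classof`` and omit everything else the real templates do (upcast handling, references, and so on). The ``kindName`` and ``sideLengthOrNegative`` helpers and the ``getSideLength`` accessor are likewise assumptions added for illustration.

```cpp
#include <cassert>

namespace sketch {
// Simplified stand-ins for LLVM's isa<> and dyn_cast<> (assumption: the real
// templates do much more; these only forward to To::classof).
template <typename To, typename From>
bool isa(const From *Val) { return To::classof(Val); }

template <typename To, typename From>
To *dyn_cast(From *Val) {
  return isa<To>(Val) ? static_cast<To *>(Val) : nullptr;
}
} // namespace sketch

class Shape {
public:
  /// Discriminator for LLVM-style RTTI.
  enum ShapeKind { SK_Square, SK_Circle };
private:
  const ShapeKind Kind;
public:
  ShapeKind getKind() const { return Kind; }
  Shape(ShapeKind K) : Kind(K) {}
  virtual ~Shape() = default;
  virtual double computeArea() = 0;
};

class Square : public Shape {
  double SideLength;
public:
  Square(double S) : Shape(SK_Square), SideLength(S) {}
  double computeArea() override { return SideLength * SideLength; }
  double getSideLength() const { return SideLength; } // hypothetical accessor
  static bool classof(const Shape *S) { return S->getKind() == SK_Square; }
};

class Circle : public Shape {
  double Radius;
public:
  Circle(double R) : Shape(SK_Circle), Radius(R) {}
  double computeArea() override { return 3.141592653589793 * Radius * Radius; }
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};

// Hypothetical client: a switch over the public enum via getKind().
const char *kindName(const Shape &S) {
  switch (S.getKind()) {
  case Shape::SK_Square: return "square";
  case Shape::SK_Circle: return "circle";
  }
  return "unknown";
}

// Hypothetical client: a checked downcast that yields nullptr on mismatch.
double sideLengthOrNegative(Shape *S) {
  if (Square *Sq = sketch::dyn_cast<Square>(S))
    return Sq->getSideLength();
  return -1.0;
}
```

In real code the clients would call ``llvm::isa<>`` and ``llvm::dyn_cast<>`` directly; only the ``classof`` functions and the ``Kind`` plumbing are the class author's responsibility.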
242 | ||
243 | Concrete Bases and Deeper Hierarchies | |
244 | ===================================== | |
245 | ||
246 | For concrete bases (i.e. non-abstract interior nodes of the inheritance | |
247 | tree), the ``Kind`` check inside ``classof`` needs to be a bit more | |
248 | complicated. The situation differs from the example above in that | |
249 | ||
250 | * Since the class is concrete, it must itself have an entry in the ``Kind`` | |
251 | enum because it is possible to have objects with this class as a dynamic | |
252 | type. | |
253 | ||
254 | * Since the class has children, the check inside ``classof`` must take them | |
255 | into account. | |
256 | ||
257 | Say that ``SpecialSquare`` and ``OtherSpecialSquare`` derive | |
258 | from ``Square``, and so ``ShapeKind`` becomes: | |
259 | ||
260 | .. code-block:: c++ | |
261 | ||
262 | enum ShapeKind { | |
263 | SK_Square, | |
264 | + SK_SpecialSquare, | |
265 | + SK_OtherSpecialSquare, | |
266 | SK_Circle | |
267 | }; | |
268 | ||
269 | Then in ``Square``, we would need to modify the ``classof`` like so: | |
270 | ||
271 | .. code-block:: c++ | |
272 | ||
273 | - static bool classof(const Shape *S) { | |
274 | - return S->getKind() == SK_Square; | |
275 | - } | |
276 | + static bool classof(const Shape *S) { | |
277 | + return S->getKind() >= SK_Square && | |
278 | + S->getKind() <= SK_OtherSpecialSquare; | |
279 | + } | |
280 | ||
281 | The reason that we need to test a range like this instead of just equality | |
282 | is that both ``SpecialSquare`` and ``OtherSpecialSquare`` "is-a" | |
283 | ``Square``, and so ``classof`` needs to return ``true`` for them. | |
284 | ||
285 | This approach can be made to scale to arbitrarily deep hierarchies. The | |
286 | trick is that you arrange the enum values so that they correspond to a | |
287 | preorder traversal of the class hierarchy tree. With that arrangement, all | |
288 | subclass tests can be done with two comparisons as shown above. If you just | |
289 | list the class hierarchy like a list of bullet points, you'll get the | |
290 | ordering right:: | |
291 | ||
292 | | Shape | |
293 | |   Square | |
294 | |     SpecialSquare | |
295 | |     OtherSpecialSquare | |
296 | |   Circle | |
297 | ||
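As a minimal sketch of the preorder arrangement, the hierarchy listed above could be written as follows. The protected ``Square(ShapeKind)`` constructor and the ``isSquare`` helper are assumptions introduced for illustration; the document does not prescribe how subclasses forward their kind. The classes deliberately have no virtual functions, since LLVM-style RTTI does not need a v-table.

```cpp
class Shape {
public:
  // Enumerators follow a preorder traversal of the class hierarchy.
  enum ShapeKind {
    SK_Square,
    SK_SpecialSquare,
    SK_OtherSpecialSquare,
    SK_Circle
  };
  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }
private:
  const ShapeKind Kind;
};

class Square : public Shape {
public:
  Square() : Shape(SK_Square) {}
  // Concrete base: the range starts at Square's own kind and ends at the
  // kind of its last subclass.
  static bool classof(const Shape *S) {
    return S->getKind() >= SK_Square &&
           S->getKind() <= SK_OtherSpecialSquare;
  }
protected:
  // Hypothetical forwarding constructor so subclasses can set their own kind.
  explicit Square(ShapeKind K) : Shape(K) {}
};

class SpecialSquare : public Square {
public:
  SpecialSquare() : Square(SK_SpecialSquare) {}
  static bool classof(const Shape *S) {
    return S->getKind() == SK_SpecialSquare;
  }
};

class OtherSpecialSquare : public Square {
public:
  OtherSpecialSquare() : Square(SK_OtherSpecialSquare) {}
  static bool classof(const Shape *S) {
    return S->getKind() == SK_OtherSpecialSquare;
  }
};

class Circle : public Shape {
public:
  Circle() : Shape(SK_Circle) {}
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};

// Hypothetical helper: true for Square and every class derived from it.
bool isSquare(const Shape &S) { return Square::classof(&S); }
```

Because each subtree occupies a contiguous run of enumerators, the subclass test stays at two comparisons no matter how many classes sit below ``Square``.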
298 | A Bug to be Aware Of | |
299 | -------------------- | |
300 | ||
301 | The example just given opens the door to bugs where the ``classof``\s are | |
302 | not updated to match the ``Kind`` enum when classes are added to or removed | |
303 | from the hierarchy. | |
304 | ||
305 | Continuing the example above, suppose we add a ``SomewhatSpecialSquare`` as | |
306 | a subclass of ``Square``, and update the ``ShapeKind`` enum like so: | |
307 | ||
308 | .. code-block:: c++ | |
309 | ||
310 | enum ShapeKind { | |
311 | SK_Square, | |
312 | SK_SpecialSquare, | |
313 | SK_OtherSpecialSquare, | |
314 | + SK_SomewhatSpecialSquare, | |
315 | SK_Circle | |
316 | }; | |
317 | ||
318 | Now, suppose that we forget to update ``Square::classof()``, so it still | |
319 | looks like: | |
320 | ||
321 | .. code-block:: c++ | |
322 | ||
323 | static bool classof(const Shape *S) { | |
324 | // BUG: Returns false when S->getKind() == SK_SomewhatSpecialSquare, | |
325 | // even though SomewhatSpecialSquare "is a" Square. | |
326 | return S->getKind() >= SK_Square && | |
327 | S->getKind() <= SK_OtherSpecialSquare; | |
328 | } | |
329 | ||
330 | As the comment indicates, this code contains a bug. A straightforward and | |
331 | non-clever way to avoid this is to introduce an explicit ``SK_LastSquare`` | |
332 | entry in the enum when adding the first subclass(es). For example, we could | |
333 | rewrite the example at the beginning of `Concrete Bases and Deeper | |
334 | Hierarchies`_ as: | |
335 | ||
336 | .. code-block:: c++ | |
337 | ||
338 | enum ShapeKind { | |
339 | SK_Square, | |
340 | + SK_SpecialSquare, | |
341 | + SK_OtherSpecialSquare, | |
342 | + SK_LastSquare, | |
343 | SK_Circle | |
344 | }; | |
345 | ... | |
346 | // Square::classof() | |
347 | - static bool classof(const Shape *S) { | |
348 | - return S->getKind() == SK_Square; | |
349 | - } | |
350 | + static bool classof(const Shape *S) { | |
351 | + return S->getKind() >= SK_Square && | |
352 | + S->getKind() <= SK_LastSquare; | |
353 | + } | |
354 | ||
355 | Then, adding new subclasses is easy: | |
356 | ||
357 | .. code-block:: c++ | |
358 | ||
359 | enum ShapeKind { | |
360 | SK_Square, | |
361 | SK_SpecialSquare, | |
362 | SK_OtherSpecialSquare, | |
363 | + SK_SomewhatSpecialSquare, | |
364 | SK_LastSquare, | |
365 | SK_Circle | |
366 | }; | |
367 | ||
368 | Notice that ``Square::classof`` does not need to be changed. | |
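A compilable sketch of the sentinel approach, under the assumption that subclasses forward their kind through a protected ``Square(ShapeKind)`` constructor (a detail this document does not specify), might look like this. Only ``SomewhatSpecialSquare`` is spelled out; the other subclasses would follow the same pattern.

```cpp
class Shape {
public:
  enum ShapeKind {
    SK_Square,
    SK_SpecialSquare,
    SK_OtherSpecialSquare,
    SK_SomewhatSpecialSquare, // newly added, placed before the sentinel
    SK_LastSquare,            // sentinel: marks the end of the Square range
    SK_Circle
  };
  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }
private:
  const ShapeKind Kind;
};

class Square : public Shape {
public:
  Square() : Shape(SK_Square) {}
  // Written once against the sentinel; unchanged when subclasses are added.
  static bool classof(const Shape *S) {
    return S->getKind() >= SK_Square &&
           S->getKind() <= SK_LastSquare;
  }
protected:
  // Hypothetical forwarding constructor for subclasses.
  explicit Square(ShapeKind K) : Shape(K) {}
};

class SomewhatSpecialSquare : public Square {
public:
  SomewhatSpecialSquare() : Square(SK_SomewhatSpecialSquare) {}
  static bool classof(const Shape *S) {
    return S->getKind() == SK_SomewhatSpecialSquare;
  }
};

class Circle : public Shape {
public:
  Circle() : Shape(SK_Circle) {}
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};
```

The sentinel is not the kind of any class, so no object should ever be constructed with ``SK_LastSquare``; it exists only to bound the range.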
369 | ||
370 | .. _classof-contract: | |
371 | ||
372 | The Contract of ``classof`` | |
373 | --------------------------- | |
374 | ||
375 | To be more precise, let ``classof`` be inside a class ``C``. Then the | |
376 | contract for ``classof`` is "return ``true`` if the dynamic type of the | |
377 | argument is-a ``C``". As long as your implementation fulfills this | |
378 | contract, you can tweak and optimize it as much as you want. | |
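To illustrate that the contract constrains only the result and not the implementation, here is one possible alternative to the range check: a ``switch`` that names every kind explicitly. This is a sketch, not a recommendation from this document, and the protected forwarding constructor is again an assumption.

```cpp
class Shape {
public:
  enum ShapeKind {
    SK_Square,
    SK_SpecialSquare,
    SK_OtherSpecialSquare,
    SK_Circle
  };
  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }
private:
  const ShapeKind Kind;
};

class Square : public Shape {
public:
  Square() : Shape(SK_Square) {}
  // Same contract as the range check: true exactly when the dynamic type
  // of *S is-a Square.
  static bool classof(const Shape *S) {
    switch (S->getKind()) {
    case SK_Square:
    case SK_SpecialSquare:
    case SK_OtherSpecialSquare:
      return true;
    case SK_Circle:
      return false;
    }
    return false; // Not reached for valid kinds; keeps every path returning.
  }
protected:
  // Hypothetical forwarding constructor for subclasses.
  explicit Square(ShapeKind K) : Shape(K) {}
};

class SpecialSquare : public Square {
public:
  SpecialSquare() : Square(SK_SpecialSquare) {}
  static bool classof(const Shape *S) {
    return S->getKind() == SK_SpecialSquare;
  }
};

class Circle : public Shape {
public:
  Circle() : Shape(SK_Circle) {}
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};
```

One trade-off of this form: because the ``switch`` has no ``default`` label, compilers that warn about unhandled enumerators (for example ``-Wswitch`` in GCC and Clang) can flag a newly added kind, at the cost of no longer being a two-comparison test.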
379 | ||
380 | .. TODO:: | |
381 | ||
382 | Touch on some of the more advanced features, like ``isa_impl`` and | |
383 | ``simplify_type``. However, those two need reference documentation in | |
384 | the form of doxygen comments as well. We need the doxygen so that we can | |
385 | say "for full details, see http://llvm.org/doxygen/..." | |
386 | ||
387 | Rules of Thumb | |
388 | ============== | |
389 | ||
390 | #. The ``Kind`` enum should have one entry per concrete class, ordered | |
391 | according to a preorder traversal of the inheritance tree. | |
392 | #. The argument to ``classof`` should be a ``const Base *``, where ``Base`` | |
393 | is some ancestor in the inheritance hierarchy. The argument should | |
394 | *never* be a derived class or the class itself: the template machinery | |
395 | for ``isa<>`` already handles this case and optimizes it. | |
396 | #. For each class in the hierarchy that has no children, implement a | |
397 | ``classof`` that checks only against its ``Kind``. | |
398 | #. For each class in the hierarchy that has children, implement a | |
399 | ``classof`` that checks a range from the class's own ``Kind`` (or, if the class is | |
400 | abstract, its first child's ``Kind``) through its last descendant's ``Kind``. |