Commit | Line | Data |
---|---|---|
970d7e83 LB |
1 | ===================== |
2 | YAML I/O | |
3 | ===================== | |
4 | ||
5 | .. contents:: | |
6 | :local: | |
7 | ||
8 | Introduction to YAML | |
9 | ==================== | |
10 | ||
11 | YAML is a human readable data serialization language. The full YAML language | |
12 | spec can be read at `yaml.org | |
13 | <http://www.yaml.org/spec/1.2/spec.html#Introduction>`_. The simplest form of | |
14 | YAML is just "scalars", "mappings", and "sequences". A scalar is any number | |
15 | or string. The pound/hash symbol (#) begins a comment line. A mapping is | |
16 | a set of key-value pairs where the key ends with a colon. For example: | |
17 | ||
18 | .. code-block:: yaml | |
19 | ||
20 | # a mapping | |
21 | name: Tom | |
22 | hat-size: 7 | |
23 | ||
24 | A sequence is a list of items where each item starts with a leading dash ('-'). | |
25 | For example: | |
26 | ||
27 | .. code-block:: yaml | |
28 | ||
29 | # a sequence | |
30 | - x86 | |
31 | - x86_64 | |
32 | - PowerPC | |
33 | ||
34 | You can combine mappings and sequences by indenting. For example a sequence | |
35 | of mappings in which one of the mapping values is itself a sequence: | |
36 | ||
37 | .. code-block:: yaml | |
38 | ||
39 | # a sequence of mappings with one key's value being a sequence | |
40 | - name: Tom | |
41 | cpus: | |
42 | - x86 | |
43 | - x86_64 | |
44 | - name: Bob | |
45 | cpus: | |
46 | - x86 | |
47 | - name: Dan | |
48 | cpus: | |
49 | - PowerPC | |
50 | - x86 | |
51 | ||
52 | Sometimes sequences are known to be short and the one entry per line is too | |
53 | verbose, so YAML offers an alternate syntax for sequences called a "Flow | |
54 | Sequence" in which you put comma separated sequence elements into square | |
55 | brackets. The above example could then be simplified to: | |
56 | ||
57 | ||
58 | .. code-block:: yaml | |
59 | ||
60 | # a sequence of mappings with one key's value being a flow sequence | |
61 | - name: Tom | |
62 | cpus: [ x86, x86_64 ] | |
63 | - name: Bob | |
64 | cpus: [ x86 ] | |
65 | - name: Dan | |
66 | cpus: [ PowerPC, x86 ] | |
67 | ||
68 | ||
69 | Introduction to YAML I/O | |
70 | ======================== | |
71 | ||
72 | The use of indenting makes the YAML easy for a human to read and understand, | |
73 | but having a program read and write YAML involves a lot of tedious details. | |
74 | The YAML I/O library structures and simplifies reading and writing YAML | |
75 | documents. | |
76 | ||
77 | YAML I/O assumes you have some "native" data structures which you want to be | |
78 | able to dump as YAML and recreate from YAML. The first step is to try | |
79 | writing example YAML for your data structures. You may find after looking at | |
80 | possible YAML representations that a direct mapping of your data structures | |
81 | to YAML is not very readable. Often the fields are not in the order that | |
82 | a human would find readable. Or the same information is replicated in multiple | |
83 | locations, making it hard for a human to write such YAML correctly. | |
84 | ||
85 | In relational database theory there is a design step called normalization in | |
86 | which you reorganize fields and tables. The same considerations need to | |
87 | go into the design of your YAML encoding. But, you may not want to change | |
88 | your existing native data structures. Therefore, when writing out YAML | |
89 | there may be a normalization step, and when reading YAML there would be a | |
90 | corresponding denormalization step. | |
91 | ||
92 | YAML I/O uses a non-invasive, traits based design. YAML I/O defines some | |
93 | abstract base templates. You specialize those templates on your data types. | |
94 | For instance, if you have an enumerated type FooBar you could specialize | |
95 | ScalarEnumerationTraits on that type and define the enumeration() method: | |
96 | ||
97 | .. code-block:: c++ | |
98 | ||
99 | using llvm::yaml::ScalarEnumerationTraits; | |
100 | using llvm::yaml::IO; | |
101 | ||
102 | template <> | |
103 | struct ScalarEnumerationTraits<FooBar> { | |
104 | static void enumeration(IO &io, FooBar &value) { | |
105 | ... | |
106 | } | |
107 | }; | |
108 | ||
109 | ||
110 | As with all YAML I/O template specializations, the ScalarEnumerationTraits is used for | |
111 | both reading and writing YAML. That is, the mapping between in-memory enum | |
1a4d82fc | 112 | values and the YAML string representation is only in one place. |
970d7e83 LB |
113 | This assures that the code for writing and parsing of YAML stays in sync. |
114 | ||
115 | To specify a YAML mapping, you define a specialization on | |
116 | llvm::yaml::MappingTraits. | |
117 | If your native data structure happens to be a struct that is already normalized, | |
118 | then the specialization is simple. For example: | |
119 | ||
120 | .. code-block:: c++ | |
121 | ||
122 | using llvm::yaml::MappingTraits; | |
123 | using llvm::yaml::IO; | |
124 | ||
125 | template <> | |
126 | struct MappingTraits<Person> { | |
127 | static void mapping(IO &io, Person &info) { | |
128 | io.mapRequired("name", info.name); | |
129 | io.mapOptional("hat-size", info.hatSize); | |
130 | } | |
131 | }; | |
132 | ||
133 | ||
134 | A YAML sequence is automatically inferred if your data type has begin()/end() | |
135 | iterators and a push_back() method. Therefore any of the STL containers | |
136 | (such as std::vector<>) will automatically translate to YAML sequences. | |
137 | ||
138 | Once you have defined specializations for your data types, you can | |
139 | programmatically use YAML I/O to write a YAML document: | |
140 | ||
141 | .. code-block:: c++ | |
142 | ||
143 | using llvm::yaml::Output; | |
144 | ||
145 | Person tom; | |
146 | tom.name = "Tom"; | |
147 | tom.hatSize = 8; | |
148 | Person dan; | |
149 | dan.name = "Dan"; | |
150 | dan.hatSize = 7; | |
151 | std::vector<Person> persons; | |
152 | persons.push_back(tom); | |
153 | persons.push_back(dan); | |
154 | ||
155 | Output yout(llvm::outs()); | |
156 | yout << persons; | |
157 | ||
158 | This would write the following: | |
159 | ||
160 | .. code-block:: yaml | |
161 | ||
162 | - name: Tom | |
163 | hat-size: 8 | |
164 | - name: Dan | |
165 | hat-size: 7 | |
166 | ||
167 | And you can also read such YAML documents with the following code: | |
168 | ||
169 | .. code-block:: c++ | |
170 | ||
171 | using llvm::yaml::Input; | |
172 | ||
173 | typedef std::vector<Person> PersonList; | |
174 | std::vector<PersonList> docs; | |
175 | ||
176 | Input yin(document.getBuffer()); | |
177 | yin >> docs; | |
178 | ||
179 | if ( yin.error() ) | |
180 | return; | |
181 | ||
182 | // Process read document | |
183 | for ( PersonList &pl : docs ) { | |
184 | for ( Person &person : pl ) { | |
185 | llvm::outs() << "name=" << person.name << "\n"; | |
186 | } | |
187 | } | |
188 | ||
189 | One other feature of YAML is the ability to define multiple documents in a | |
190 | single file. That is why reading YAML produces a vector of your document type. | |
191 | ||
192 | ||
193 | ||
194 | Error Handling | |
195 | ============== | |
196 | ||
197 | When parsing a YAML document, if the input does not match your schema (as | |
198 | expressed in your XxxTraits<> specializations), YAML I/O | |
199 | will print out an error message and your Input object's error() method will | |
200 | return true. For instance the following document: | |
201 | ||
202 | .. code-block:: yaml | |
203 | ||
204 | - name: Tom | |
205 | shoe-size: 12 | |
206 | - name: Dan | |
207 | hat-size: 7 | |
208 | ||
209 | has a key (shoe-size) that is not defined in the schema. YAML I/O will | |
210 | automatically generate this error: | |
211 | ||
212 | .. code-block:: yaml | |
213 | ||
214 | YAML:2:2: error: unknown key 'shoe-size' | |
215 | shoe-size: 12 | |
216 | ^~~~~~~~~ | |
217 | ||
218 | Similar errors are produced for other input not conforming to the schema. | |
219 | ||
220 | ||
221 | Scalars | |
222 | ======= | |
223 | ||
224 | YAML scalars are just strings (i.e. not a sequence or mapping). The YAML I/O | |
225 | library provides support for translating between YAML scalars and specific | |
226 | C++ types. | |
227 | ||
228 | ||
229 | Built-in types | |
230 | -------------- | |
231 | The following types have built-in support in YAML I/O: | |
232 | ||
233 | * bool | |
234 | * float | |
235 | * double | |
236 | * StringRef | |
1a4d82fc | 237 | * std::string |
970d7e83 LB |
238 | * int64_t |
239 | * int32_t | |
240 | * int16_t | |
241 | * int8_t | |
242 | * uint64_t | |
243 | * uint32_t | |
244 | * uint16_t | |
245 | * uint8_t | |
246 | ||
247 | That is, you can use those types in fields of MappingTraits or as the element | |
248 | type of a sequence. When reading, YAML I/O will validate that the string found | |
249 | is convertible to that type and error out if not. | |
250 | ||
251 | ||
252 | Unique types | |
253 | ------------ | |
254 | Given that YAML I/O is trait based, the selection of how to convert your data | |
255 | to YAML is based on the type of your data. But in C++ type matching, typedefs | |
256 | do not generate unique type names. That means if you have two typedefs of | |
257 | unsigned int, to YAML I/O both types look exactly like unsigned int. To | |
258 | facilitate making unique type names, YAML I/O provides a macro which is used | |
259 | like a typedef on built-in types, but expands to create a class with conversion | |
260 | operators to and from the base type. For example: | |
261 | ||
262 | .. code-block:: c++ | |
263 | ||
264 | LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFooFlags) | |
265 | LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyBarFlags) | |
266 | ||
267 | This generates two classes MyFooFlags and MyBarFlags which you can use in your | |
268 | native data structures instead of uint32_t. They are implicitly | |
269 | converted to and from uint32_t. The point of creating these unique types | |
270 | is that you can now specify traits on them to get different YAML conversions. | |
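The difference between a plain typedef and a distinct class can be checked with the standard type traits. The following is a minimal sketch that does not use the macro itself: FooAlias, BarAlias, FooFlags, BarFlags and combine() are hypothetical names, and the hand-written structs only approximate what the text says LLVM_YAML_STRONG_TYPEDEF generates.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Two typedefs of the same built-in type name one and the same type, so a
// trait specialized on one of them would also apply to the other.
typedef uint32_t FooAlias;
typedef uint32_t BarAlias;
static_assert(std::is_same<FooAlias, BarAlias>::value,
              "typedefs of one type are the same type");

// Hypothetical hand-written wrappers (an approximation, not the macro's
// actual expansion): distinct classes that convert to and from uint32_t.
struct FooFlags {
  FooFlags(uint32_t v = 0) : value(v) {}
  operator uint32_t() const { return value; }
  uint32_t value;
};

struct BarFlags {
  BarFlags(uint32_t v = 0) : value(v) {}
  operator uint32_t() const { return value; }
  uint32_t value;
};

static_assert(!std::is_same<FooFlags, BarFlags>::value,
              "wrapper classes are distinct types");

// Both wrappers convert implicitly to uint32_t, and the result converts
// back, so existing integer code keeps working.
inline uint32_t combine(FooFlags foo, BarFlags bar) { return foo | bar; }
```

Because FooFlags and BarFlags are different types, a traits template can be specialized separately on each of them, which is not possible for the two aliases.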
271 | ||
272 | Hex types | |
273 | --------- | |
274 | An example use of a unique type is that YAML I/O provides fixed sized unsigned | |
275 | integers that are written with YAML I/O as hexadecimal instead of the decimal | |
276 | format used by the built-in integer types: | |
277 | ||
278 | * Hex64 | |
279 | * Hex32 | |
280 | * Hex16 | |
281 | * Hex8 | |
282 | ||
283 | You can use llvm::yaml::Hex32 instead of uint32_t and the only difference will | |
284 | be that when YAML I/O writes out that type it will be formatted in hexadecimal. | |
285 | ||
286 | ||
287 | ScalarEnumerationTraits | |
288 | ----------------------- | |
289 | YAML I/O supports translating between in-memory enumerations and a set of string | |
290 | values in YAML documents. This is done by specializing ScalarEnumerationTraits<> | |
291 | on your enumeration type and defining an enumeration() method. | |
292 | For instance, suppose you had an enumeration of CPUs and a struct with it as | |
293 | a field: | |
294 | ||
295 | .. code-block:: c++ | |
296 | ||
297 | enum CPUs { | |
298 | cpu_x86_64 = 5, | |
299 | cpu_x86 = 7, | |
300 | cpu_PowerPC = 8 | |
301 | }; | |
302 | ||
303 | struct Info { | |
304 | CPUs cpu; | |
305 | uint32_t flags; | |
306 | }; | |
307 | ||
308 | To support reading and writing of this enumeration, you can define a | |
309 | ScalarEnumerationTraits specialization on CPUs, which can then be used | |
310 | as a field type: | |
311 | ||
312 | .. code-block:: c++ | |
313 | ||
314 | using llvm::yaml::ScalarEnumerationTraits; | |
315 | using llvm::yaml::MappingTraits; | |
316 | using llvm::yaml::IO; | |
317 | ||
318 | template <> | |
319 | struct ScalarEnumerationTraits<CPUs> { | |
320 | static void enumeration(IO &io, CPUs &value) { | |
321 | io.enumCase(value, "x86_64", cpu_x86_64); | |
322 | io.enumCase(value, "x86", cpu_x86); | |
323 | io.enumCase(value, "PowerPC", cpu_PowerPC); | |
324 | } | |
325 | }; | |
326 | ||
327 | template <> | |
328 | struct MappingTraits<Info> { | |
329 | static void mapping(IO &io, Info &info) { | |
330 | io.mapRequired("cpu", info.cpu); | |
331 | io.mapOptional("flags", info.flags, 0); | |
332 | } | |
333 | }; | |
334 | ||
335 | When reading YAML, if the string found does not match any of the strings | |
336 | specified by enumCase() methods, an error is automatically generated. | |
337 | When writing YAML, if the value being written does not match any of the values | |
338 | specified by the enumCase() methods, a runtime assertion is triggered. | |
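The idea of keeping the name/value pairs in one place and using them for both directions can be sketched without the library. This is only an illustration of the principle, not how YAML I/O is implemented: CPUName, cpuNames, parseCPU() and printCPU() are hypothetical names, and the table stands in for the enumCase() calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum CPUs { cpu_x86_64 = 5, cpu_x86 = 7, cpu_PowerPC = 8 };

// Hypothetical table playing the role of the enumCase() calls: every
// name/value pair is listed exactly once.
struct CPUName {
  const char *name;
  CPUs value;
};

static const CPUName cpuNames[] = {
    {"x86_64", cpu_x86_64}, {"x86", cpu_x86}, {"PowerPC", cpu_PowerPC}};
static const std::size_t cpuNameCount = sizeof(cpuNames) / sizeof(cpuNames[0]);

// Reading direction: returns false when the string matches no listed name,
// which corresponds to the error case described above.
bool parseCPU(const char *text, CPUs &out) {
  for (std::size_t i = 0; i < cpuNameCount; ++i) {
    if (std::strcmp(text, cpuNames[i].name) == 0) {
      out = cpuNames[i].value;
      return true;
    }
  }
  return false;
}

// Writing direction: returns nullptr when the value matches no listed
// value, which corresponds to the assertion case described above.
const char *printCPU(CPUs value) {
  for (std::size_t i = 0; i < cpuNameCount; ++i) {
    if (cpuNames[i].value == value)
      return cpuNames[i].name;
  }
  return nullptr;
}
```

Since both functions walk the same table, adding a new CPU means adding one entry, and reading and writing cannot drift apart.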
339 | ||
340 | ||
341 | BitValue | |
342 | -------- | |
343 | Another common data structure in C++ is a field where each bit has a unique | |
344 | meaning. This is often used in a "flags" field. YAML I/O has support for | |
345 | converting such fields to a flow sequence. For instance suppose you | |
346 | had the following bit flags defined: | |
347 | ||
348 | .. code-block:: c++ | |
349 | ||
350 | enum { | |
351 | flagsPointy = 1, | |
352 | flagsHollow = 2, | |
353 | flagsFlat = 4, | |
354 | flagsRound = 8 | |
355 | }; | |
356 | ||
1a4d82fc | 357 | LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFlags) |
970d7e83 LB |
358 | |
359 | To support reading and writing of MyFlags, you specialize ScalarBitSetTraits<> | |
360 | on MyFlags and provide the bit values and their names. | |
361 | ||
362 | .. code-block:: c++ | |
363 | ||
364 | using llvm::yaml::ScalarBitSetTraits; | |
365 | using llvm::yaml::MappingTraits; | |
366 | using llvm::yaml::IO; | |
367 | ||
368 | template <> | |
369 | struct ScalarBitSetTraits<MyFlags> { | |
370 | static void bitset(IO &io, MyFlags &value) { | |
371 | io.bitSetCase(value, "hollow", flagsHollow); | |
372 | io.bitSetCase(value, "flat", flagsFlat); | |
373 | io.bitSetCase(value, "round", flagsRound); | |
374 | io.bitSetCase(value, "pointy", flagsPointy); | |
375 | } | |
376 | }; | |
377 | ||
378 | struct Info { | |
379 | StringRef name; | |
380 | MyFlags flags; | |
381 | }; | |
382 | ||
383 | template <> | |
384 | struct MappingTraits<Info> { | |
385 | static void mapping(IO &io, Info& info) { | |
386 | io.mapRequired("name", info.name); | |
387 | io.mapRequired("flags", info.flags); | |
388 | } | |
389 | }; | |
390 | ||
391 | With the above, YAML I/O (when writing) will test each mask value in the | |
392 | bitset trait against the flags field, and each that matches will | |
393 | cause the corresponding string to be added to the flow sequence. The opposite | |
394 | is done when reading and any unknown string values will result in an error. With | |
395 | the above schema, a sample valid YAML document is: | |
396 | ||
397 | .. code-block:: yaml | |
398 | ||
399 | name: Tom | |
400 | flags: [ pointy, flat ] | |
401 | ||
1a4d82fc JJ |
402 | Sometimes a "flags" field might contain an enumeration part |
403 | defined by a bit-mask. | |
404 | ||
405 | .. code-block:: c++ | |
406 | ||
407 | enum { | |
408 | flagsFeatureA = 1, | |
409 | flagsFeatureB = 2, | |
410 | flagsFeatureC = 4, | |
411 | ||
412 | flagsCPUMask = 24, | |
413 | ||
414 | flagsCPU1 = 8, | |
415 | flagsCPU2 = 16 | |
416 | }; | |
417 | ||
418 | To support reading and writing such fields, you need to use the maskedBitSetCase() | |
419 | method and provide the bit values, their names and the enumeration mask. | |
420 | ||
421 | .. code-block:: c++ | |
422 | ||
423 | template <> | |
424 | struct ScalarBitSetTraits<MyFlags> { | |
425 | static void bitset(IO &io, MyFlags &value) { | |
426 | io.bitSetCase(value, "featureA", flagsFeatureA); | |
427 | io.bitSetCase(value, "featureB", flagsFeatureB); | |
428 | io.bitSetCase(value, "featureC", flagsFeatureC); | |
429 | io.maskedBitSetCase(value, "CPU1", flagsCPU1, flagsCPUMask); | |
430 | io.maskedBitSetCase(value, "CPU2", flagsCPU2, flagsCPUMask); | |
431 | } | |
432 | }; | |
433 | ||
434 | YAML I/O (when writing) will apply the enumeration mask to the flags field, | |
435 | and compare the result with the values from the bitset. As in the case of a regular | |
436 | bitset, each value that matches will cause the corresponding string to be added | |
437 | to the flow sequence. | |
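The two kinds of test used when writing can be sketched as plain bit arithmetic. This is a minimal standalone illustration, assuming a plain bit matches when it is set in the field and a masked value matches when the masked field equals it; flagNames() is a hypothetical helper and not part of YAML I/O.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Flag layout copied from the example above.
enum : uint32_t {
  flagsFeatureA = 1,
  flagsFeatureB = 2,
  flagsFeatureC = 4,

  flagsCPUMask = 24,

  flagsCPU1 = 8,
  flagsCPU2 = 16
};

// Hypothetical helper: collects the names that would appear in the flow
// sequence for a given flags value.
std::vector<std::string> flagNames(uint32_t flags) {
  std::vector<std::string> names;
  // Plain bits: a name is emitted when its bit is set.
  if ((flags & flagsFeatureA) == flagsFeatureA) names.push_back("featureA");
  if ((flags & flagsFeatureB) == flagsFeatureB) names.push_back("featureB");
  if ((flags & flagsFeatureC) == flagsFeatureC) names.push_back("featureC");
  // Enumeration part: the field is masked first and then compared for
  // equality, so at most one CPU name can match.
  if ((flags & flagsCPUMask) == flagsCPU1) names.push_back("CPU1");
  if ((flags & flagsCPUMask) == flagsCPU2) names.push_back("CPU2");
  return names;
}
```

With this rule a value of ``flagsFeatureA | flagsCPU2`` yields the names featureA and CPU2, while a field whose masked part is 24 matches neither CPU name, which is the behavior that distinguishes an enumeration part from independent bits.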
970d7e83 LB |
438 | |
439 | Custom Scalar | |
440 | ------------- | |
441 | Sometimes for readability a scalar needs to be formatted in a custom way. For | |
442 | instance your internal data structure may use an integer for time (seconds since | |
443 | some epoch), but in YAML it would be much nicer to express that integer in | |
444 | some time format (e.g. 4-May-2012 10:30pm). YAML I/O has a way to support | |
445 | custom formatting and parsing of scalar types by specializing ScalarTraits<> on | |
446 | your data type. When writing, YAML I/O will provide the native type and | |
447 | your specialization must create a temporary llvm::StringRef. When reading, | |
1a4d82fc | 448 | YAML I/O will provide an llvm::StringRef of the scalar and your specialization |
970d7e83 LB |
449 | must convert that to your native data type. An outline of a custom scalar type |
450 | looks like: | |
451 | ||
452 | .. code-block:: c++ | |
453 | ||
454 | using llvm::yaml::ScalarTraits; | |
455 | using llvm::yaml::IO; | |
456 | ||
457 | template <> | |
458 | struct ScalarTraits<MyCustomType> { | |
459 | static void output(const MyCustomType &value, llvm::raw_ostream &out) { | |
460 | out << value; // do custom formatting here | |
461 | } | |
462 | static StringRef input(StringRef scalar, MyCustomType &value) { | |
463 | // do custom parsing here. Return the empty string on success, | |
464 | // or an error message on failure. | |
1a4d82fc | 465 | return StringRef(); |
970d7e83 | 466 | } |
1a4d82fc JJ |
467 | // Determine if this scalar needs quotes. |
468 | static bool mustQuote(StringRef) { return true; } | |
970d7e83 LB |
469 | }; |
470 | ||
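To make the input/output contract concrete, here is a standalone sketch of a formatting and parsing pair for a hypothetical Percent type written as ``42%``. It does not use the library: std::string stands in for llvm::StringRef, and Percent, formatPercent() and parsePercent() are names invented for this illustration. As in the outline above, the parser returns an empty string on success and an error message on failure.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical native type: a whole percentage from 0 to 100.
struct Percent {
  int value;
};

// Output direction: native value to scalar text.
std::string formatPercent(const Percent &p) {
  return std::to_string(p.value) + "%";
}

// Input direction: scalar text to native value. Returns the empty string on
// success, or an error message on failure.
std::string parsePercent(const std::string &scalar, Percent &out) {
  if (scalar.size() < 2 || scalar.back() != '%')
    return "expected a number followed by '%'";
  int v = 0;
  for (std::size_t i = 0; i + 1 < scalar.size(); ++i) {
    unsigned char c = static_cast<unsigned char>(scalar[i]);
    if (!std::isdigit(c))
      return "invalid digit in percentage";
    v = v * 10 + (c - '0');
    if (v > 100)
      return "percentage out of range";
  }
  out.value = v;
  return std::string();
}
```

In a real ScalarTraits specialization the body of formatPercent() would go into output() and the body of parsePercent() into input().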
471 | ||
472 | Mappings | |
473 | ======== | |
474 | ||
475 | To be translated to or from a YAML mapping for your type T you must specialize | |
476 | llvm::yaml::MappingTraits on T and implement the "void mapping(IO &io, T&)" | |
477 | method. If your native data structures use pointers to a class everywhere, | |
478 | you can specialize on the class pointer. Examples: | |
479 | ||
480 | .. code-block:: c++ | |
481 | ||
482 | using llvm::yaml::MappingTraits; | |
483 | using llvm::yaml::IO; | |
484 | ||
485 | // Example of struct Foo which is used by value | |
486 | template <> | |
487 | struct MappingTraits<Foo> { | |
488 | static void mapping(IO &io, Foo &foo) { | |
489 | io.mapOptional("size", foo.size); | |
490 | ... | |
491 | } | |
492 | }; | |
493 | ||
494 | // Example of struct Bar which is natively always a pointer | |
495 | template <> | |
496 | struct MappingTraits<Bar*> { | |
497 | static void mapping(IO &io, Bar *&bar) { | |
498 | io.mapOptional("size", bar->size); | |
499 | ... | |
500 | } | |
501 | }; | |
502 | ||
503 | ||
504 | No Normalization | |
505 | ---------------- | |
506 | ||
507 | The mapping() method is responsible, if needed, for normalizing and | |
508 | denormalizing. In a simple case where the native data structure requires no | |
509 | normalization, the mapping method just uses mapOptional() or mapRequired() to | |
510 | bind the struct's fields to YAML key names. For example: | |
511 | ||
512 | .. code-block:: c++ | |
513 | ||
514 | using llvm::yaml::MappingTraits; | |
515 | using llvm::yaml::IO; | |
516 | ||
517 | template <> | |
518 | struct MappingTraits<Person> { | |
519 | static void mapping(IO &io, Person &info) { | |
520 | io.mapRequired("name", info.name); | |
521 | io.mapOptional("hat-size", info.hatSize); | |
522 | } | |
523 | }; | |
524 | ||
525 | ||
526 | Normalization | |
527 | ---------------- | |
528 | ||
529 | When [de]normalization is required, the mapping() method needs a way to access | |
530 | normalized values as fields. To help with this, there is | |
531 | a template MappingNormalization<> which you can then use to automatically | |
532 | do the normalization and denormalization. The template is used to create | |
533 | a local variable in your mapping() method which contains the normalized keys. | |
534 | ||
535 | Suppose you have native data type | |
536 | Polar which specifies a position in polar coordinates (distance, angle): | |
537 | ||
538 | .. code-block:: c++ | |
539 | ||
540 | struct Polar { | |
541 | float distance; | |
542 | float angle; | |
543 | }; | |
544 | ||
545 | but you've decided the normalized YAML form should be in x,y coordinates. That | |
546 | is, you want the YAML to look like: | |
547 | ||
548 | .. code-block:: yaml | |
549 | ||
550 | x: 10.3 | |
551 | y: -4.7 | |
552 | ||
553 | You can support this by defining a MappingTraits that normalizes the polar | |
554 | coordinates to x,y coordinates when writing YAML and denormalizes x,y | |
555 | coordinates into polar when reading YAML. | |
556 | ||
557 | .. code-block:: c++ | |
558 | ||
559 | using llvm::yaml::MappingTraits; | |
560 | using llvm::yaml::IO; | |
561 | ||
562 | template <> | |
563 | struct MappingTraits<Polar> { | |
564 | ||
565 | class NormalizedPolar { | |
566 | public: | |
567 | NormalizedPolar(IO &io) | |
568 | : x(0.0), y(0.0) { | |
569 | } | |
570 | NormalizedPolar(IO &, Polar &polar) | |
571 | : x(polar.distance * cos(polar.angle)), | |
572 | y(polar.distance * sin(polar.angle)) { | |
573 | } | |
574 | Polar denormalize(IO &) { | |
1a4d82fc | 575 | return Polar(sqrt(x*x+y*y), atan2(y, x)); |
970d7e83 LB |
576 | } |
577 | ||
578 | float x; | |
579 | float y; | |
580 | }; | |
581 | ||
582 | static void mapping(IO &io, Polar &polar) { | |
583 | MappingNormalization<NormalizedPolar, Polar> keys(io, polar); | |
584 | ||
585 | io.mapRequired("x", keys->x); | |
586 | io.mapRequired("y", keys->y); | |
587 | } | |
588 | }; | |
589 | ||
590 | When writing YAML, the local variable "keys" will be a stack allocated | |
1a4d82fc | 591 | instance of NormalizedPolar, constructed from the supplied polar object which |
970d7e83 LB |
592 | initializes its x and y fields. The mapRequired() methods then write out the x |
593 | and y values as key/value pairs. | |
594 | ||
595 | When reading YAML, the local variable "keys" will be a stack allocated instance | |
596 | of NormalizedPolar, constructed by the empty constructor. The mapRequired | |
597 | methods will find the matching key in the YAML document and fill in the x and y | |
598 | fields of the NormalizedPolar object keys. At the end of the mapping() method | |
599 | when the local keys variable goes out of scope, the denormalize() method will | |
600 | automatically be called to convert the read values back to polar coordinates, | |
601 | and then assigned back to the second parameter to mapping(). | |
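The arithmetic behind the two directions can be checked on its own. The following standalone sketch leaves out the IO object and the MappingNormalization<> template and only shows the conversions; XY, normalize() and denormalize() as free functions are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cmath>

struct Polar {
  float distance;
  float angle; // in radians
};

// Hypothetical stand-in for the normalized form that is written to YAML.
struct XY {
  float x;
  float y;
};

// Writing direction: polar coordinates to x,y coordinates.
XY normalize(const Polar &polar) {
  return {polar.distance * std::cos(polar.angle),
          polar.distance * std::sin(polar.angle)};
}

// Reading direction: x,y coordinates back to polar coordinates. atan2 takes
// y first and x second, and returns an angle in the correct quadrant.
Polar denormalize(const XY &xy) {
  return {std::hypot(xy.x, xy.y), std::atan2(xy.y, xy.x)};
}
```

Converting a Polar value to XY and back returns the original distance and angle up to floating-point rounding, provided the angle lies in the range that atan2 produces (from -pi to pi).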
602 | ||
603 | In some cases, the normalized class may be a subclass of the native type and | |
604 | could be returned by the denormalize() method, except that the temporary | |
605 | normalized instance is stack allocated. In these cases, the utility template | |
606 | MappingNormalizationHeap<> can be used instead. It is just like | |
607 | MappingNormalization<> except that it heap allocates the normalized object | |
608 | when reading YAML. It never destroys the normalized object. The denormalize() | |
609 | method can thus return "this". | |
610 | ||
611 | ||
612 | Default values | |
613 | -------------- | |
614 | Within a mapping() method, calls to io.mapRequired() mean that that key is | |
615 | required to exist when parsing YAML documents, otherwise YAML I/O will issue an | |
616 | error. | |
617 | ||
618 | On the other hand, keys registered with io.mapOptional() are allowed to not | |
619 | exist in the YAML document being read. So what value is put in the field | |
620 | for those optional keys? | |
621 | There are two steps to how those optional fields are filled in. First, the | |
622 | second parameter to the mapping() method is a reference to a native class. That | |
623 | native class must have a default constructor. Whatever value the default | |
624 | constructor initially sets for an optional field will be that field's value. | |
625 | Second, the mapOptional() method has an optional third parameter. If provided | |
626 | it is the value that mapOptional() should set that field to if the YAML document | |
627 | does not have that key. | |
628 | ||
629 | There is one important difference between those two ways (default constructor | |
630 | and third parameter to mapOptional). When YAML I/O generates a YAML document, | |
631 | if the mapOptional() third parameter is used and the actual value being written | |
632 | is the same as (using ==) the default value, then that key/value is not written. | |
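The writing rule for an optional key with an explicit default can be modeled in a few lines. This is a standalone sketch of the rule only, not of the library's output code: writeInfo() is a hypothetical function, the Info struct here uses a std::string for the cpu field to keep the sketch self-contained, and the default of 0 for flags is taken from the earlier mapOptional("flags", info.flags, 0) example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Simplified native type for this sketch.
struct Info {
  std::string cpu;
  unsigned flags;
};

// Hypothetical model of writing a required key followed by an optional key
// whose default value is 0.
std::string writeInfo(const Info &info) {
  std::ostringstream out;
  // Required key: always written.
  out << "cpu: " << info.cpu << "\n";
  // Optional key with a default: skipped when the value compares equal
  // (using ==) to the default.
  const unsigned defaultFlags = 0;
  if (!(info.flags == defaultFlags))
    out << "flags: " << info.flags << "\n";
  return out.str();
}
```

An Info with flags equal to 0 therefore produces only the cpu line, and reading that output back fills flags with the same default, so the value round-trips.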
633 | ||
634 | ||
635 | Order of Keys | |
636 | -------------- | |
637 | ||
638 | When writing out a YAML document, the keys are written in the order that the | |
639 | calls to mapRequired()/mapOptional() are made in the mapping() method. This | |
640 | gives you a chance to write the fields in an order that a human reader of | |
641 | the YAML document would find natural. This may be different from the order | |
642 | of the fields in the native class. | |
643 | ||
644 | When reading in a YAML document, the keys in the document can be in any order, | |
645 | but they are processed in the order that the calls to mapRequired()/mapOptional() | |
646 | are made in the mapping() method. That enables some interesting | |
647 | functionality. For instance, if the first field bound is the cpu and the second | |
648 | field bound is flags, and the flags are cpu specific, you can programmatically | |
649 | switch how the flags are converted to and from YAML based on the cpu. | |
650 | This works for both reading and writing. For example: | |
651 | ||
652 | .. code-block:: c++ | |
653 | ||
654 | using llvm::yaml::MappingTraits; | |
655 | using llvm::yaml::IO; | |
656 | ||
657 | struct Info { | |
658 | CPUs cpu; | |
659 | uint32_t flags; | |
660 | }; | |
661 | ||
662 | template <> | |
663 | struct MappingTraits<Info> { | |
664 | static void mapping(IO &io, Info &info) { | |
665 | io.mapRequired("cpu", info.cpu); | |
666 | // flags must come after cpu for this to work when reading yaml | |
667 | if ( info.cpu == cpu_x86_64 ) | |
668 | io.mapRequired("flags", *(My86_64Flags*)&info.flags); | |
669 | else | |
670 | io.mapRequired("flags", *(My86Flags*)&info.flags); | |
671 | } | |
672 | }; | |
673 | ||
674 | ||
1a4d82fc JJ |
675 | Tags |
676 | ---- | |
677 | ||
678 | The YAML syntax supports tags as a way to specify the type of a node before | |
679 | it is parsed. This allows dynamic types of nodes. But the YAML I/O model uses | |
680 | static typing, so there are limits to how you can use tags with the YAML I/O | |
681 | model. YAML I/O does support checking and setting the optional | |
682 | tag on a map. Using this functionality it is even possible to support different | |
683 | mappings, as long as they are convertible. | |
684 | ||
685 | To check a tag, inside your mapping() method you can use io.mapTag() to specify | |
686 | what the tag should be. This will also add that tag when writing YAML. | |
687 | ||
688 | Validation | |
689 | ---------- | |
690 | ||
691 | Sometimes in a YAML map, each key/value pair is valid, but the combination is | |
692 | not. This is similar to something having no syntax errors, but still having | |
693 | semantic errors. To support semantic level checking, YAML I/O allows | |
694 | an optional ``validate()`` method in a MappingTraits template specialization. | |
695 | ||
696 | When parsing YAML, the ``validate()`` method is called *after* all key/values in | |
697 | the map have been processed. Any error message returned by the ``validate()`` | |
698 | method during input will be printed just like a syntax error would be printed. | |
699 | When writing YAML, the ``validate()`` method is called *before* the YAML | |
700 | key/values are written. Any error during output will trigger an ``assert()`` | |
701 | because it is a programming error to have invalid struct values. | |
702 | ||
703 | ||
704 | .. code-block:: c++ | |
705 | ||
706 | using llvm::yaml::MappingTraits; | |
707 | using llvm::yaml::IO; | |
708 | ||
709 | struct Stuff { | |
710 | ... | |
711 | }; | |
712 | ||
713 | template <> | |
714 | struct MappingTraits<Stuff> { | |
715 | static void mapping(IO &io, Stuff &stuff) { | |
716 | ... | |
717 | } | |
718 | static StringRef validate(IO &io, Stuff &stuff) { | |
719 | // Look at all fields in 'stuff' and if there | |
720 | // are any bad values return a string describing | |
721 | // the error. Otherwise return an empty string. | |
722 | return StringRef(); | |
723 | } | |
724 | }; | |
725 | ||
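As a concrete case of a combination that is invalid although each field is fine on its own, consider a range whose lower bound exceeds its upper bound. The sketch below is standalone and hypothetical: Range and validateRange() are invented for this illustration, and std::string stands in for the StringRef returned by ``validate()``.

```cpp
#include <cassert>
#include <string>

// Hypothetical native type: both fields are valid integers on their own.
struct Range {
  int low;
  int high;
};

// Semantic check on the combination of fields. Returns an empty string when
// the struct is valid, otherwise a message describing the problem.
std::string validateRange(const Range &range) {
  if (range.low > range.high)
    return "'low' must not be greater than 'high'";
  return std::string();
}
```

In a MappingTraits specialization the same check would live in the ``validate()`` method, where a non-empty result is reported like a syntax error when reading.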
726 | ||
970d7e83 LB |
727 | Sequence |
728 | ======== | |
729 | ||
730 | To be translated to or from a YAML sequence for your type T you must specialize | |
731 | llvm::yaml::SequenceTraits on T and implement two methods: | |
732 | ``size_t size(IO &io, T&)`` and | |
733 | ``T::value_type& element(IO &io, T&, size_t indx)``. For example: | |
734 | ||
735 | .. code-block:: c++ | |
736 | ||
737 | template <> | |
738 | struct SequenceTraits<MySeq> { | |
739 | static size_t size(IO &io, MySeq &list) { ... } | |
1a4d82fc | 740 | static MySeqEl &element(IO &io, MySeq &list, size_t index) { ... } |
970d7e83 LB |
741 | }; |
742 | ||
743 | The size() method returns how many elements are currently in your sequence. | |
744 | The element() method returns a reference to the i'th element in the sequence. | |
745 | When parsing YAML, the element() method may be called with an index one past | |
746 | the last element (that is, equal to the current size). Your element() method should allocate space for one | |
747 | more element (using the default constructor if the element is a C++ object) and return | |
748 | a reference to the newly allocated space. | |
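The growing behavior expected of element() can be sketched for a std::vector without the library. elementAt() is a hypothetical free function standing in for the element() method of a SequenceTraits specialization; the IO parameter is left out because this sketch does not need it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for element(): returns a reference to the element
// at the given index, growing the sequence first when the index is past
// the current end.
template <typename T>
T &elementAt(std::vector<T> &seq, std::size_t index) {
  if (index >= seq.size())
    seq.resize(index + 1); // new elements are value-initialized
  return seq[index];
}
```

When reading, the parser can call this with index 0, 1, 2 and so on and assign through the returned reference, while a call with an existing index leaves the size unchanged.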
749 | ||
750 | ||
751 | Flow Sequence | |
752 | ------------- | |
753 | A YAML "flow sequence" is a sequence that, when written to YAML, uses the | |
754 | inline notation (e.g. [ foo, bar ]). To specify that a sequence type should | |
755 | be written in YAML as a flow sequence, your SequenceTraits specialization should | |
756 | add "static const bool flow = true;". For instance: | |
757 | ||
758 | .. code-block:: c++ | |
759 | ||
760 | template <> | |
761 | struct SequenceTraits<MyList> { | |
762 | static size_t size(IO &io, MyList &list) { ... } | |
1a4d82fc | 763 | static MyListEl &element(IO &io, MyList &list, size_t index) { ... } |
970d7e83 LB |
764 | |
765 | // The existence of this member causes YAML I/O to use a flow sequence | |
766 | static const bool flow = true; | |
767 | }; | |
768 | ||
769 | With the above, if you used MyList as the data type in your native data | |
770 | structures, then when converted to YAML, a flow sequence of integers | |
771 | will be used (e.g. [ 10, -3, 4 ]). | |
772 | ||
773 | ||
774 | Utility Macros | |
775 | -------------- | |
776 | Since a common source of sequences is std::vector<>, YAML I/O provides macros: | |
777 | LLVM_YAML_IS_SEQUENCE_VECTOR() and LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR() which | |
778 | can be used to easily specify SequenceTraits<> on a std::vector type. YAML | |
779 | I/O does not partially specialize SequenceTraits on std::vector<> because that | |
780 | would force all vectors to be sequences. An example use of the macros: | |
781 | ||
782 | .. code-block:: c++ | |
783 | ||
784 | std::vector<MyType1>; | |
785 | std::vector<MyType2>; | |
786 | LLVM_YAML_IS_SEQUENCE_VECTOR(MyType1) | |
787 | LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(MyType2) | |
788 | ||
789 | ||
790 | ||
791 | Document List | |
792 | ============= | |
793 | ||
794 | YAML allows you to define multiple "documents" in a single YAML file. Each | |
795 | new document starts with a left aligned "---" token. The end of all documents | |
796 | is denoted with a left aligned "..." token. Many users of YAML will never | |
797 | have need for multiple documents. The top level node in their YAML schema | |
798 | will be a mapping or sequence. For those cases, the following is not needed. | |
799 | But for cases where you do want multiple documents, you can specify a | |
800 | trait for your document list type. The trait has the same methods as | |
801 | SequenceTraits but is named DocumentListTraits. For example: | |
802 | ||
803 | .. code-block:: c++ | |
804 | ||
805 | template <> | |
806 | struct DocumentListTraits<MyDocList> { | |
807 | static size_t size(IO &io, MyDocList &list) { ... } | |
808 | static MyDocType &element(IO &io, MyDocList &list, size_t index) { ... } | |
809 | }; | |
810 | ||
811 | ||
812 | User Context Data | |
813 | ================= | |
814 | When an llvm::yaml::Input or llvm::yaml::Output object is created, its | |
815 | constructor takes an optional "context" parameter. This is a pointer to | |
816 | whatever state information you might need. | |
817 | ||
818 | For instance, in a previous example we showed how the conversion type for a | |
819 | flags field could be determined at runtime based on the value of another field | |
820 | in the mapping. But what if an inner mapping needs to know some field value | |
821 | of an outer mapping? That is where the "context" parameter comes in. You | |
822 | can set values in the context in the outer map's mapping() method and | |
823 | retrieve those values in the inner map's mapping() method. | |
824 | ||
825 | The context value is just a void*. All of your traits that use the context | |
826 | and operate on your native data types need to agree on what the context value | |
827 | actually is. It could be a pointer to an object or struct that your various | |
828 | traits use to share context-sensitive information. | |
829 | ||
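The handshake described above can be sketched without any LLVM dependency. In the following illustration, ``ToyIO`` is a hypothetical stand-in that only carries the ``void*`` context through ``getContext()`` and ``setContext()`` accessors. ``SharedContext``, ``Outer``, ``Inner``, ``mapOuter``, ``mapInner`` and ``demoContext`` are likewise invented names. It is meant to show the pattern, not the real traits machinery.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the IO object: it only stores the context pointer.
class ToyIO {
public:
  explicit ToyIO(void *ctxt = nullptr) : Ctxt(ctxt) {}
  void *getContext() const { return Ctxt; }
  void setContext(void *ctxt) { Ctxt = ctxt; }

private:
  void *Ctxt;
};

// The one type every trait agrees the void* context points to.
struct SharedContext {
  std::string arch;
};

struct Inner {
  std::string flagsKind;
};

struct Outer {
  std::string arch;
  Inner inner;
};

// Inner mapping: reads the value that the outer mapping left in the context.
void mapInner(ToyIO &io, Inner &in) {
  auto *ctx = static_cast<SharedContext *>(io.getContext());
  assert(ctx && "traits in this sketch require a SharedContext");
  in.flagsKind = (ctx->arch == "x86") ? "x86-flags" : "generic-flags";
}

// Outer mapping: records its own field in the context, then maps the inner one.
void mapOuter(ToyIO &io, Outer &out) {
  auto *ctx = static_cast<SharedContext *>(io.getContext());
  assert(ctx && "traits in this sketch require a SharedContext");
  ctx->arch = out.arch;
  mapInner(io, out.inner);
}

// Runs the outer mapping for a given arch and reports what the inner one chose.
std::string demoContext(const std::string &arch) {
  SharedContext ctx;
  ToyIO io(&ctx);
  Outer o;
  o.arch = arch;
  mapOuter(io, o);
  return o.inner.flagsKind;
}
```

Because the context is untyped, the ``static_cast`` in each mapping function is only safe when every trait agrees on ``SharedContext``. Keeping that struct in one shared header is one way to enforce the agreement.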
830 | ||
831 | Output | |
832 | ====== | |
833 | ||
834 | The llvm::yaml::Output class is used to generate a YAML document from your | |
835 | in-memory data structures, using traits defined on your data types. | |
836 | To instantiate an Output object you need an llvm::raw_ostream, and optionally | |
837 | a context pointer: | |
838 | ||
839 | .. code-block:: c++ | |
840 | ||
841 | class Output : public IO { | |
842 | public: | |
843 | Output(llvm::raw_ostream &, void *context=NULL); | |
844 | ||
845 | Once you have an Output object, you can use the C++ stream operator on it | |
846 | to write your native data as YAML. One thing to recall is that a YAML file | |
847 | can contain multiple "documents". If the top level data structure you are | |
848 | streaming as YAML is a mapping, scalar, or sequence, then Output assumes you | |
849 | are generating one document and wraps the output | |
850 | with a leading "``---``" and a trailing "``...``". | |
851 | ||
852 | .. code-block:: c++ | |
853 | ||
854 | using llvm::yaml::Output; | |
855 | ||
856 | void dumpMyMapDoc(const MyMapType &info) { | |
857 | Output yout(llvm::outs()); | |
858 | yout << info; | |
859 | } | |
860 | ||
861 | The above could produce output like: | |
862 | ||
863 | .. code-block:: yaml | |
864 | ||
865 | --- | |
866 | name: Tom | |
867 | hat-size: 7 | |
868 | ... | |
869 | ||
870 | On the other hand, if the top level data structure you are streaming as YAML | |
871 | has a DocumentListTraits specialization, then Output walks through each element | |
872 | of your DocumentList and generates a "---" before the start of each element | |
873 | and ends with a "...". | |
874 | ||
875 | .. code-block:: c++ | |
876 | ||
877 | using llvm::yaml::Output; | |
878 | ||
879 | void dumpMyDocList(const MyDocListType &docList) { | |
880 | Output yout(llvm::outs()); | |
881 | yout << docList; | |
882 | } | |
883 | ||
884 | The above could produce output like: | |
885 | ||
886 | .. code-block:: yaml | |
887 | ||
888 | --- | |
889 | name: Tom | |
890 | hat-size: 7 | |
891 | --- | |
892 | name: Tom | |
893 | shoe-size: 11 | |
894 | ... | |
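
The framing rule for document lists can be summarized in a few lines of standalone C++. This is a simplified sketch of the behavior described above, not the Output class itself: ``emitDocuments`` is a hypothetical helper, and each document body is assumed to be already rendered as YAML text ending in a newline.

```cpp
#include <string>
#include <vector>

// Prefixes every document with "---" and closes the whole stream with "...".
std::string emitDocuments(const std::vector<std::string> &docBodies) {
  std::string out;
  for (const std::string &body : docBodies) {
    out += "---\n"; // start-of-document marker before each element
    out += body;
  }
  out += "...\n"; // single end marker after the last document
  return out;
}
```

Feeding it the two bodies from the example above reproduces the same stream: two "---" markers, one before each document, followed by a single closing "...".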
895 | ||
896 | Input | |
897 | ===== | |
898 | ||
899 | The llvm::yaml::Input class is used to parse YAML document(s) into your native | |
900 | data structures. To instantiate an Input | |
901 | object you need a StringRef to the entire YAML file, and optionally a context | |
902 | pointer: | |
903 | ||
904 | .. code-block:: c++ | |
905 | ||
906 | class Input : public IO { | |
907 | public: | |
908 | Input(StringRef inputContent, void *context=NULL); | |
909 | ||
910 | Once you have an Input object, you can use the C++ stream operator to read | |
911 | the document(s). If you expect there might be multiple YAML documents in | |
912 | one file, you'll need to specialize DocumentListTraits on a list of your | |
913 | document type and stream in that document list type. Otherwise you can | |
914 | just stream in the document type. Also, you can check whether there were | |
915 | any syntax errors in the YAML by calling the error() method on the Input | |
916 | object. For example: | |
917 | ||
918 | .. code-block:: c++ | |
919 | ||
920 | // Reading a single document | |
921 | using llvm::yaml::Input; | |
922 | ||
923 | Input yin(mb.getBuffer()); | |
924 | ||
925 | // Parse the YAML file | |
926 | MyDocType theDoc; | |
927 | yin >> theDoc; | |
928 | ||
929 | // Check for error | |
930 | if ( yin.error() ) | |
931 | return; | |
932 | ||
933 | ||
934 | .. code-block:: c++ | |
935 | ||
936 | // Reading multiple documents in one file | |
937 | using llvm::yaml::Input; | |
938 | ||
939 | LLVM_YAML_IS_DOCUMENT_LIST_VECTOR(MyDocType) | |
940 | ||
941 | Input yin(mb.getBuffer()); | |
942 | ||
943 | // Parse the YAML file | |
944 | std::vector<MyDocType> theDocList; | |
945 | yin >> theDocList; | |
946 | ||
947 | // Check for error | |
948 | if ( yin.error() ) | |
949 | return; | |
950 | ||
951 |