# Stream

In RapidJSON, `rapidjson::Stream` is a concept for reading/writing JSON. Here we'll first show you how to use the provided streams, and then how to create a custom stream.

[TOC]

# Memory Streams {#MemoryStreams}

Memory streams store JSON in memory.

## StringStream (Input) {#StringStream}

`StringStream` is the most basic input stream. It represents a complete, read-only JSON stored in memory. It is defined in `rapidjson/rapidjson.h`.

~~~~~~~~~~cpp
#include "rapidjson/document.h" // will include "rapidjson/rapidjson.h"

using namespace rapidjson;

// ...
const char json[] = "[1, 2, 3, 4]";
StringStream s(json);

Document d;
d.ParseStream(s);
~~~~~~~~~~

Since this is very common usage, `Document::Parse(const char*)` is provided to do exactly the same as above:

~~~~~~~~~~cpp
// ...
const char json[] = "[1, 2, 3, 4]";
Document d;
d.Parse(json);
~~~~~~~~~~

Note that `StringStream` is a typedef of `GenericStringStream<UTF8<> >`; users may use other encodings to represent the character set of the stream.

## StringBuffer (Output) {#StringBuffer}

`StringBuffer` is a simple output stream. It allocates a memory buffer for writing the whole JSON. Use `GetString()` to obtain the buffer.

~~~~~~~~~~cpp
#include "rapidjson/stringbuffer.h"
#include <rapidjson/writer.h>

StringBuffer buffer;
Writer<StringBuffer> writer(buffer);
d.Accept(writer);

const char* output = buffer.GetString();
~~~~~~~~~~

When the buffer is full, it will increase the capacity automatically. The default capacity is 256 characters (256 bytes for UTF8, 512 bytes for UTF16, etc.). Users can provide an allocator and an initial capacity.

~~~~~~~~~~cpp
StringBuffer buffer1(0, 1024); // Use its allocator, initial size = 1024
StringBuffer buffer2(allocator, 1024); // allocator: pointer to a user-supplied allocator
~~~~~~~~~~

By default, `StringBuffer` will instantiate an internal allocator.

Similarly, `StringBuffer` is a typedef of `GenericStringBuffer<UTF8<> >`.

# File Streams {#FileStreams}

When parsing JSON from a file, you may read the whole JSON into memory and use `StringStream` above.

However, if the JSON is big, or memory is limited, you can use `FileReadStream`. It reads only a part of the JSON from the file into a buffer, and then lets that part be parsed. When it runs out of characters in the buffer, it reads the next part from the file.
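
The part-by-part reading described above can be sketched as follows. This is only an illustration under stated assumptions: `ChunkedReader` and `ReadAll` are hypothetical names, not RapidJSON classes, and the real `FileReadStream` may differ in its details. The sketch keeps a caller-supplied buffer, serves characters from it, and calls `fread()` again only when the buffer is exhausted.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical illustration of chunked reading; not RapidJSON's FileReadStream.
class ChunkedReader {
public:
    typedef char Ch;

    // The caller owns both the FILE and the buffer.
    ChunkedReader(std::FILE* fp, char* buffer, std::size_t size)
        : fp_(fp), buffer_(buffer), size_(size), pos_(0), len_(0), count_(0), eof_(false) {
        Fill();
    }

    // Current character without consuming it; '\0' once the file is exhausted.
    Ch Peek() const { return pos_ < len_ ? buffer_[pos_] : '\0'; }

    // Consume one character, refilling the buffer when it runs dry.
    Ch Take() {
        if (pos_ >= len_) return '\0';
        Ch c = buffer_[pos_++];
        ++count_;
        if (pos_ == len_) Fill();
        return c;
    }

    // Number of characters consumed so far.
    std::size_t Tell() const { return count_; }

private:
    // Read the next chunk; a short read marks the end of the file.
    void Fill() {
        pos_ = 0;
        len_ = eof_ ? 0 : std::fread(buffer_, 1, size_, fp_);
        if (len_ < size_) eof_ = true;
    }

    std::FILE* fp_;
    char* buffer_;
    std::size_t size_, pos_, len_, count_;
    bool eof_;
};

// Drain a reader into a string; only one chunk is held in the buffer at a time.
inline std::string ReadAll(ChunkedReader& r) {
    std::string out;
    while (r.Peek() != '\0') out += r.Take();
    return out;
}
```

With a 4-byte buffer, a 7-character file is served by two `fread()` calls that return data, yet the consumer sees one continuous sequence of characters.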

## FileReadStream (Input) {#FileReadStream}

`FileReadStream` reads the file via a `FILE` pointer. The user needs to provide a buffer.

~~~~~~~~~~cpp
#include "rapidjson/filereadstream.h"
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("big.json", "rb"); // non-Windows use "r"

char readBuffer[65536];
FileReadStream is(fp, readBuffer, sizeof(readBuffer));

Document d;
d.ParseStream(is);

fclose(fp);
~~~~~~~~~~

Unlike string streams, `FileReadStream` is a byte stream. It does not handle encodings. If the file is not UTF-8, the byte stream can be wrapped in an `EncodedInputStream`. We will discuss this later in this tutorial.

Apart from reading files, users can also use `FileReadStream` to read `stdin`.

## FileWriteStream (Output) {#FileWriteStream}

`FileWriteStream` is a buffered output stream. Its usage is very similar to `FileReadStream`.

~~~~~~~~~~cpp
#include "rapidjson/filewritestream.h"
#include <rapidjson/writer.h>
#include <cstdio>

using namespace rapidjson;

Document d;
d.Parse(json);
// ...

FILE* fp = fopen("output.json", "wb"); // non-Windows use "w"

char writeBuffer[65536];
FileWriteStream os(fp, writeBuffer, sizeof(writeBuffer));

Writer<FileWriteStream> writer(os);
d.Accept(writer);

fclose(fp);
~~~~~~~~~~

It can also redirect the output to `stdout`.

# iostream Wrapper {#iostreamWrapper}

Due to users' requests, RapidJSON also provides official wrappers for `std::basic_istream` and `std::basic_ostream`. However, please note that the performance will be much lower than the other streams above.

## IStreamWrapper {#IStreamWrapper}

`IStreamWrapper` wraps any class derived from `std::istream`, such as `std::istringstream`, `std::stringstream`, `std::ifstream`, `std::fstream`, into RapidJSON's input stream.

~~~cpp
#include <rapidjson/document.h>
#include <rapidjson/istreamwrapper.h>
#include <fstream>

using namespace rapidjson;
using namespace std;

ifstream ifs("test.json");
IStreamWrapper isw(ifs);

Document d;
d.ParseStream(isw);
~~~

For classes derived from `std::wistream`, use `WIStreamWrapper`.

## OStreamWrapper {#OStreamWrapper}

Similarly, `OStreamWrapper` wraps any class derived from `std::ostream`, such as `std::ostringstream`, `std::stringstream`, `std::ofstream`, `std::fstream`, into RapidJSON's output stream.

~~~cpp
#include <rapidjson/document.h>
#include <rapidjson/ostreamwrapper.h>
#include <rapidjson/writer.h>
#include <fstream>

using namespace rapidjson;
using namespace std;

Document d;
d.Parse(json);

// ...

ofstream ofs("output.json");
OStreamWrapper osw(ofs);

Writer<OStreamWrapper> writer(osw);
d.Accept(writer);
~~~

For classes derived from `std::wostream`, use `WOStreamWrapper`.

# Encoded Streams {#EncodedStreams}

Encoded streams do not contain JSON themselves, but they wrap byte streams to provide basic encoding/decoding functions.

As mentioned above, UTF-8 byte streams can be read directly. However, UTF-16 and UTF-32 have endianness issues. To handle endianness correctly, bytes must be converted into characters (e.g. `wchar_t` for UTF-16) while reading, and characters into bytes while writing.
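
As a minimal sketch of that conversion, the helpers below assemble one UTF-16 code unit from two bytes in either byte order, and split a code unit back into little-endian bytes. The function names are hypothetical and chosen for this illustration; RapidJSON's own encoding classes live in `rapidjson/encodings.h`.

```cpp
// Hypothetical helpers illustrating endianness handling for UTF-16 code units.

// Assemble one code unit from two bytes stored little-endian (low byte first).
inline unsigned Utf16LeUnit(const unsigned char* p) {
    return static_cast<unsigned>(p[0]) | (static_cast<unsigned>(p[1]) << 8);
}

// Assemble one code unit from two bytes stored big-endian (high byte first).
inline unsigned Utf16BeUnit(const unsigned char* p) {
    return (static_cast<unsigned>(p[0]) << 8) | static_cast<unsigned>(p[1]);
}

// Split one code unit into two little-endian bytes for writing.
inline void PutUtf16Le(unsigned unit, unsigned char* out) {
    out[0] = static_cast<unsigned char>(unit & 0xFF);
    out[1] = static_cast<unsigned char>((unit >> 8) & 0xFF);
}
```

The same two bytes `5B 00` mean `[` (U+005B) when read as little-endian, but U+5B00 when read as big-endian, which is why the byte order must be known before parsing.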

Besides, the stream also needs to handle the [byte order mark (BOM)](http://en.wikipedia.org/wiki/Byte_order_mark). When reading from a byte stream, it must detect, or at least consume, the BOM if one exists. When writing to a byte stream, it can optionally write a BOM.
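
BOM detection itself is a comparison of the first few bytes against fixed patterns. The sketch below is an illustration only: `BomType` and `DetectBom` are hypothetical names, not part of RapidJSON. The byte patterns are the standard Unicode BOM encodings.

```cpp
#include <cstddef>

// Hypothetical result type for this illustration.
enum BomType { kNoBom, kUtf8Bom, kUtf16LeBom, kUtf16BeBom, kUtf32LeBom, kUtf32BeBom };

// Inspect the first n bytes of a stream and report which BOM, if any, is present.
inline BomType DetectBom(const unsigned char* p, std::size_t n) {
    // Test the 4-byte marks first: the UTF-32LE BOM begins with the UTF-16LE BOM.
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00) return kUtf32LeBom;
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF) return kUtf32BeBom;
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF) return kUtf8Bom;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE) return kUtf16LeBom;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF) return kUtf16BeBom;
    return kNoBom;
}
```

A reader that finds a BOM would skip that many bytes (2, 3 or 4) before handing characters to the parser.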

If the encoding of the stream is known at compile time, you may use `EncodedInputStream` and `EncodedOutputStream`. If the stream can be UTF-8, UTF-16LE, UTF-16BE, UTF-32LE or UTF-32BE JSON, and the encoding is only known at runtime, you may use `AutoUTFInputStream` and `AutoUTFOutputStream`. These streams are defined in `rapidjson/encodedstream.h`.

Note that these encoded streams can be applied to streams other than files. For example, a file in memory, or a custom byte stream, can be wrapped in encoded streams.

## EncodedInputStream {#EncodedInputStream}

`EncodedInputStream` has two template parameters. The first one is an `Encoding` class, such as `UTF8` or `UTF16LE`, defined in `rapidjson/encodings.h`. The second one is the class of the stream to be wrapped.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h" // FileReadStream
#include "rapidjson/encodedstream.h" // EncodedInputStream
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("utf16le.json", "rb"); // non-Windows use "r"

char readBuffer[256];
FileReadStream bis(fp, readBuffer, sizeof(readBuffer));

EncodedInputStream<UTF16LE<>, FileReadStream> eis(bis); // wraps bis into eis

Document d; // Document is GenericDocument<UTF8<> >
d.ParseStream<0, UTF16LE<> >(eis); // Parses UTF-16LE file into UTF-8 in memory

fclose(fp);
~~~~~~~~~~

## EncodedOutputStream {#EncodedOutputStream}

`EncodedOutputStream` is similar, but it has a `bool putBOM` parameter in the constructor, controlling whether to write a BOM into the output byte stream.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h" // FileWriteStream
#include "rapidjson/encodedstream.h" // EncodedOutputStream
#include <rapidjson/writer.h>
#include <cstdio>

using namespace rapidjson;

Document d; // Document is GenericDocument<UTF8<> >
// ...

FILE* fp = fopen("output_utf32le.json", "wb"); // non-Windows use "w"

char writeBuffer[256];
FileWriteStream bos(fp, writeBuffer, sizeof(writeBuffer));

typedef EncodedOutputStream<UTF32LE<>, FileWriteStream> OutputStream;
OutputStream eos(bos, true); // Write BOM

Writer<OutputStream, UTF8<>, UTF32LE<> > writer(eos);
d.Accept(writer); // This generates UTF32-LE file from UTF-8 in memory

fclose(fp);
~~~~~~~~~~

## AutoUTFInputStream {#AutoUTFInputStream}

Sometimes an application may want to handle all supported JSON encodings. `AutoUTFInputStream` will detect the encoding by BOM first. If a BOM is unavailable, it will use characteristics of valid JSON to make the detection. If neither method succeeds, it falls back to the UTF type provided in the constructor.
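
One well-known way to use "characteristics of valid JSON" is the heuristic from RFC 4627, section 3: the first two characters of a JSON text are always ASCII, so the pattern of zero bytes in the first four bytes reveals the encoding. The sketch below illustrates that heuristic with hypothetical names (`Utf`, `GuessUtf`); whether `AutoUTFInputStream` follows exactly this scheme is an assumption, not something stated in this document.

```cpp
// Hypothetical encoding tags for this illustration.
enum Utf { kUtf8, kUtf16Le, kUtf16Be, kUtf32Le, kUtf32Be };

// Guess the encoding of a BOM-less JSON text from its first four bytes.
// p must point to at least four readable bytes.
inline Utf GuessUtf(const unsigned char* p, Utf fallback) {
    // One bit per byte: set when that byte is non-zero.
    unsigned pattern = (p[0] ? 1u : 0u) | (p[1] ? 2u : 0u) | (p[2] ? 4u : 0u) | (p[3] ? 8u : 0u);
    switch (pattern) {
        case 0x08: return kUtf32Be; // 00 00 00 xx
        case 0x0A: return kUtf16Be; // 00 xx 00 xx
        case 0x01: return kUtf32Le; // xx 00 00 00
        case 0x05: return kUtf16Le; // xx 00 xx 00
        case 0x0F: return kUtf8;    // xx xx xx xx
        default:   return fallback; // no recognizable pattern
    }
}
```

For example, the bytes `5B 00 31 00` (the text `[1` in UTF-16LE) produce the pattern `xx 00 xx 00`, while an unrecognized pattern returns the caller's fallback, mirroring the fallback described above.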

Since the characters (code units) may be 8-bit, 16-bit or 32-bit, `AutoUTFInputStream` requires a character type which can hold at least 32 bits. We may use `unsigned`, as in the template parameter:

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h" // FileReadStream
#include "rapidjson/encodedstream.h" // AutoUTFInputStream
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("any.json", "rb"); // non-Windows use "r"

char readBuffer[256];
FileReadStream bis(fp, readBuffer, sizeof(readBuffer));

AutoUTFInputStream<unsigned, FileReadStream> eis(bis); // wraps bis into eis

Document d; // Document is GenericDocument<UTF8<> >
d.ParseStream<0, AutoUTF<unsigned> >(eis); // This parses any UTF file into UTF-8 in memory

fclose(fp);
~~~~~~~~~~

When specifying the encoding of the stream, use `AutoUTF<CharType>` as in `ParseStream()` above.

You can obtain the type of UTF via `UTFType GetType()`, and check whether a BOM was found via `HasBOM()`.

## AutoUTFOutputStream {#AutoUTFOutputStream}

Similarly, to choose the encoding for output at runtime, we can use `AutoUTFOutputStream`. This class is not automatic *per se*. You need to specify the UTF type and whether to write a BOM at runtime.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h" // FileWriteStream
#include "rapidjson/encodedstream.h" // AutoUTFOutputStream
#include "rapidjson/writer.h"
#include <cstdio>

using namespace rapidjson;

void WriteJSONFile(FILE* fp, UTFType type, bool putBOM, const Document& d) {
    char writeBuffer[256];
    FileWriteStream bos(fp, writeBuffer, sizeof(writeBuffer));

    typedef AutoUTFOutputStream<unsigned, FileWriteStream> OutputStream;
    OutputStream eos(bos, type, putBOM);

    // The writer must be bound to the stream, and AutoUTF needs its character type.
    Writer<OutputStream, UTF8<>, AutoUTF<unsigned> > writer(eos);
    d.Accept(writer);
}
~~~~~~~~~~

`AutoUTFInputStream` and `AutoUTFOutputStream` are more convenient than `EncodedInputStream` and `EncodedOutputStream`. They incur only a little runtime overhead.

# Custom Stream {#CustomStream}

In addition to memory/file streams, users can create their own stream classes which fit RapidJSON's API. For example, you may create a network stream, a stream from a compressed file, etc.

RapidJSON combines different types using templates. A class containing all the required interfaces can be a stream. The Stream interface is defined in the comments of `rapidjson/rapidjson.h`:

~~~~~~~~~~cpp
concept Stream {
    typename Ch; //!< Character type of the stream.

    //! Read the current character from stream without moving the read cursor.
    Ch Peek() const;

    //! Read the current character from stream and move the read cursor to the next character.
    Ch Take();

    //! Get the current read cursor.
    //! \return Number of characters read from start.
    size_t Tell();

    //! Begin writing operation at the current read pointer.
    //! \return The begin writer pointer.
    Ch* PutBegin();

    //! Write a character.
    void Put(Ch c);

    //! Flush the buffer.
    void Flush();

    //! End the writing operation.
    //! \param begin The begin write pointer returned by PutBegin().
    //! \return Number of characters written.
    size_t PutEnd(Ch* begin);
}
~~~~~~~~~~

Input streams must implement `Peek()`, `Take()` and `Tell()`.
Output streams must implement `Put()` and `Flush()`.
There are two special interfaces, `PutBegin()` and `PutEnd()`, which are only for *in situ* parsing. Normal streams do not implement them. However, even if an interface is not needed for a particular stream, a dummy implementation is still required; otherwise a compilation error will be generated.
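
To illustrate how templates let any class with the right members act as a stream, here is a minimal sketch that does not depend on RapidJSON. `CStringStream` and `SkipSpaces` are hypothetical names for this example; only the member names `Ch`, `Peek()`, `Take()` and `Tell()` come from the concept above, and the output-side dummies are omitted because this consumer never calls them.

```cpp
#include <cstddef>

// Hypothetical in-memory input stream providing the input half of the concept.
class CStringStream {
public:
    typedef char Ch;

    explicit CStringStream(const Ch* src) : src_(src), head_(src) {}

    Ch Peek() const { return *src_; }  // current character, '\0' at the end
    Ch Take() { return *src_++; }      // current character, then advance
    std::size_t Tell() const { return static_cast<std::size_t>(src_ - head_); }

private:
    const Ch* src_;   // read cursor
    const Ch* head_;  // start of the text
};

// A consumer written once against the interface; it works with any class
// that provides Peek() and Take(), with no common base class required.
template <typename InputStream>
void SkipSpaces(InputStream& is) {
    while (is.Peek() == ' ' || is.Peek() == '\n' || is.Peek() == '\r' || is.Peek() == '\t')
        is.Take();
}
```

Because `SkipSpaces` is a template, a missing member is reported at compile time for the stream type actually used, which is also why unused members still need dummy implementations when a library template refers to them.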

## Example: istream wrapper {#ExampleIStreamWrapper}

The following example is a simple wrapper of `std::istream`, which only implements 3 functions.

~~~~~~~~~~cpp
#include <cassert>
#include <istream>

class MyIStreamWrapper {
public:
    typedef char Ch;

    MyIStreamWrapper(std::istream& is) : is_(is) {
    }

    Ch Peek() const { // 1
        int c = is_.peek();
        return c == std::char_traits<char>::eof() ? '\0' : (Ch)c; // map EOF to '\0'
    }

    Ch Take() { // 2
        int c = is_.get();
        return c == std::char_traits<char>::eof() ? '\0' : (Ch)c; // map EOF to '\0'
    }

    size_t Tell() const { return (size_t)is_.tellg(); } // 3

    // Output interface is unused for an input stream; dummies keep it compiling.
    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch) { assert(false); }
    void Flush() { assert(false); }
    size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    // Non-copyable.
    MyIStreamWrapper(const MyIStreamWrapper&);
    MyIStreamWrapper& operator=(const MyIStreamWrapper&);

    std::istream& is_;
};
~~~~~~~~~~

Users can use it to wrap instances of `std::stringstream`, `std::ifstream`.

~~~~~~~~~~cpp
const char* json = "[1,2,3,4]";
std::stringstream ss(json);
MyIStreamWrapper is(ss);

Document d;
d.ParseStream(is);
~~~~~~~~~~

Note that this implementation may not be as efficient as RapidJSON's memory or file streams, due to internal overheads of the standard library.

## Example: ostream wrapper {#ExampleOStreamWrapper}

The following example is a simple wrapper of `std::ostream`, which only implements 2 functions.

~~~~~~~~~~cpp
#include <cassert>
#include <ostream>

class MyOStreamWrapper {
public:
    typedef char Ch;

    MyOStreamWrapper(std::ostream& os) : os_(os) {
    }

    // Input interface is unused for an output stream; dummies keep it compiling.
    Ch Peek() const { assert(false); return '\0'; }
    Ch Take() { assert(false); return '\0'; }
    size_t Tell() const { assert(false); return 0; }

    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch c) { os_.put(c); } // 1
    void Flush() { os_.flush(); } // 2
    size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    // Non-copyable.
    MyOStreamWrapper(const MyOStreamWrapper&);
    MyOStreamWrapper& operator=(const MyOStreamWrapper&);

    std::ostream& os_;
};
~~~~~~~~~~

Users can use it to wrap instances of `std::stringstream`, `std::ofstream`.

~~~~~~~~~~cpp
Document d;
// ...

std::stringstream ss;
MyOStreamWrapper os(ss);

Writer<MyOStreamWrapper> writer(os);
d.Accept(writer);
~~~~~~~~~~

Note that this implementation may not be as efficient as RapidJSON's memory or file streams, due to internal overheads of the standard library.
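
A custom output stream does not have to store anything. As a further sketch, the hypothetical `CountingSink` below discards every character and only counts them, which could be used to measure the size of generated output. `CountingSink` and `PutLiteral` are names invented for this illustration, and the dummy input members shown in the wrapper above are omitted for brevity.

```cpp
#include <cstddef>

// Hypothetical output stream that discards characters and only counts them.
class CountingSink {
public:
    typedef char Ch;

    CountingSink() : count_(0) {}

    void Put(Ch) { ++count_; }  // 1: count instead of storing
    void Flush() {}             // 2: nothing is buffered

    std::size_t Count() const { return count_; }

private:
    std::size_t count_;
};

// A generator written against Put()/Flush() works with any such class.
template <typename OutputStream>
void PutLiteral(OutputStream& os, const typename OutputStream::Ch* s) {
    while (*s) os.Put(*s++);
    os.Flush();
}
```

The same `PutLiteral` call would write to a file or a string if handed a different stream class, since it only relies on `Put()` and `Flush()`.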

# Summary {#Summary}

This section describes the stream classes available in RapidJSON. Memory streams are simple. File streams can reduce the memory required during JSON parsing and generation, if the JSON is stored in a file system. Encoded streams convert between byte streams and character streams. Finally, users may create custom streams using a simple interface.