// Copyright (c) 2019 Vinnie Falco (vinnie.falco@gmail.com)
// Distributed under the Boost Software License, Version 1.0. (See accompanying
// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
// Official repository: https://github.com/boostorg/json

#ifndef BOOST_JSON_STREAM_PARSER_HPP
#define BOOST_JSON_STREAM_PARSER_HPP

#include <boost/json/detail/config.hpp>
#include <boost/json/basic_parser.hpp>
#include <boost/json/parse_options.hpp>
#include <boost/json/storage_ptr.hpp>
#include <boost/json/value.hpp>
#include <boost/json/detail/handler.hpp>
#include <type_traits>
24 //----------------------------------------------------------
26 /** A DOM parser for JSON contained in multiple buffers.
28 This class is used to parse a JSON contained in a
29 series of one or more character buffers, into a
30 @ref value container. It implements a
31 <a href="https://en.wikipedia.org/wiki/Streaming_algorithm">
32 <em>streaming algorithm</em></a>, allowing these
35 @li Parse a JSON file a piece at a time.
37 @li Parse incoming JSON as it arrives,
40 @li Parse with bounded resource consumption
To use the parser, first construct it, then optionally
call @ref reset to specify a @ref storage_ptr to use
for the resulting @ref value. Then call @ref write
one or more times to parse a single, complete JSON.
Call @ref done to determine if the parse has completed.
To indicate there are no more buffers, call @ref finish.
If the parse is successful, call @ref release to take
ownership of the value:
stream_parser p;                        // construct a parser
p.write( "[1,2" );                      // parse some of a JSON
p.write( ",3,4]" );                     // parse the rest of the JSON
assert( p.done() );                     // we have a complete JSON
value jv = p.release();                 // take ownership of the value
64 When the character buffer provided as input contains
65 additional data that is not part of the complete
66 JSON, an error is returned. The @ref write_some
67 function is an alternative which allows the parse
68 to finish early, without consuming all the characters
69 in the buffer. This allows parsing of a buffer
70 containing multiple individual JSONs or containing
71 different protocol data:
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write_some( "[1,2" );             // parse some of a JSON
assert( n == 4 );                       // all characters consumed
n = p.write_some( ",3,4] null" );       // parse the remainder of the JSON
assert( n == 6 );                       // only some characters consumed
assert( p.done() );                     // we have a complete JSON
value jv = p.release();                 // take ownership of the value
83 @par Temporary Storage
85 The parser may dynamically allocate temporary
86 storage as needed to accommodate the nesting level
87 of the JSON being parsed. Temporary storage is
88 first obtained from an optional, caller-owned
89 buffer specified upon construction. When that
90 is exhausted, the next allocation uses the
91 @ref memory_resource passed to the constructor; if
92 no such argument is specified, the default memory
resource is used. Temporary storage is freed only
when the parser is destroyed. The performance of
parsing multiple JSONs may therefore be improved by
reusing the same parser instance.
98 It is important to note that the @ref memory_resource
99 supplied upon construction is used for temporary
100 storage only, and not for allocating the elements
101 which make up the parsed value. That other memory
102 resource is optionally supplied in each call
107 If there are object elements with duplicate keys;
108 that is, if multiple elements in an object have
109 keys that compare equal, only the last equivalent
110 element will be inserted.
112 @par Non-Standard JSON
114 The @ref parse_options structure optionally
115 provided upon construction is used to customize
116 some parameters of the parser, including which
117 non-standard JSON extensions should be allowed.
118 A default-constructed parse options allows only
123 Distinct instances may be accessed concurrently.
124 Non-const member functions of a shared instance
125 may not be called concurrently with any other
126 member functions of that instance.
135 basic_parser<detail::handler> p_;
138 /// Copy constructor (deleted)
140 stream_parser const&) = delete;
142 /// Copy assignment (deleted)
143 stream_parser& operator=(
144 stream_parser const&) = delete;
148 All dynamically allocated memory, including
149 any incomplete parsing results, is freed.
152 Linear in the size of partial results
154 @par Exception Safety
157 ~stream_parser() = default;
161 This constructs a new parser which first uses
162 the caller-owned storage pointed to by `buffer`
163 for temporary storage, falling back to the memory
164 resource `sp` if needed. The parser will use the
165 specified parsing options.
167 The parsed value will use the default memory
168 resource for storage. To use a different resource,
169 call @ref reset after construction.
174 @par Exception Safety
177 @param sp The memory resource to use for
178 temporary storage after `buffer` is exhausted.
180 @param opt The parsing options to use.
@param buffer A pointer to valid memory of at least
`size` bytes for the parser to use for temporary storage.
Ownership is not transferred; the caller is responsible
for ensuring that the memory pointed to by `buffer`
remains valid until the parser is destroyed.
188 @param size The number of valid bytes in `buffer`.
193 parse_options const& opt,
194 unsigned char* buffer,
195 std::size_t size) noexcept;
199 This constructs a new parser which uses the default
200 memory resource for temporary storage, and accepts
203 The parsed value will use the default memory
204 resource for storage. To use a different resource,
205 call @ref reset after construction.
210 @par Exception Safety
213 stream_parser() noexcept
214 : stream_parser({}, {})
220 This constructs a new parser which uses the
221 specified memory resource for temporary storage,
222 and is configured to use the specified parsing
225 The parsed value will use the default memory
226 resource for storage. To use a different resource,
227 call @ref reset after construction.
232 @par Exception Safety
235 @param sp The memory resource to use for temporary storage.
237 @param opt The parsing options to use.
242 parse_options const& opt) noexcept;
246 This constructs a new parser which uses the
247 specified memory resource for temporary storage,
248 and accepts only strict JSON.
250 The parsed value will use the default memory
251 resource for storage. To use a different resource,
252 call @ref reset after construction.
257 @par Exception Safety
260 @param sp The memory resource to use for temporary storage.
263 stream_parser(storage_ptr sp) noexcept
264 : stream_parser(std::move(sp), {})
270 This constructs a new parser which first uses the
271 caller-owned storage `buffer` for temporary storage,
272 falling back to the memory resource `sp` if needed.
273 The parser will use the specified parsing options.
275 The parsed value will use the default memory
276 resource for storage. To use a different resource,
277 call @ref reset after construction.
282 @par Exception Safety
285 @param sp The memory resource to use for
286 temporary storage after `buffer` is exhausted.
288 @param opt The parsing options to use.
@param buffer A buffer for the parser to use for
temporary storage. Ownership is not transferred;
the caller is responsible for ensuring that `buffer`
remains valid until the parser is destroyed.
295 template<std::size_t N>
298 parse_options const& opt,
299 unsigned char(&buffer)[N]) noexcept
300 : stream_parser(std::move(sp),
305 #if defined(__cpp_lib_byte) || defined(BOOST_JSON_DOCS)
308 This constructs a new parser which first uses
309 the caller-owned storage pointed to by `buffer`
310 for temporary storage, falling back to the memory
311 resource `sp` if needed. The parser will use the
312 specified parsing options.
314 The parsed value will use the default memory
315 resource for storage. To use a different resource,
316 call @ref reset after construction.
321 @par Exception Safety
324 @param sp The memory resource to use for
325 temporary storage after `buffer` is exhausted.
327 @param opt The parsing options to use.
@param buffer A pointer to valid memory of at least
`size` bytes for the parser to use for temporary storage.
Ownership is not transferred; the caller is responsible
for ensuring that the memory pointed to by `buffer`
remains valid until the parser is destroyed.
335 @param size The number of valid bytes in `buffer`.
339 parse_options const& opt,
341 std::size_t size) noexcept
: stream_parser(std::move(sp), opt, reinterpret_cast<
    unsigned char*>(buffer), size)
349 This constructs a new parser which first uses the
350 caller-owned storage `buffer` for temporary storage,
351 falling back to the memory resource `sp` if needed.
352 The parser will use the specified parsing options.
354 The parsed value will use the default memory
355 resource for storage. To use a different resource,
356 call @ref reset after construction.
361 @par Exception Safety
364 @param sp The memory resource to use for
365 temporary storage after `buffer` is exhausted.
367 @param opt The parsing options to use.
@param buffer A buffer for the parser to use for
temporary storage. Ownership is not transferred;
the caller is responsible for ensuring that `buffer`
remains valid until the parser is destroyed.
374 template<std::size_t N>
377 parse_options const& opt,
378 std::byte(&buffer)[N]) noexcept
379 : stream_parser(std::move(sp),
385 #ifndef BOOST_JSON_DOCS
386 // Safety net for accidental buffer overflows
387 template<std::size_t N>
390 parse_options const& opt,
391 unsigned char(&buffer)[N],
392 std::size_t n) noexcept
393 : stream_parser(std::move(sp),
396 // If this goes off, check your parameters
397 // closely, chances are you passed an array
398 // thinking it was a pointer.
399 BOOST_ASSERT(n <= N);
402 #ifdef __cpp_lib_byte
403 // Safety net for accidental buffer overflows
404 template<std::size_t N>
407 parse_options const& opt,
408 std::byte(&buffer)[N], std::size_t n) noexcept
409 : stream_parser(std::move(sp),
412 // If this goes off, check your parameters
413 // closely, chances are you passed an array
414 // thinking it was a pointer.
415 BOOST_ASSERT(n <= N);
420 /** Reset the parser for a new JSON.
422 This function is used to reset the parser to
423 prepare it for parsing a new complete JSON.
424 Any previous partial results are destroyed.
427 Constant or linear in the size of any previous
428 partial parsing results.
430 @par Exception Safety
433 @param sp A pointer to the @ref memory_resource
434 to use for the resulting @ref value. The parser
435 will acquire shared ownership.
439 reset(storage_ptr sp = {}) noexcept;
441 /** Return true if a complete JSON has been parsed.
443 This function returns `true` when all of these
446 @li A complete serialized JSON has been
447 presented to the parser, and
449 @li No error has occurred since the parser
450 was constructed, or since the last call
456 @par Exception Safety
460 done() const noexcept
465 /** Parse a buffer containing all or part of a complete JSON.
467 This function parses JSON contained in the
468 specified character buffer. If parsing completes,
469 any additional characters past the end of the
470 complete JSON are ignored. The function returns the
471 actual number of characters parsed, which may be
472 less than the size of the input. This allows parsing
473 of a buffer containing multiple individual JSONs or
474 containing different protocol data.
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write_some( "[1,2" );             // parse the first part of the JSON
assert( n == 4 );                       // all characters consumed
n = p.write_some( ",3,4] null" );       // parse the rest of the JSON
assert( n == 6 );                       // only some characters consumed
value jv = p.release();                 // take ownership of the value
489 To indicate there are no more character buffers,
490 such as when @ref done returns `false` after
491 writing, call @ref finish.
496 @par Exception Safety
498 Calls to `memory_resource::allocate` may throw.
499 Upon error or exception, subsequent calls will
500 fail until @ref reset is called to parse a new JSON.
502 @return The number of characters consumed from
505 @param data A pointer to a buffer of `size`
508 @param size The number of characters pointed to
511 @param ec Set to the error, if any occurred.
520 /** Parse a buffer containing all or part of a complete JSON.
522 This function parses JSON contained in the
523 specified character buffer. If parsing completes,
524 any additional characters past the end of the
525 complete JSON are ignored. The function returns the
526 actual number of characters parsed, which may be
527 less than the size of the input. This allows parsing
528 of a buffer containing multiple individual JSONs or
529 containing different protocol data.
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write_some( "[1,2" );             // parse the first part of the JSON
assert( n == 4 );                       // all characters consumed
n = p.write_some( ",3,4] null" );       // parse the rest of the JSON
assert( n == 6 );                       // only some characters consumed
value jv = p.release();                 // take ownership of the value
544 To indicate there are no more character buffers,
545 such as when @ref done returns `false` after
546 writing, call @ref finish.
551 @par Exception Safety
553 Calls to `memory_resource::allocate` may throw.
554 Upon error or exception, subsequent calls will
555 fail until @ref reset is called to parse a new JSON.
557 @return The number of characters consumed from
560 @param data A pointer to a buffer of `size`
563 @param size The number of characters pointed to
566 @throw system_error Thrown on error.
574 /** Parse a buffer containing all or part of a complete JSON.
576 This function parses JSON contained in the
577 specified character buffer. If parsing completes,
578 any additional characters past the end of the
579 complete JSON are ignored. The function returns the
580 actual number of characters parsed, which may be
581 less than the size of the input. This allows parsing
582 of a buffer containing multiple individual JSONs or
583 containing different protocol data.
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write_some( "[1,2" );             // parse the first part of the JSON
assert( n == 4 );                       // all characters consumed
n = p.write_some( ",3,4] null" );       // parse the rest of the JSON
assert( n == 6 );                       // only some characters consumed
value jv = p.release();                 // take ownership of the value
598 To indicate there are no more character buffers,
599 such as when @ref done returns `false` after
600 writing, call @ref finish.
605 @par Exception Safety
607 Calls to `memory_resource::allocate` may throw.
608 Upon error or exception, subsequent calls will
609 fail until @ref reset is called to parse a new JSON.
611 @return The number of characters consumed from
614 @param s The character string to parse.
616 @param ec Set to the error, if any occurred.
624 s.data(), s.size(), ec);
627 /** Parse a buffer containing all or part of a complete JSON.
629 This function parses JSON contained in the
630 specified character buffer. If parsing completes,
631 any additional characters past the end of the
632 complete JSON are ignored. The function returns the
633 actual number of characters parsed, which may be
634 less than the size of the input. This allows parsing
635 of a buffer containing multiple individual JSONs or
636 containing different protocol data.
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write_some( "[1,2" );             // parse the first part of the JSON
assert( n == 4 );                       // all characters consumed
n = p.write_some( ",3,4] null" );       // parse the rest of the JSON
assert( n == 6 );                       // only some characters consumed
value jv = p.release();                 // take ownership of the value
651 To indicate there are no more character buffers,
652 such as when @ref done returns `false` after
653 writing, call @ref finish.
658 @par Exception Safety
660 Calls to `memory_resource::allocate` may throw.
661 Upon error or exception, subsequent calls will
662 fail until @ref reset is called to parse a new JSON.
664 @return The number of characters consumed from
667 @param s The character string to parse.
669 @throw system_error Thrown on error.
679 /** Parse a buffer containing all or part of a complete JSON.
This function parses all or part of a JSON
contained in the specified character buffer. The
entire buffer must be consumed; if there are
additional characters past the end of the complete
JSON, the parse fails and an error is returned.
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write( "[1,2" );                  // parse some of the JSON
assert( n == 4 );                       // all characters consumed
n = p.write( ",3,4]" );                 // parse the rest of the JSON
assert( n == 5 );                       // all characters consumed
value jv = p.release();                 // take ownership of the value
700 To indicate there are no more character buffers,
701 such as when @ref done returns `false` after
702 writing, call @ref finish.
707 @par Exception Safety
709 Calls to `memory_resource::allocate` may throw.
710 Upon error or exception, subsequent calls will
711 fail until @ref reset is called to parse a new JSON.
713 @return The number of characters consumed from
716 @param data A pointer to a buffer of `size`
719 @param size The number of characters pointed to
722 @param ec Set to the error, if any occurred.
731 /** Parse a buffer containing all or part of a complete JSON.
This function parses all or part of a JSON
contained in the specified character buffer. The
entire buffer must be consumed; if there are
additional characters past the end of the complete
JSON, the parse fails and an error is returned.
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write( "[1,2" );                  // parse some of the JSON
assert( n == 4 );                       // all characters consumed
n = p.write( ",3,4]" );                 // parse the rest of the JSON
assert( n == 5 );                       // all characters consumed
value jv = p.release();                 // take ownership of the value
752 To indicate there are no more character buffers,
753 such as when @ref done returns `false` after
754 writing, call @ref finish.
759 @par Exception Safety
761 Calls to `memory_resource::allocate` may throw.
762 Upon error or exception, subsequent calls will
763 fail until @ref reset is called to parse a new JSON.
765 @return The number of characters consumed from
768 @param data A pointer to a buffer of `size`
771 @param size The number of characters pointed to
774 @throw system_error Thrown on error.
782 /** Parse a buffer containing all or part of a complete JSON.
This function parses all or part of a JSON
contained in the specified character buffer. The
entire buffer must be consumed; if there are
additional characters past the end of the complete
JSON, the parse fails and an error is returned.
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write( "[1,2" );                  // parse some of the JSON
assert( n == 4 );                       // all characters consumed
n = p.write( ",3,4]" );                 // parse the rest of the JSON
assert( n == 5 );                       // all characters consumed
value jv = p.release();                 // take ownership of the value
803 To indicate there are no more character buffers,
804 such as when @ref done returns `false` after
805 writing, call @ref finish.
810 @par Exception Safety
812 Calls to `memory_resource::allocate` may throw.
813 Upon error or exception, subsequent calls will
814 fail until @ref reset is called to parse a new JSON.
816 @return The number of characters consumed from
819 @param s The character string to parse.
821 @param ec Set to the error, if any occurred.
829 s.data(), s.size(), ec);
832 /** Parse a buffer containing all or part of a complete JSON.
This function parses all or part of a JSON
contained in the specified character buffer. The
entire buffer must be consumed; if there are
additional characters past the end of the complete
JSON, the parse fails and an error is returned.
stream_parser p;                        // construct a parser
std::size_t n;                          // number of characters used
n = p.write( "[1,2" );                  // parse some of the JSON
assert( n == 4 );                       // all characters consumed
n = p.write( ",3,4]" );                 // parse the rest of the JSON
assert( n == 5 );                       // all characters consumed
value jv = p.release();                 // take ownership of the value
853 To indicate there are no more character buffers,
854 such as when @ref done returns `false` after
855 writing, call @ref finish.
860 @par Exception Safety
862 Calls to `memory_resource::allocate` may throw.
863 Upon error or exception, subsequent calls will
864 fail until @ref reset is called to parse a new JSON.
866 @return The number of characters consumed from
869 @param s The character string to parse.
871 @throw system_error Thrown on error.
881 /** Indicate the end of JSON input.
This function is used to indicate that there
are no more character buffers in the current
JSON being parsed. If the resulting JSON is
incomplete, the error is set to indicate a
890 In the code below, @ref finish is called to
891 indicate there are no more digits in the
stream_parser p;                        // construct a parser
p.write( "3." );                        // write the first part of the number
p.write( "14" );                        // write the second part of the number
assert( ! p.done() );                   // there could be more digits
p.finish();                             // indicate the end of the JSON input
assert( p.done() );                     // now we are finished
value jv = p.release();                 // take ownership of the value
906 @par Exception Safety
908 Calls to `memory_resource::allocate` may throw.
909 Upon error or exception, subsequent calls will
910 fail until @ref reset is called to parse a new JSON.
912 @param ec Set to the error, if any occurred.
916 finish(error_code& ec);
918 /** Indicate the end of JSON input.
This function is used to indicate that there
are no more character buffers in the current
JSON being parsed. If the resulting JSON is
incomplete, the error is set to indicate a
927 In the code below, @ref finish is called to
928 indicate there are no more digits in the
stream_parser p;                        // construct a parser
p.write( "3." );                        // write the first part of the number
p.write( "14" );                        // write the second part of the number
assert( ! p.done() );                   // there could be more digits
p.finish();                             // indicate the end of the JSON input
assert( p.done() );                     // now we are finished
value jv = p.release();                 // take ownership of the value
943 @par Exception Safety
945 Calls to `memory_resource::allocate` may throw.
946 Upon error or exception, subsequent calls will
947 fail until @ref reset is called to parse a new JSON.
949 @throw system_error Thrown on error.
955 /** Return the parsed JSON as a @ref value.
957 This returns the parsed value, or throws
958 an exception if the parsing is incomplete or
959 failed. It is necessary to call @ref reset
960 after calling this function in order to parse
973 @return The parsed value. Ownership of this
974 value is transferred to the caller.
976 @throw system_error Thrown on failure.