// Copyright (c) 2019 Vinnie Falco (vinnie.falco@gmail.com)
// Distributed under the Boost Software License, Version 1.0. (See accompanying
// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
// Official repository: https://github.com/boostorg/json

#ifndef BOOST_JSON_STREAM_PARSER_HPP
#define BOOST_JSON_STREAM_PARSER_HPP

#include <boost/json/detail/config.hpp>
#include <boost/json/basic_parser.hpp>
#include <boost/json/parse_options.hpp>
#include <boost/json/storage_ptr.hpp>
#include <boost/json/value.hpp>
#include <boost/json/detail/handler.hpp>
#include <type_traits>
//----------------------------------------------------------

/** A DOM parser for JSON contained in multiple buffers.

This class is used to parse a JSON contained in a
series of one or more character buffers, into a
@ref value container. It implements a
<a href="https://en.wikipedia.org/wiki/Streaming_algorithm">
<em>streaming algorithm</em></a>, allowing these

@li Parse a JSON file a piece at a time.

@li Parse incoming JSON as it arrives,

@li Parse with bounded resource consumption

To use the parser first construct it, then optionally
call @ref reset to specify a @ref storage_ptr to use
for the resulting @ref value. Then call @ref write
one or more times to parse a single, complete JSON.
Call @ref done to determine if the parse has completed.
To indicate there are no more buffers, call @ref finish.
If the parse is successful, call @ref release to take
ownership of the value:

stream_parser p; // construct a parser
p.write( "[1,2" ); // parse some of a JSON
p.write( ",3,4]" ); // parse the rest of the JSON
assert( p.done() ); // we have a complete JSON
value jv = p.release(); // take ownership of the value
When the character buffer provided as input contains
additional data that is not part of the complete
JSON, an error is returned. The @ref write_some
function is an alternative which allows the parse
to finish early, without consuming all the characters
in the buffer. This allows parsing of a buffer
containing multiple individual JSONs or containing
different protocol data:

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write_some( "[1,2" ); // parse some of a JSON
assert( n == 4 ); // all characters consumed
n = p.write_some( ",3,4] null" ); // parse the remainder of the JSON
assert( n == 6 ); // only some characters consumed
assert( p.done() ); // we have a complete JSON
value jv = p.release(); // take ownership of the value
@par Temporary Storage

The parser may dynamically allocate temporary
storage as needed to accommodate the nesting level
of the JSON being parsed. Temporary storage is
first obtained from an optional, caller-owned
buffer specified upon construction. When that
is exhausted, the next allocation uses the
@ref memory_resource passed to the constructor; if
no such argument is specified, the default memory
resource is used. Temporary storage is freed only
when the parser is destroyed. The performance of
parsing multiple JSONs may therefore be improved by
reusing the same parser instance.
It is important to note that the @ref memory_resource
supplied upon construction is used for temporary
storage only, and not for allocating the elements
which make up the parsed value. That other memory
resource is optionally supplied in each call

If there are object elements with duplicate keys;
that is, if multiple elements in an object have
keys that compare equal, only the last equivalent
element will be inserted.
@par Non-Standard JSON

The @ref parse_options structure optionally
provided upon construction is used to customize
some parameters of the parser, including which
non-standard JSON extensions should be allowed.
A default-constructed parse options allows only

Distinct instances may be accessed concurrently.
Non-const member functions of a shared instance
may not be called concurrently with any other
member functions of that instance.
basic_parser<detail::handler> p_;

/// Copy constructor (deleted)
stream_parser const&) = delete;

/// Copy assignment (deleted)
stream_parser& operator=(
    stream_parser const&) = delete;

All dynamically allocated memory, including
any incomplete parsing results, is freed.

Linear in the size of partial results

@par Exception Safety

~stream_parser() = default;
This constructs a new parser which first uses
the caller-owned storage pointed to by `buffer`
for temporary storage, falling back to the memory
resource `sp` if needed. The parser will use the
specified parsing options.

The parsed value will use the default memory
resource for storage. To use a different resource,
call @ref reset after construction.
@par Exception Safety

@param sp The memory resource to use for
temporary storage after `buffer` is exhausted.

@param opt The parsing options to use.

@param buffer A pointer to valid memory of at least
`size` bytes for the parser to use for temporary storage.
Ownership is not transferred; the caller is responsible
for ensuring the lifetime of the memory pointed to by
`buffer` extends until the parser is destroyed.

@param size The number of valid bytes in `buffer`.

parse_options const& opt,
unsigned char* buffer,
std::size_t size) noexcept;
This constructs a new parser which uses the default
memory resource for temporary storage, and accepts

The parsed value will use the default memory
resource for storage. To use a different resource,
call @ref reset after construction.

@par Exception Safety

stream_parser() noexcept
    : stream_parser({}, {})

This constructs a new parser which uses the
specified memory resource for temporary storage,
and is configured to use the specified parsing

The parsed value will use the default memory
resource for storage. To use a different resource,
call @ref reset after construction.

@par Exception Safety

@param sp The memory resource to use for temporary storage.

@param opt The parsing options to use.

parse_options const& opt) noexcept;

This constructs a new parser which uses the
specified memory resource for temporary storage,
and accepts only strict JSON.

The parsed value will use the default memory
resource for storage. To use a different resource,
call @ref reset after construction.

@par Exception Safety

@param sp The memory resource to use for temporary storage.

stream_parser(storage_ptr sp) noexcept
    : stream_parser(std::move(sp), {})
This constructs a new parser which first uses the
caller-owned storage `buffer` for temporary storage,
falling back to the memory resource `sp` if needed.
The parser will use the specified parsing options.

The parsed value will use the default memory
resource for storage. To use a different resource,
call @ref reset after construction.

@par Exception Safety

@param sp The memory resource to use for
temporary storage after `buffer` is exhausted.

@param opt The parsing options to use.

@param buffer A buffer for the parser to use for
temporary storage. Ownership is not transferred;
the caller is responsible for ensuring the lifetime
of `buffer` extends until the parser is destroyed.

template<std::size_t N>

parse_options const& opt,
unsigned char(&buffer)[N]) noexcept
    : stream_parser(std::move(sp),
#if defined(__cpp_lib_byte) || defined(BOOST_JSON_DOCS)

This constructs a new parser which first uses
the caller-owned storage pointed to by `buffer`
for temporary storage, falling back to the memory
resource `sp` if needed. The parser will use the
specified parsing options.

The parsed value will use the default memory
resource for storage. To use a different resource,
call @ref reset after construction.

@par Exception Safety

@param sp The memory resource to use for
temporary storage after `buffer` is exhausted.

@param opt The parsing options to use.

@param buffer A pointer to valid memory of at least
`size` bytes for the parser to use for temporary storage.
Ownership is not transferred; the caller is responsible
for ensuring the lifetime of the memory pointed to by
`buffer` extends until the parser is destroyed.

@param size The number of valid bytes in `buffer`.

parse_options const& opt,
std::size_t size) noexcept
    : stream_parser(sp, opt, reinterpret_cast<
        unsigned char*>(buffer), size)
This constructs a new parser which first uses the
caller-owned storage `buffer` for temporary storage,
falling back to the memory resource `sp` if needed.
The parser will use the specified parsing options.

The parsed value will use the default memory
resource for storage. To use a different resource,
call @ref reset after construction.

@par Exception Safety

@param sp The memory resource to use for
temporary storage after `buffer` is exhausted.

@param opt The parsing options to use.

@param buffer A buffer for the parser to use for
temporary storage. Ownership is not transferred;
the caller is responsible for ensuring the lifetime
of `buffer` extends until the parser is destroyed.

template<std::size_t N>

parse_options const& opt,
std::byte(&buffer)[N]) noexcept
    : stream_parser(std::move(sp),
#ifndef BOOST_JSON_DOCS
// Safety net for accidental buffer overflows
template<std::size_t N>

parse_options const& opt,
unsigned char(&buffer)[N],
std::size_t n) noexcept
    : stream_parser(std::move(sp),

// If this goes off, check your parameters
// closely, chances are you passed an array
// thinking it was a pointer.
BOOST_ASSERT(n <= N);

#ifdef __cpp_lib_byte
// Safety net for accidental buffer overflows
template<std::size_t N>

parse_options const& opt,
std::byte(&buffer)[N], std::size_t n) noexcept
    : stream_parser(std::move(sp),

// If this goes off, check your parameters
// closely, chances are you passed an array
// thinking it was a pointer.
BOOST_ASSERT(n <= N);
/** Reset the parser for a new JSON.

This function is used to reset the parser to
prepare it for parsing a new complete JSON.
Any previous partial results are destroyed.

Constant or linear in the size of any previous
partial parsing results.

@par Exception Safety

@param sp A pointer to the @ref memory_resource
to use for the resulting @ref value. The parser
will acquire shared ownership.

reset(storage_ptr sp = {}) noexcept;
/** Return true if a complete JSON has been parsed.

This function returns `true` when all of these

@li A complete serialized JSON has been
presented to the parser, and

@li No error has occurred since the parser
was constructed, or since the last call

@par Exception Safety

done() const noexcept
/** Parse a buffer containing all or part of a complete JSON.

This function parses JSON contained in the
specified character buffer. If parsing completes,
any additional characters past the end of the
complete JSON are ignored. The function returns the
actual number of characters parsed, which may be
less than the size of the input. This allows parsing
of a buffer containing multiple individual JSONs or
containing different protocol data.

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write_some( "[1,2" ); // parse the first part of the JSON
assert( n == 4 ); // all characters consumed
n = p.write_some( "3,4] null" ); // parse the rest of the JSON
assert( n == 5 ); // only some characters consumed
value jv = p.release(); // take ownership of the value

To indicate there are no more character buffers,
such as when @ref done returns `false` after
writing, call @ref finish.
@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@return The number of characters consumed from

@param data A pointer to a buffer of `size`

@param size The number of characters pointed to

@param ec Set to the error, if any occurred.

std::error_code& ec);

/** Parse a buffer containing all or part of a complete JSON.

This function parses JSON contained in the
specified character buffer. If parsing completes,
any additional characters past the end of the
complete JSON are ignored. The function returns the
actual number of characters parsed, which may be
less than the size of the input. This allows parsing
of a buffer containing multiple individual JSONs or
containing different protocol data.

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write_some( "[1,2" ); // parse the first part of the JSON
assert( n == 4 ); // all characters consumed
n = p.write_some( "3,4] null" ); // parse the rest of the JSON
assert( n == 5 ); // only some characters consumed
value jv = p.release(); // take ownership of the value

To indicate there are no more character buffers,
such as when @ref done returns `false` after
writing, call @ref finish.

@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@return The number of characters consumed from

@param data A pointer to a buffer of `size`

@param size The number of characters pointed to

@throw system_error Thrown on error.
/** Parse a buffer containing all or part of a complete JSON.

This function parses JSON contained in the
specified character buffer. If parsing completes,
any additional characters past the end of the
complete JSON are ignored. The function returns the
actual number of characters parsed, which may be
less than the size of the input. This allows parsing
of a buffer containing multiple individual JSONs or
containing different protocol data.

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write_some( "[1,2" ); // parse the first part of the JSON
assert( n == 4 ); // all characters consumed
n = p.write_some( "3,4] null" ); // parse the rest of the JSON
assert( n == 5 ); // only some characters consumed
value jv = p.release(); // take ownership of the value

To indicate there are no more character buffers,
such as when @ref done returns `false` after
writing, call @ref finish.

@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@return The number of characters consumed from

@param s The character string to parse.

@param ec Set to the error, if any occurred.

s.data(), s.size(), ec);

s.data(), s.size(), ec);
/** Parse a buffer containing all or part of a complete JSON.

This function parses JSON contained in the
specified character buffer. If parsing completes,
any additional characters past the end of the
complete JSON are ignored. The function returns the
actual number of characters parsed, which may be
less than the size of the input. This allows parsing
of a buffer containing multiple individual JSONs or
containing different protocol data.

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write_some( "[1,2" ); // parse the first part of the JSON
assert( n == 4 ); // all characters consumed
n = p.write_some( "3,4] null" ); // parse the rest of the JSON
assert( n == 5 ); // only some characters consumed
value jv = p.release(); // take ownership of the value

To indicate there are no more character buffers,
such as when @ref done returns `false` after
writing, call @ref finish.

@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@return The number of characters consumed from

@param s The character string to parse.

@throw system_error Thrown on error.
/** Parse a buffer containing all or part of a complete JSON.

This function parses all or part of a JSON
contained in the specified character buffer. The
entire buffer must be consumed; if there are
additional characters past the end of the complete
JSON, the parse fails and an error is returned.

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write( "[1,2" ); // parse some of the JSON
assert( n == 4 ); // all characters consumed
n = p.write( "3,4]" ); // parse the rest of the JSON
assert( n == 4 ); // all characters consumed
value jv = p.release(); // take ownership of the value

To indicate there are no more character buffers,
such as when @ref done returns `false` after
writing, call @ref finish.
@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@return The number of characters consumed from

@param data A pointer to a buffer of `size`

@param size The number of characters pointed to

@param ec Set to the error, if any occurred.

std::error_code& ec);
/** Parse a buffer containing all or part of a complete JSON.

This function parses all or part of a JSON
contained in the specified character buffer. The
entire buffer must be consumed; if there are
additional characters past the end of the complete
JSON, the parse fails and an error is returned.

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write( "[1,2" ); // parse some of the JSON
assert( n == 4 ); // all characters consumed
n = p.write( "3,4]" ); // parse the rest of the JSON
assert( n == 4 ); // all characters consumed
value jv = p.release(); // take ownership of the value

To indicate there are no more character buffers,
such as when @ref done returns `false` after
writing, call @ref finish.

@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@return The number of characters consumed from

@param data A pointer to a buffer of `size`

@param size The number of characters pointed to

@throw system_error Thrown on error.
/** Parse a buffer containing all or part of a complete JSON.

This function parses all or part of a JSON
contained in the specified character buffer. The
entire buffer must be consumed; if there are
additional characters past the end of the complete
JSON, the parse fails and an error is returned.

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write( "[1,2" ); // parse some of the JSON
assert( n == 4 ); // all characters consumed
n = p.write( "3,4]" ); // parse the rest of the JSON
assert( n == 4 ); // all characters consumed
value jv = p.release(); // take ownership of the value

To indicate there are no more character buffers,
such as when @ref done returns `false` after
writing, call @ref finish.

@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@return The number of characters consumed from

@param s The character string to parse.

@param ec Set to the error, if any occurred.

s.data(), s.size(), ec);

s.data(), s.size(), ec);
/** Parse a buffer containing all or part of a complete JSON.

This function parses all or part of a JSON
contained in the specified character buffer. The
entire buffer must be consumed; if there are
additional characters past the end of the complete
JSON, the parse fails and an error is returned.

stream_parser p; // construct a parser
std::size_t n; // number of characters used
n = p.write( "[1,2" ); // parse some of the JSON
assert( n == 4 ); // all characters consumed
n = p.write( "3,4]" ); // parse the rest of the JSON
assert( n == 4 ); // all characters consumed
value jv = p.release(); // take ownership of the value

To indicate there are no more character buffers,
such as when @ref done returns `false` after
writing, call @ref finish.

@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@return The number of characters consumed from

@param s The character string to parse.

@throw system_error Thrown on error.
/** Indicate the end of JSON input.

This function is used to indicate that there
are no more character buffers in the current
JSON being parsed. If the resulting JSON is
incomplete, the error is set to indicate a

In the code below, @ref finish is called to
indicate there are no more digits in the

stream_parser p; // construct a parser
p.write( "3." ); // write the first part of the number
p.write( "14" ); // write the second part of the number
assert( ! p.done() ); // there could be more digits
p.finish(); // indicate the end of the JSON input
assert( p.done() ); // now we are finished
value jv = p.release(); // take ownership of the value

@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.
@param ec Set to the error, if any occurred.

finish(error_code& ec);

finish(std::error_code& ec);
/** Indicate the end of JSON input.

This function is used to indicate that there
are no more character buffers in the current
JSON being parsed. If the resulting JSON is
incomplete, the error is set to indicate a

In the code below, @ref finish is called to
indicate there are no more digits in the

stream_parser p; // construct a parser
p.write( "3." ); // write the first part of the number
p.write( "14" ); // write the second part of the number
assert( ! p.done() ); // there could be more digits
p.finish(); // indicate the end of the JSON input
assert( p.done() ); // now we are finished
value jv = p.release(); // take ownership of the value

@par Exception Safety

Calls to `memory_resource::allocate` may throw.
Upon error or exception, subsequent calls will
fail until @ref reset is called to parse a new JSON.

@throw system_error Thrown on error.
/** Return the parsed JSON as a @ref value.

This returns the parsed value, or throws
an exception if the parsing is incomplete or
failed. It is necessary to call @ref reset
after calling this function in order to parse

if( ! this->done() )

@return The parsed value. Ownership of this
value is transferred to the caller.

@throw system_error Thrown on failure.