2 // Copyright (c) 2019 Vinnie Falco (vinnie.falco@gmail.com)
3 // Copyright (c) 2020 Krystian Stasiowski (sdkrystian@gmail.com)
5 // Distributed under the Boost Software License, Version 1.0. (See accompanying
6 // file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
8 // Official repository: https://github.com/boostorg/json
11 #ifndef BOOST_JSON_BASIC_PARSER_HPP
12 #define BOOST_JSON_BASIC_PARSER_HPP
14 #include <boost/json/detail/config.hpp>
15 #include <boost/json/detail/except.hpp>
16 #include <boost/json/error.hpp>
17 #include <boost/json/kind.hpp>
18 #include <boost/json/parse_options.hpp>
19 #include <boost/json/detail/stack.hpp>
20 #include <boost/json/detail/stream.hpp>
21 #include <boost/json/detail/utf8.hpp>
25 This file is in the detail namespace because it
26 is not allowed to be included directly by users,
27 who should be including <boost/json/basic_parser.hpp>
28 instead, which provides the member function definitions.
30 The source code is arranged this way to keep compile
36 /** An incremental SAX parser for serialized JSON.
38 This implements a SAX-style parser, invoking a
39 caller-supplied handler with each parsing event.
40 To use, first declare a variable of type
41 `basic_parser<T>` where `T` meets the handler
42 requirements specified below. Then call
43 @ref write_some one or more times with the input,
44 setting `more = false` on the final buffer.
45 The parsing events are realized through member
46 function calls on the handler, which exists
47 as a data member of the parser.
49 The parser may dynamically allocate intermediate
50 storage as needed to accommodate the nesting level
51 of the input JSON. On subsequent invocations, the
52 parser can cheaply re-use this memory, improving
53 performance. This storage is freed when the
58 To get the declaration and function definitions
59 for this class it is necessary to include this file:
62 #include <boost/json/basic_parser_impl.hpp>
65 Users who wish to parse JSON into the DOM container
66 @ref value will not use this class directly; instead
67 they will create an instance of @ref parser or
68 @ref stream_parser. Alternatively,
69 they may call the function @ref parse. This class is
70 designed for users who wish to perform custom actions
71 instead of building a @ref value, for example
72 to populate a DOM type from an external library.
76 By default, only conforming JSON using UTF-8
77 encoding is accepted. However, selected non-compliant
78 syntax can be allowed by constructing the parser with a
79 @ref parse_options set to the desired values.
83 The handler provided must be implemented as an
84 object of class type which defines each of the
85 required event member functions below. The event
86 functions return a `bool` where `true` indicates
87 success, and `false` indicates failure. If the
88 member function returns `false`, it must set
89 the error code to a suitable value. This error
90 code will be returned by the write function to
93 Handlers are required to declare the maximum
94 limits on various elements. If these limits
95 are exceeded during parsing, then parsing
98 The following declaration meets the parser's
104 /// The maximum number of elements allowed in an array
105 static constexpr std::size_t max_array_size = -1;
107 /// The maximum number of elements allowed in an object
108 static constexpr std::size_t max_object_size = -1;
110 /// The maximum number of characters allowed in a string
111 static constexpr std::size_t max_string_size = -1;
113 /// The maximum number of characters allowed in a key
114 static constexpr std::size_t max_key_size = -1;
116 /// Called once when the JSON parsing begins.
118 /// @return `true` on success.
119 /// @param ec Set to the error, if any occurred.
121 bool on_document_begin( error_code& ec );
123 /// Called when the JSON parsing is done.
125 /// @return `true` on success.
126 /// @param ec Set to the error, if any occurred.
128 bool on_document_end( error_code& ec );
130 /// Called when the beginning of an array is encountered.
132 /// @return `true` on success.
133 /// @param ec Set to the error, if any occurred.
135 bool on_array_begin( error_code& ec );
137 /// Called when the end of the current array is encountered.
139 /// @return `true` on success.
140 /// @param n The number of elements in the array.
141 /// @param ec Set to the error, if any occurred.
143 bool on_array_end( std::size_t n, error_code& ec );
145 /// Called when the beginning of an object is encountered.
147 /// @return `true` on success.
148 /// @param ec Set to the error, if any occurred.
150 bool on_object_begin( error_code& ec );
152 /// Called when the end of the current object is encountered.
154 /// @return `true` on success.
155 /// @param n The number of elements in the object.
156 /// @param ec Set to the error, if any occurred.
158 bool on_object_end( std::size_t n, error_code& ec );
160 /// Called with characters corresponding to part of the current string.
162 /// @return `true` on success.
163 /// @param s The partial characters
164 /// @param n The total size of the string thus far
165 /// @param ec Set to the error, if any occurred.
167 bool on_string_part( string_view s, std::size_t n, error_code& ec );
169 /// Called with the last characters corresponding to the current string.
171 /// @return `true` on success.
172 /// @param s The remaining characters
173 /// @param n The total size of the string
174 /// @param ec Set to the error, if any occurred.
176 bool on_string( string_view s, std::size_t n, error_code& ec );
178 /// Called with characters corresponding to part of the current key.
180 /// @return `true` on success.
181 /// @param s The partial characters
182 /// @param n The total size of the key thus far
183 /// @param ec Set to the error, if any occurred.
185 bool on_key_part( string_view s, std::size_t n, error_code& ec );
187 /// Called with the last characters corresponding to the current key.
189 /// @return `true` on success.
190 /// @param s The remaining characters
191 /// @param n The total size of the key
192 /// @param ec Set to the error, if any occurred.
194 bool on_key( string_view s, std::size_t n, error_code& ec );
196 /// Called with the characters corresponding to part of the current number.
198 /// @return `true` on success.
199 /// @param s The partial characters
200 /// @param ec Set to the error, if any occurred.
202 bool on_number_part( string_view s, error_code& ec );
204 /// Called when a signed integer is parsed.
206 /// @return `true` on success.
207 /// @param i The value
208 /// @param s The remaining characters
209 /// @param ec Set to the error, if any occurred.
211 bool on_int64( int64_t i, string_view s, error_code& ec );
213 /// Called when an unsigned integer is parsed.
215 /// @return `true` on success.
216 /// @param u The value
217 /// @param s The remaining characters
218 /// @param ec Set to the error, if any occurred.
220 bool on_uint64( uint64_t u, string_view s, error_code& ec );
222 /// Called when a double is parsed.
224 /// @return `true` on success.
225 /// @param d The value
226 /// @param s The remaining characters
227 /// @param ec Set to the error, if any occurred.
229 bool on_double( double d, string_view s, error_code& ec );
231 /// Called when a boolean is parsed.
233 /// @return `true` on success.
234 /// @param b The value
236 /// @param ec Set to the error, if any occurred.
238 bool on_bool( bool b, error_code& ec );
240 /// Called when a null is parsed.
242 /// @return `true` on success.
243 /// @param ec Set to the error, if any occurred.
245 bool on_null( error_code& ec );
247 /// Called with characters corresponding to part of the current comment.
249 /// @return `true` on success.
250 /// @param s The partial characters.
251 /// @param ec Set to the error, if any occurred.
253 bool on_comment_part( string_view s, error_code& ec );
255 /// Called with the last characters corresponding to the current comment.
257 /// @return `true` on success.
258 /// @param s The remaining characters
259 /// @param ec Set to the error, if any occurred.
261 bool on_comment( string_view s, error_code& ec );
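As a rough illustration of the contract described above, the following standalone sketch shows a handler-like type that accumulates the pieces of a string and reports failure by setting the error code before returning `false`. It deliberately includes no headers from this library: `toy_handler`, `emit_string`, the rule that rejects empty strings, and the use of `std::error_code` and `std::string_view` in place of the library's own `error_code` and `string_view` are all assumptions made for illustration, and only the two string events are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for a handler; only the string events are sketched.
struct toy_handler
{
    // -1 converts to the largest std::size_t value, meaning "no limit".
    static constexpr std::size_t max_string_size = -1;

    std::string current;    // characters received so far
    std::size_t total = 0;  // value of n from the most recent event

    // Receives a leading piece of the string; n is the size so far.
    bool on_string_part(
        std::string_view s, std::size_t n, std::error_code& ec)
    {
        (void)ec; // nothing can go wrong in this sketch
        current.append(s.data(), s.size());
        total = n;
        return true;
    }

    // Receives the final piece of the string; n is the full size.
    bool on_string(
        std::string_view s, std::size_t n, std::error_code& ec)
    {
        current.append(s.data(), s.size());
        total = n;
        // Arbitrary rule for the sketch: empty strings are rejected.
        // The error code is set before returning false.
        if(current.empty())
        {
            ec = std::make_error_code(std::errc::invalid_argument);
            return false;
        }
        return true;
    }
};

// Hypothetical driver: delivers one string in two pieces, the way
// a parser might when the string spans two input buffers.
inline bool emit_string(
    toy_handler& h,
    std::string_view a,
    std::string_view b,
    std::error_code& ec)
{
    if(! h.on_string_part(a, a.size(), ec))
        return false;
    return h.on_string(b, a.size() + b.size(), ec);
}
```

In this sketch a handler that returns `false` stops the driver immediately, and the caller inspects `ec` to learn why.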
269 @headerfile <boost/json/basic_parser.hpp>
271 template<class Handler>
274 enum class state : char
276 doc1, doc2, doc3, doc4,
277 com1, com2, com3, com4,
280 fal1, fal2, fal3, fal4,
281 str1, str2, str3, str4,
282 str5, str6, str7, str8,
285 obj1, obj2, obj3, obj4,
286 obj5, obj6, obj7, obj8,
290 num1, num2, num3, num4,
291 num5, num6, num7, num8,
305 // optimization: must come first
311 detail::utf8_sequence seq_;
314 bool more_; // false for final buffer
315 bool done_ = false; // true on complete parse
316 bool clean_ = true; // write_some exited cleanly
319 // how many levels deeper the parser can go
320 std::size_t depth_ = opt_.max_depth;
322 inline void reserve();
323 inline const char* sentinel();
324 inline bool incomplete(
325 const detail::const_stream_wrapper& cs);
327 #ifdef __INTEL_COMPILER
329 #pragma warning disable 2196
335 suspend_or_fail(state st);
347 fail(const char* p) noexcept;
355 source_location const* loc) noexcept;
395 #ifdef __INTEL_COMPILER
399 template<bool StackEmpty_/*, bool Terminal_*/>
400 const char* parse_comment(const char* p,
401 std::integral_constant<bool, StackEmpty_> stack_empty,
402 /*std::integral_constant<bool, Terminal_>*/ bool terminal);
404 template<bool StackEmpty_>
405 const char* parse_document(const char* p,
406 std::integral_constant<bool, StackEmpty_> stack_empty);
408 template<bool StackEmpty_, bool AllowComments_/*,
409 bool AllowTrailing_, bool AllowBadUTF8_*/>
410 const char* parse_value(const char* p,
411 std::integral_constant<bool, StackEmpty_> stack_empty,
412 std::integral_constant<bool, AllowComments_> allow_comments,
413 /*std::integral_constant<bool, AllowTrailing_>*/ bool allow_trailing,
414 /*std::integral_constant<bool, AllowBadUTF8_>*/ bool allow_bad_utf8);
416 template<bool StackEmpty_, bool AllowComments_/*,
417 bool AllowTrailing_, bool AllowBadUTF8_*/>
418 const char* resume_value(const char* p,
419 std::integral_constant<bool, StackEmpty_> stack_empty,
420 std::integral_constant<bool, AllowComments_> allow_comments,
421 /*std::integral_constant<bool, AllowTrailing_>*/ bool allow_trailing,
422 /*std::integral_constant<bool, AllowBadUTF8_>*/ bool allow_bad_utf8);
424 template<bool StackEmpty_, bool AllowComments_/*,
425 bool AllowTrailing_, bool AllowBadUTF8_*/>
426 const char* parse_object(const char* p,
427 std::integral_constant<bool, StackEmpty_> stack_empty,
428 std::integral_constant<bool, AllowComments_> allow_comments,
429 /*std::integral_constant<bool, AllowTrailing_>*/ bool allow_trailing,
430 /*std::integral_constant<bool, AllowBadUTF8_>*/ bool allow_bad_utf8);
432 template<bool StackEmpty_, bool AllowComments_/*,
433 bool AllowTrailing_, bool AllowBadUTF8_*/>
434 const char* parse_array(const char* p,
435 std::integral_constant<bool, StackEmpty_> stack_empty,
436 std::integral_constant<bool, AllowComments_> allow_comments,
437 /*std::integral_constant<bool, AllowTrailing_>*/ bool allow_trailing,
438 /*std::integral_constant<bool, AllowBadUTF8_>*/ bool allow_bad_utf8);
440 template<bool StackEmpty_>
441 const char* parse_null(const char* p,
442 std::integral_constant<bool, StackEmpty_> stack_empty);
444 template<bool StackEmpty_>
445 const char* parse_true(const char* p,
446 std::integral_constant<bool, StackEmpty_> stack_empty);
448 template<bool StackEmpty_>
449 const char* parse_false(const char* p,
450 std::integral_constant<bool, StackEmpty_> stack_empty);
452 template<bool StackEmpty_, bool IsKey_/*,
453 bool AllowBadUTF8_*/>
454 const char* parse_string(const char* p,
455 std::integral_constant<bool, StackEmpty_> stack_empty,
456 std::integral_constant<bool, IsKey_> is_key,
457 /*std::integral_constant<bool, AllowBadUTF8_>*/ bool allow_bad_utf8);
459 template<bool StackEmpty_, char First_>
460 const char* parse_number(const char* p,
461 std::integral_constant<bool, StackEmpty_> stack_empty,
462 std::integral_constant<char, First_> first);
464 template<bool StackEmpty_, bool IsKey_/*,
465 bool AllowBadUTF8_*/>
466 const char* parse_unescaped(const char* p,
467 std::integral_constant<bool, StackEmpty_> stack_empty,
468 std::integral_constant<bool, IsKey_> is_key,
469 /*std::integral_constant<bool, AllowBadUTF8_>*/ bool allow_bad_utf8);
471 template<bool StackEmpty_/*, bool IsKey_,
472 bool AllowBadUTF8_*/>
473 const char* parse_escaped(
476 std::integral_constant<bool, StackEmpty_> stack_empty,
477 /*std::integral_constant<bool, IsKey_>*/ bool is_key,
478 /*std::integral_constant<bool, AllowBadUTF8_>*/ bool allow_bad_utf8);
480 // intentionally private
482 depth() const noexcept
484 return opt_.max_depth - depth_;
488 /// Copy constructor (deleted)
490 basic_parser const&) = delete;
492 /// Copy assignment (deleted)
493 basic_parser& operator=(
494 basic_parser const&) = delete;
498 All dynamically allocated internal memory is freed.
502 this->handler().~Handler()
506 Same as `~Handler()`.
508 @par Exception Safety
509 Same as `~Handler()`.
511 ~basic_parser() = default;
515 This function constructs the parser with
516 the specified options, with any additional
517 arguments forwarded to the handler's constructor.
520 Same as `Handler( std::forward< Args >( args )... )`.
522 @par Exception Safety
523 Same as `Handler( std::forward< Args >( args )... )`.
525 @param opt Configuration settings for the parser.
526 If this structure is default constructed, the
527 parser will accept only standard JSON.
529 @param args Optional additional arguments
530 forwarded to the handler's constructor.
534 template<class... Args>
537 parse_options const& opt,
540 /** Return a reference to the handler.
542 This function provides access to the constructed
543 instance of the handler owned by the parser.
548 @par Exception Safety
557 /** Return a reference to the handler.
559 This function provides access to the constructed
560 instance of the handler owned by the parser.
565 @par Exception Safety
569 handler() const noexcept
574 /** Return the last error.
576 This returns the last error code which
577 was generated in the most recent call
583 @par Exception Safety
587 last_error() const noexcept
592 /** Return true if a complete JSON has been parsed.
594 This function returns `true` when all of these
597 @li A complete serialized JSON has been
598 presented to the parser, and
600 @li No error or exception has occurred since the
601 parser was constructed, or since the last call
607 @par Exception Safety
611 done() const noexcept
616 /** Reset the state, to parse a new document.
618 This function discards the current parsing
619 state, to prepare for parsing a new document.
620 Dynamically allocated temporary memory used
621 by the implementation is not deallocated.
626 @par Exception Safety
632 /** Indicate a parsing failure.
634 This changes the state of the parser to indicate
635 that the parse has failed. A parser implementation
636 can use this to fail the parser if needed due to
641 If `!ec`, the stored error code is unspecified.
646 @par Exception Safety
649 @param ec The error code to set. If the code does
650 not indicate failure, an implementation-defined
651 error code that indicates failure will be stored
655 fail(error_code ec) noexcept;
657 /** Parse some of an input string as JSON, incrementally.
659 This function parses the JSON in the specified
660 buffer, calling the handler to emit each SAX
661 parsing event. The parse proceeds from the
662 current state, which is at the beginning of a
663 new JSON or in the middle of the current JSON
664 if any characters were already parsed.
666 The characters in the buffer are processed
667 starting from the beginning, until one of the
668 following conditions is met:
670 @li All of the characters in the buffer
673 @li Some of the characters in the buffer
674 have been parsed and the JSON is complete, or
676 @li A parsing error occurs.
678 The supplied buffer does not need to contain the
679 entire JSON. Subsequent calls can provide more
680 serialized data, allowing JSON to be processed
681 incrementally. The end of the serialized JSON
682 can be indicated by passing `more = false`.
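
The calling pattern described here can be sketched without the real parser. The `toy_parser` below is a hypothetical stand-in and not part of this library: it only tracks `[` and `]` nesting, but its `write_some` mimics the documented shape by consuming characters until the buffer is exhausted, the top-level value closes, or an error occurs, and by returning the number of characters consumed. The helper `feed_all` and the `std::errc` values are likewise assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical stand-in: accepts only nested '[' and ']' characters.
class toy_parser
{
    std::size_t depth_ = 0; // current nesting level
    bool done_ = false;     // true once the top-level array closes

public:
    bool done() const noexcept { return done_; }

    std::size_t write_some(
        bool more,
        char const* data,
        std::size_t size,
        std::error_code& ec)
    {
        ec.clear();
        std::size_t i = 0;
        for(; i < size && ! done_; ++i)
        {
            char const c = data[i];
            if(c == '[')
            {
                ++depth_;
            }
            else if(c == ']' && depth_ > 0)
            {
                if(--depth_ == 0)
                    done_ = true; // complete; later characters stay unread
            }
            else
            {
                ec = std::make_error_code(std::errc::invalid_argument);
                return i; // only the characters before the error count
            }
        }
        // The caller said no more input is coming, yet the value is open.
        if(! more && ! done_)
            ec = std::make_error_code(std::errc::message_size);
        return i;
    }
};

// Feeds the buffers in order, passing more = false only for the last.
inline std::size_t feed_all(
    toy_parser& p,
    std::vector<std::string> const& bufs,
    std::error_code& ec)
{
    std::size_t total = 0;
    for(std::size_t k = 0; k < bufs.size(); ++k)
    {
        bool const more = k + 1 < bufs.size();
        total += p.write_some(
            more, bufs[k].data(), bufs[k].size(), ec);
        if(ec || p.done())
            break;
    }
    return total;
}
```

Because the return value may be smaller than `size`, a caller that cares about trailing input compares the two after a successful call, as the sketch's `total` allows.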
687 @par Exception Safety
689 Calls to the handler may throw.
690 Upon error or exception, subsequent calls will
691 fail until @ref reset is called to parse a new JSON.
693 @return The number of characters successfully
694 parsed, which may be smaller than `size`.
696 @param more `true` if there are possibly more
697 buffers in the current JSON, otherwise `false`.
699 @param data A pointer to a buffer of `size`
702 @param size The number of characters pointed to
705 @param ec Set to the error, if any occurred.
720 std::error_code& ec);