# RPC protocol

## Data encoding

All integral data is encoded in little endian format.

## Protocol negotiation

Negotiation works by exchanging negotiation frames immediately after the connection is established. The negotiation frame format is:

    uint8_t magic[8] = SSTARRPC
    uint32_t len
    uint8_t data[len]

The negotiation frame data is itself composed of multiple records, one for each feature number present. Feature numbers begin at zero and will be defined by later versions of this document.

    struct negotiation_frame_feature_record {
        uint32_t feature_number;
        uint32_t len;
        uint8_t data[len];
    }

A `negotiation_frame_feature_record` signals that an optional feature is present in the client, and can contain additional feature-specific data. If the server declines an optional feature, it omits that feature number from its response.
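
The frame and record layouts above can be sketched in Python with the standard `struct` module. This is a minimal illustration rather than Seastar's implementation: the function names are hypothetical, and the `b"lz4"` payload in the usage note is a made-up value, since the compression feature data is opaque to the protocol.

```python
import struct
from typing import Dict

MAGIC = b"SSTARRPC"  # 8-byte magic from the negotiation frame format


def encode_negotiation_frame(features: Dict[int, bytes]) -> bytes:
    """Build magic + uint32 len + one feature record per entry (little endian)."""
    body = b"".join(
        struct.pack("<II", number, len(data)) + data
        for number, data in features.items()
    )
    return MAGIC + struct.pack("<I", len(body)) + body


def decode_negotiation_frame(frame: bytes) -> Dict[int, bytes]:
    """Check the magic and split the frame data into {feature_number: data}."""
    if frame[:8] != MAGIC:
        raise ValueError("bad magic, the peer should disconnect")
    (length,) = struct.unpack_from("<I", frame, 8)
    body = frame[12:12 + length]
    if len(body) != length:
        raise ValueError("truncated negotiation frame")
    features = {}
    offset = 0
    while offset < length:
        if length - offset < 8:
            raise ValueError("truncated feature record header")
        number, size = struct.unpack_from("<II", body, offset)
        offset += 8
        if offset + size > length:
            raise ValueError("truncated feature record data")
        features[number] = body[offset:offset + size]
        offset += size
    return features
```

For example, `encode_negotiation_frame({0: b"lz4", 1: b""})` offers compression and timeout propagation, and `decode_negotiation_frame` on the peer recovers the same mapping.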

Actual negotiation looks like this:

         Client                                  Server
    --------------------------------------------------------------------------------------------------
    send negotiation frame
                                        recv frame
                                        check magic (disconnect if magic is not SSTARRPC)
                                        send negotiation frame back
    recv frame
    check magic (disconnect if magic is not SSTARRPC)

### Supported features

#### Compression
    feature_number: 0
    data          : opaque data that is passed to a compressor factory
                    provided by an application. Compressor factory is
                    responsible for negotiation of compression algorithm.

    If compression is negotiated, request and response frames are encapsulated in a compressed frame.

#### Timeout propagation
    feature_number: 1
    data          : none

    If timeout propagation is negotiated, the request frame has an additional 8 bytes that hold the
    timeout value for the request in milliseconds. A zero value means that no timeout was specified.
    If a timeout is specified and the server cannot handle the request within that time, it may choose
    not to send the reply back (sending it back will not be an error either).

#### Connection ID
    feature_number: 2
    uint64_t connection_id : RPC connection ID

    The server assigns a unique connection ID to each connection and sends it to the client using
    this feature.

#### Stream parent
    feature_number: 3
    uint64_t connection_id : RPC connection ID representing a parent of the stream

    If this feature is present, the connection is not a regular RPC connection
    but a stream connection. If the parent connection is closed or aborted, all streams belonging
    to it will be closed as well.

    A stream connection is a connection that allows a bidirectional flow of bytes which may carry one or
    more messages in each direction. A stream connection should be explicitly closed by both the client
    and the server. Closing is done by sending a special EOS frame (described below).


#### Isolation
    feature_number: 4
    uint32_t isolation_cookie_len
    uint8_t isolation_cookie[isolation_cookie_len]

    The `isolation_cookie` field is used by the server to select a
    `seastar::scheduling_group` (or equivalent in another implementation) that
    will run this connection. In the future it will also be used for rpc buffer
    isolation, to avoid rpc traffic in one isolation group from starving another.

    The server does not directly assign meaning to values of `isolation_cookie`;
    instead, the interpretation is left to user code.
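
The feature data payloads for Connection ID, Stream parent and Isolation can be sketched as below. This is an illustrative sketch under the layouts given in this section; the helper names are hypothetical and the cookie value in the usage note is made up.

```python
import struct


def connection_id_data(connection_id: int) -> bytes:
    """Feature 2 and feature 3 both carry a single little-endian uint64_t."""
    return struct.pack("<Q", connection_id)


def isolation_data(cookie: bytes) -> bytes:
    """Feature 4: uint32_t isolation_cookie_len followed by the cookie bytes."""
    return struct.pack("<I", len(cookie)) + cookie


def parse_isolation_data(data: bytes) -> bytes:
    """Recover the cookie; its meaning is left to user code."""
    (length,) = struct.unpack_from("<I", data, 0)
    cookie = data[4:4 + length]
    if len(cookie) != length:
        raise ValueError("truncated isolation cookie")
    return cookie
```

A client could, for instance, send `isolation_data(b"batch")` as the data of feature 4, and the server would map the recovered cookie to a scheduling group of its choosing.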

##### Compressed frame format
    uint32_t len
    uint8_t compressed_data[len]

    After compressed_data is uncompressed, it becomes a regular request, response or streaming frame.
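
A compressed frame is a length prefix followed by the compressor's output. The sketch below uses Python's `zlib` purely as a stand-in: the actual algorithm is whatever the application's compressor factory negotiates, and the function names are hypothetical.

```python
import struct
import zlib


def wrap_compressed(frame: bytes) -> bytes:
    """Compress a regular frame and prefix it with uint32_t len (little endian)."""
    compressed = zlib.compress(frame)  # assumption: zlib stands in for the negotiated compressor
    return struct.pack("<I", len(compressed)) + compressed


def unwrap_compressed(buf: bytes) -> bytes:
    """Read len, then decompress compressed_data back into a regular frame."""
    (length,) = struct.unpack_from("<I", buf, 0)
    payload = buf[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated compressed frame")
    return zlib.decompress(payload)
```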

## Request frame format
    uint64_t timeout_in_ms - only present if timeout propagation is negotiated
    uint64_t verb_type
    int64_t msg_id
    uint32_t len
    uint8_t data[len]

msg_id has to be positive and may never be reused.
data is opaque to the protocol and is serialized/deserialized by the user.
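
As a sketch of the request layout, the function below packs a request with or without the timeout field. The function and parameter names are hypothetical; whether the timeout field is present depends only on what was negotiated, not on the value itself.

```python
import struct


def encode_request(verb_type: int, msg_id: int, data: bytes,
                   timeout_negotiated: bool = False, timeout_ms: int = 0) -> bytes:
    """Pack verb_type, msg_id, len and data, with an optional leading timeout."""
    if msg_id <= 0:
        raise ValueError("msg_id has to be positive")
    frame = struct.pack("<QqI", verb_type, msg_id, len(data)) + data
    if timeout_negotiated:
        # The 8-byte timeout is always present once negotiated; 0 means "not specified".
        frame = struct.pack("<Q", timeout_ms) + frame
    return frame
```

With a 3-byte payload the frame is 23 bytes without timeout propagation and 31 bytes with it.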

## Response frame format
    int64_t msg_id
    uint32_t len
    uint8_t data[len]

If msg_id < 0, the enclosed response contains an exception that came as a response to the request with msg_id abs(msg_id).
data is opaque to the protocol and is serialized/deserialized by the user.
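
The sign convention on msg_id can be illustrated with a small decoder. This is a sketch with a hypothetical function name; it only splits the frame and leaves interpretation of `data` to the caller.

```python
import struct
from typing import Tuple


def decode_response(frame: bytes) -> Tuple[int, bool, bytes]:
    """Return (request msg_id, is_exception, data) for one response frame."""
    msg_id, length = struct.unpack_from("<qI", frame, 0)  # 8 + 4 header bytes
    data = frame[12:12 + length]
    if len(data) != length:
        raise ValueError("truncated response frame")
    # A negative msg_id marks an exception reply to request abs(msg_id).
    return abs(msg_id), msg_id < 0, data
```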

## Stream frame format
    uint32_t len
    uint8_t data[len]

len == 0xffffffff signals end of stream.
data is opaque to the protocol and is serialized/deserialized by the user.
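
A stream can be sketched as a sequence of length-prefixed frames terminated by the EOS marker. The names below are hypothetical, and the sketch assumes the EOS frame carries no data after its length field, which this document does not state explicitly.

```python
import struct
from typing import List, Tuple

EOS_LEN = 0xFFFFFFFF  # len value that signals end of stream


def encode_stream_frame(data: bytes) -> bytes:
    """Prefix data with its uint32_t length (little endian)."""
    if len(data) >= EOS_LEN:
        raise ValueError("payload length collides with the EOS marker")
    return struct.pack("<I", len(data)) + data


def encode_eos() -> bytes:
    """EOS frame: assumed to be just the len field set to 0xffffffff."""
    return struct.pack("<I", EOS_LEN)


def decode_stream(buf: bytes) -> Tuple[List[bytes], bool]:
    """Return the messages found in buf and whether EOS was seen."""
    messages = []
    offset = 0
    while offset + 4 <= len(buf):
        (length,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        if length == EOS_LEN:
            return messages, True
        if offset + length > len(buf):
            raise ValueError("truncated stream frame")
        messages.append(buf[offset:offset + length])
        offset += length
    return messages, False
```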

## Exception encoding
    uint32_t type
    uint32_t len
    uint8_t data[len]

### Known exception types
    USER = 0
    UNKNOWN_VERB = 1

#### USER exception encoding

    uint32_t len
    char[len]

This exception is sent as a reply if an rpc handler throws an exception.
It is delivered to the caller as std::runtime_error(char[len]).

#### UNKNOWN_VERB exception encoding

    uint64_t verb_id

This exception is sent as a response to a request with an unknown verb_id. The verb ID is passed back as part of the exception payload.
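
The exception layouts above can be sketched as follows. This reads the `len` and `char[len]` of the USER encoding as the `len` and `data` fields of the generic exception encoding, which is an assumption; the document leaves this open. The function names and the mapping to Python exception classes are illustrative only.

```python
import struct

USER = 0
UNKNOWN_VERB = 1


def encode_exception(ex_type: int, data: bytes) -> bytes:
    """Generic layout: uint32_t type, uint32_t len, data[len]."""
    return struct.pack("<II", ex_type, len(data)) + data


def encode_unknown_verb(verb_id: int) -> bytes:
    """UNKNOWN_VERB payload is the offending uint64_t verb_id."""
    return encode_exception(UNKNOWN_VERB, struct.pack("<Q", verb_id))


def decode_exception(buf: bytes) -> Exception:
    """Turn an encoded exception into a Python exception object."""
    ex_type, length = struct.unpack_from("<II", buf, 0)
    data = buf[8:8 + length]
    if len(data) != length:
        raise ValueError("truncated exception")
    if ex_type == USER:
        # Assumption: data holds the message text directly.
        return RuntimeError(data.decode("utf-8", errors="replace"))
    if ex_type == UNKNOWN_VERB:
        (verb_id,) = struct.unpack("<Q", data)
        return LookupError("unknown verb %d" % verb_id)
    return RuntimeError("unrecognized exception type %d" % ex_type)
```

The encoded bytes would travel as the `data` of a response frame whose msg_id is negative.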

## More formal protocol description

    request_stream = negotiation_frame, { request | compressed_request }
    request = verb_type, msg_id, len, { byte }*len
    compressed_request = len, { byte }*len
    response_stream = negotiation_frame, { response | compressed_response }
    response = reply | exception
    compressed_response = len, { byte }*len
    streaming_stream = negotiation_frame, { streaming_frame | compressed_streaming_frame }
    streaming_frame = len, { byte }*len
    compressed_streaming_frame = len, { byte }*len
    reply = msg_id, len, { byte }*len
    exception = exception_header, serialized_exception
    exception_header = -msg_id, len
    serialized_exception = (user|unknown_verb)
    user = len, {byte}*len
    unknown_verb = verb_type
    verb_type = uint64_t
    msg_id = int64_t
    len = uint32_t
    byte = uint8_t
    negotiation_frame = 'SSTARRPC' len32(negotiation_frame_data) negotiation_frame_data
    negotiation_frame_data = negotiation_frame_feature_record*
    negotiation_frame_feature_record = feature_number len {byte}*len
    feature_number = uint32_t

Note that replies can arrive in a different order from requests, and some requests may not have a reply at all.