## Building a Single-Threaded Web Server

We’ll start by getting a single-threaded web server working. Before we begin,
let’s look at a quick overview of the protocols involved in building web
servers. The details of these protocols are beyond the scope of this book, but
a brief overview will give you the information you need.

The two main protocols involved in web servers are the *Hypertext Transfer
Protocol* *(HTTP)* and the *Transmission Control Protocol* *(TCP)*. Both
protocols are *request-response* protocols, meaning a *client* initiates
requests and a *server* listens to the requests and provides a response to the
client. The contents of those requests and responses are defined by the
protocols.

TCP is the lower-level protocol that describes the details of how information
gets from one server to another but doesn’t specify what that information is.
HTTP builds on top of TCP by defining the contents of the requests and
responses. It’s technically possible to use HTTP with other protocols, but in
the vast majority of cases, HTTP sends its data over TCP. We’ll work with the
raw bytes of TCP and HTTP requests and responses.

### Listening to the TCP Connection

Our web server needs to listen to a TCP connection, so that’s the first part
we’ll work on. The standard library offers a `std::net` module that lets us do
this. Let’s make a new project in the usual fashion:

```console
$ cargo new hello
     Created binary (application) `hello` project
$ cd hello
```

Now enter the code in Listing 20-1 in *src/main.rs* to start. This code will
listen at the address `127.0.0.1:7878` for incoming TCP streams. When it gets
an incoming stream, it will print `Connection established!`.

<span class="filename">Filename: src/main.rs</span>

```rust,no_run
{{#rustdoc_include ../listings/ch20-web-server/listing-20-01/src/main.rs}}
```

<span class="caption">Listing 20-1: Listening for incoming streams and printing
a message when we receive a stream</span>

Using `TcpListener`, we can listen for TCP connections at the address
`127.0.0.1:7878`. In the address, the section before the colon is an IP address
representing your computer (this is the same on every computer and doesn’t
represent the authors’ computer specifically), and `7878` is the port. We’ve
chosen this port for two reasons: HTTP isn’t normally accepted on this port, and
7878 is *rust* typed on a telephone.

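
To make the two halves of the address concrete, here is a minimal sketch that parses the same string with the standard library’s `SocketAddr` type. The `split_address` helper is a hypothetical name introduced only for this illustration; it is not part of the listing.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical helper: split an "ip:port" string into its two parts.
/// Returns `None` if the string is not a valid socket address.
fn split_address(address: &str) -> Option<(IpAddr, u16)> {
    let parsed: SocketAddr = address.parse().ok()?;
    Some((parsed.ip(), parsed.port()))
}

fn main() {
    match split_address("127.0.0.1:7878") {
        // 127.0.0.1 is the loopback address, which always refers to the local machine.
        Some((ip, port)) => {
            println!("ip = {ip}, loopback = {}, port = {port}", ip.is_loopback())
        }
        None => println!("not a valid address"),
    }
}
```
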
The `bind` function in this scenario works like the `new` function in that it
will return a new `TcpListener` instance. The reason the function is called
`bind` is that in networking, connecting to a port to listen to is known as
“binding to a port.”

The `bind` function returns a `Result<T, E>`, which indicates that binding
might fail. For example, binding to port 80 requires administrator
privileges (nonadministrators can listen only on ports higher than 1023), so if
we tried to bind to port 80 without being an administrator, binding wouldn’t
work. As another example, binding wouldn’t work if we ran two instances of our
program and so had two programs listening to the same port. Because we’re
writing a basic server just for learning purposes, we won’t worry about
handling these kinds of errors; instead, we use `unwrap` to stop the program if
errors happen.

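
The chapter deliberately uses `unwrap`, but for comparison, here is a rough sketch of what handling the `Result` from `bind` could look like. The `try_bind` helper and its error message are assumptions made for this example, not code from the listings.

```rust
use std::net::TcpListener;

/// Hypothetical helper: bind to `addr`, turning a failure into a readable
/// message instead of panicking.
fn try_bind(addr: &str) -> Result<TcpListener, String> {
    TcpListener::bind(addr).map_err(|e| format!("could not bind to {addr}: {e}"))
}

fn main() {
    match try_bind("127.0.0.1:7878") {
        Ok(listener) => match listener.local_addr() {
            Ok(addr) => println!("listening on {addr}"),
            Err(e) => eprintln!("bound, but could not read the local address: {e}"),
        },
        // Reached, for example, when another program already listens on the port.
        Err(message) => eprintln!("{message}"),
    }
}
```

Running two copies of this program at once should make the second one print the error message rather than crash, which is the “two instances” failure described above.
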
The `incoming` method on `TcpListener` returns an iterator that gives us a
sequence of streams (more specifically, streams of type `TcpStream`). A single
*stream* represents an open connection between the client and the server. A
*connection* is the name for the full request and response process in which a
client connects to the server, the server generates a response, and the server
closes the connection. As such, we’ll read from the `TcpStream` to see what
the client sent and then write our response to the stream. Overall,
this `for` loop will process each connection in turn and produce a series of
streams for us to handle.

For now, our handling of the stream consists of calling `unwrap` to terminate
our program if the stream has any errors; if there aren’t any errors, the
program prints a message. We’ll add more functionality for the success case in
the next listing. The reason we might receive errors from the `incoming` method
when a client connects to the server is that we’re not actually iterating over
connections. Instead, we’re iterating over *connection attempts*. The
connection might not be successful for a number of reasons, many of them
operating system specific. For example, many operating systems have a limit to
the number of simultaneous open connections they can support; new connection
attempts beyond that number will produce an error until some of the open
connections are closed.

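
Because each item from `incoming` is a connection *attempt*, it arrives as a `Result`. The sketch below shows one possible way to log a failed attempt and keep going instead of calling `unwrap`. The `count_successful` helper is hypothetical, and the example binds to port 0 (any free port) and connects to itself only so that it can run on its own and terminate.

```rust
use std::net::{TcpListener, TcpStream};

/// Hypothetical helper: look at the next `limit` connection attempts and
/// return how many succeeded, logging failures instead of panicking.
fn count_successful(listener: &TcpListener, limit: usize) -> usize {
    let mut successes = 0;
    for attempt in listener.incoming().take(limit) {
        match attempt {
            Ok(_stream) => successes += 1,
            Err(e) => eprintln!("connection attempt failed: {e}"),
        }
    }
    successes
}

fn main() -> std::io::Result<()> {
    // Port 0 asks the operating system for any free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Two client connections; they wait in the listener's queue until accepted.
    let _first = TcpStream::connect(addr)?;
    let _second = TcpStream::connect(addr)?;

    println!("{} connection(s) established", count_successful(&listener, 2));
    Ok(())
}
```
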
Let’s try running this code! Invoke `cargo run` in the terminal and then load
*127.0.0.1:7878* in a web browser. The browser should show an error message
like “Connection reset,” because the server isn’t currently sending back any
data. But when you look at your terminal, you should see several messages that
were printed when the browser connected to the server!

```text
     Running `target/debug/hello`
Connection established!
Connection established!
Connection established!
```

Sometimes, you’ll see multiple messages printed for one browser request; the
reason might be that the browser is making a request for the page as well as a
request for other resources, like the *favicon.ico* icon that appears in the
browser tab.

It could also be that the browser is trying to connect to the server multiple
times because the server isn’t responding with any data. When `stream` goes out
of scope and is dropped at the end of the loop, the connection is closed as
part of the `drop` implementation. Browsers sometimes deal with closed
connections by retrying, because the problem might be temporary. The important
factor is that we’ve successfully gotten a handle to a TCP connection!

Remember to stop the program by pressing <span class="keystroke">ctrl-c</span>
when you’re done running a particular version of the code. Then restart `cargo
run` after you’ve made each set of code changes to make sure you’re running the
newest code.

### Reading the Request

Let’s implement the functionality to read the request from the browser! To
separate the concerns of first getting a connection and then taking some action
with the connection, we’ll start a new function for processing connections. In
this new `handle_connection` function, we’ll read data from the TCP stream and
print it so we can see the data being sent from the browser. Change the code to
look like Listing 20-2.

<span class="filename">Filename: src/main.rs</span>

```rust,no_run
{{#rustdoc_include ../listings/ch20-web-server/listing-20-02/src/main.rs}}
```

<span class="caption">Listing 20-2: Reading from the `TcpStream` and printing
the data</span>

We bring `std::io::prelude` into scope to get access to certain traits that let
us read from and write to the stream. In the `for` loop in the `main` function,
instead of printing a message that says we made a connection, we now call the
new `handle_connection` function and pass the `stream` to it.

In the `handle_connection` function, we’ve made the `stream` parameter mutable.
The reason is that the `TcpStream` instance keeps track of what data it returns
to us internally. It might read more data than we asked for and save that data
for the next time we ask for data. It therefore needs to be `mut` because its
internal state might change; usually, we think of “reading” as not needing
mutation, but in this case we need the `mut` keyword.

Next, we need to actually read from the stream. We do this in two steps:
first, we declare a `buffer` on the stack to hold the data that is read in.
We’ve made the buffer 1024 bytes in size, which is big enough to hold the
data of a basic request and sufficient for our purposes in this chapter. If
we wanted to handle requests of an arbitrary size, buffer management would
need to be more complicated; we’ll keep it simple for now. We pass the buffer
to `stream.read`, which will read bytes from the `TcpStream` and put them in
the buffer.

Second, we convert the bytes in the buffer to a string and print that string.
The `String::from_utf8_lossy` function takes a `&[u8]` and produces a string
from it (strictly speaking a `Cow<str>`, which prints just like a `String`).
The “lossy” part of the name indicates the behavior of this function
when it sees an invalid UTF-8 sequence: it will replace the invalid sequence
with `�`, the `U+FFFD REPLACEMENT CHARACTER`. You might see replacement
characters for characters in the buffer that aren’t filled by request data.

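
The unfilled tail of the buffer can be left out of the printed text by using the byte count that `read` returns. The following is a minimal sketch of that idea, not the code from Listing 20-2: `read_request` is a hypothetical helper, and it is generic over `Read` so that a plain byte slice can stand in for a `TcpStream`.

```rust
use std::io::{self, Read};

/// Hypothetical helper: read once into a 1024-byte buffer and convert only
/// the bytes that were actually filled, replacing invalid UTF-8 lossily.
fn read_request<R: Read>(stream: &mut R) -> io::Result<String> {
    let mut buffer = [0; 1024];
    // `read` reports how many bytes it placed at the start of the buffer.
    let bytes_read = stream.read(&mut buffer)?;
    Ok(String::from_utf8_lossy(&buffer[..bytes_read]).into_owned())
}

fn main() -> io::Result<()> {
    // A byte slice implements `Read`, so it can play the role of the stream here.
    let mut fake_stream: &[u8] = b"GET / HTTP/1.1\r\nHost: 127.0.0.1:7878\r\n\r\n";
    let request = read_request(&mut fake_stream)?;
    println!("Request: {request}");
    Ok(())
}
```

As in the chapter, a single `read` is only enough for small requests; anything larger than the buffer would need a loop.
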
Let’s try this code! Start the program and make a request in a web browser
again. Note that we’ll still get an error page in the browser, but our
program’s output in the terminal will now look similar to this:

```console
$ cargo run
   Compiling hello v0.1.0 (file:///projects/hello)
    Finished dev [unoptimized + debuginfo] target(s) in 0.42s
     Running `target/debug/hello`
Request: GET / HTTP/1.1
Host: 127.0.0.1:7878
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
Firefox/52.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
������������������������������������
```

Depending on your browser, you might get slightly different output. Now that
we’re printing the request data, we can see why we get multiple connections
from one browser request by looking at the path after `Request: GET`. If the
repeated connections are all requesting */*, we know the browser is trying to
fetch */* repeatedly because it’s not getting a response from our program.

Let’s break down this request data to understand what the browser is asking of
our program.

### A Closer Look at an HTTP Request

HTTP is a text-based protocol, and a request takes this format:

```text
Method Request-URI HTTP-Version CRLF
headers CRLF
message-body
```

The first line is the *request line* that holds information about what the
client is requesting. The first part of the request line indicates the *method*
being used, such as `GET` or `POST`, which describes how the client is making
this request. Our client used a `GET` request.

The next part of the request line is */*, which indicates the *Uniform Resource
Identifier* *(URI)* the client is requesting: a URI is almost, but not quite,
the same as a *Uniform Resource Locator* *(URL)*. The difference between URIs
and URLs isn’t important for our purposes in this chapter, but the HTTP spec
uses the term URI, so we can just mentally substitute URL for URI here.

The last part is the HTTP version the client uses, and then the request line
ends in a *CRLF sequence*. (CRLF stands for *carriage return* and *line feed*,
which are terms from the typewriter days!) The CRLF sequence can also be
written as `\r\n`, where `\r` is a carriage return and `\n` is a line feed. The
CRLF sequence separates the request line from the rest of the request data.
Note that when the CRLF is printed, we see a new line start rather than `\r\n`.

Looking at the request line data we received from running our program so far,
we see that `GET` is the method, */* is the request URI, and `HTTP/1.1` is the
version.

After the request line, the remaining lines starting from `Host:` onward are
headers. `GET` requests have no body.

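
The three parts of the request line can be pulled apart with ordinary string methods. This is only an illustrative sketch; `parse_request_line` is a hypothetical helper that the chapter’s server does not use, and it does nothing beyond splitting on whitespace.

```rust
/// Hypothetical helper: split a request line into (method, URI, version).
/// Returns `None` unless the line has exactly three whitespace-separated parts.
fn parse_request_line(line: &str) -> Option<(&str, &str, &str)> {
    let mut parts = line.split_whitespace();
    let method = parts.next()?;
    let uri = parts.next()?;
    let version = parts.next()?;
    if parts.next().is_some() {
        return None;
    }
    Some((method, uri, version))
}

fn main() {
    let request = "GET / HTTP/1.1\r\nHost: 127.0.0.1:7878\r\n\r\n";
    // The request line is everything before the first CRLF.
    let request_line = request.split("\r\n").next().unwrap_or("");
    match parse_request_line(request_line) {
        Some((method, uri, version)) => {
            println!("method = {method}, uri = {uri}, version = {version}")
        }
        None => println!("malformed request line"),
    }
}
```
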
Try making a request from a different browser or asking for a different
address, such as *127.0.0.1:7878/test*, to see how the request data changes.

Now that we know what the browser is asking for, let’s send back some data!

### Writing a Response

Now we’ll implement sending data in response to a client request. Responses
have the following format:

```text
HTTP-Version Status-Code Reason-Phrase CRLF
headers CRLF
message-body
```

The first line is a *status line* that contains the HTTP version used in the
response, a numeric status code that summarizes the result of the request, and
a reason phrase that provides a text description of the status code. After the
CRLF sequence are any headers, another CRLF sequence, and the body of the
response.

Here is an example response that uses HTTP version 1.1, has a status code of
200, an OK reason phrase, no headers, and no body:

```text
HTTP/1.1 200 OK\r\n\r\n
```

The status code 200 is the standard success response. The text is a tiny
successful HTTP response. Let’s write this to the stream as our response to a
successful request! From the `handle_connection` function, remove the
`println!` that was printing the request data and replace it with the code in
Listing 20-3.

<span class="filename">Filename: src/main.rs</span>

```rust,no_run
{{#rustdoc_include ../listings/ch20-web-server/listing-20-03/src/main.rs:here}}
```

<span class="caption">Listing 20-3: Writing a tiny successful HTTP response to
the stream</span>

The first new line defines the `response` variable that holds the success
message’s data. Then we call `as_bytes` on our `response` to convert the string
data to bytes. The `write` method on `stream` takes a `&[u8]` and sends those
bytes directly down the connection.

Because the `write` operation could fail, we use `unwrap` on any error result
as before. Again, in a real application you would add error handling here.
Finally, `flush` will wait and prevent the program from continuing until all
the bytes are written to the connection; `TcpStream` contains an internal
buffer to minimize calls to the underlying operating system.

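
One detail worth knowing is that `Write::write` returns the number of bytes it accepted, which may be fewer than the slice it was given. A common alternative is `write_all`, which keeps writing until the whole slice is sent or an error occurs. The sketch below shows that variant with the error propagated instead of unwrapped; `send_response` is a hypothetical helper, and a `Vec<u8>` stands in for the `TcpStream` so the example runs without a network.

```rust
use std::io::{self, Write};

/// Hypothetical helper: send the whole response, then flush the writer.
fn send_response<W: Write>(stream: &mut W, response: &str) -> io::Result<()> {
    // `write_all` retries until every byte has been written or an error occurs.
    stream.write_all(response.as_bytes())?;
    stream.flush()
}

fn main() -> io::Result<()> {
    // A `Vec<u8>` implements `Write`, so it can play the role of the stream here.
    let mut fake_stream: Vec<u8> = Vec::new();
    send_response(&mut fake_stream, "HTTP/1.1 200 OK\r\n\r\n")?;
    println!("{:?}", String::from_utf8_lossy(&fake_stream));
    Ok(())
}
```
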
With these changes, let’s run our code and make a request. We’re no longer
printing any data to the terminal, so we won’t see any output other than the
output from Cargo. When you load *127.0.0.1:7878* in a web browser, you should
get a blank page instead of an error. You’ve just hand-coded receiving an HTTP
request and sending a response!

### Returning Real HTML

Let’s implement the functionality for returning more than a blank page. Create
a new file, *hello.html*, in the root of your project directory, not in the
*src* directory. You can input any HTML you want; Listing 20-4 shows one
possibility.

<span class="filename">Filename: hello.html</span>

```html
{{#include ../listings/ch20-web-server/listing-20-04/hello.html}}
```

<span class="caption">Listing 20-4: A sample HTML file to return in a
response</span>

This is a minimal HTML5 document with a heading and some text. To return this
from the server when a request is received, we’ll modify `handle_connection` as
shown in Listing 20-5 to read the HTML file, add it to the response as a body,
and send it.

<span class="filename">Filename: src/main.rs</span>

```rust,no_run
{{#rustdoc_include ../listings/ch20-web-server/listing-20-05/src/main.rs:here}}
```

<span class="caption">Listing 20-5: Sending the contents of *hello.html* as the
body of the response</span>

We’ve added a line at the top to bring the standard library’s filesystem module
into scope. The code for reading the contents of a file to a string should look
familiar; we used it in Chapter 12 when we read the contents of a file for our
I/O project in Listing 12-4.

Next, we use `format!` to add the file’s contents as the body of the success
response. To ensure a valid HTTP response, we add the `Content-Length` header,
which is set to the size of our response body in bytes, in this case the size
of *hello.html*.

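
As a rough illustration of how the status line, the `Content-Length` header, and the body fit together, here is a small sketch. The `build_response` helper is a hypothetical name for this example, and the body is an inline string rather than the contents of *hello.html*.

```rust
/// Hypothetical helper: assemble a response from a status line and a body.
fn build_response(status_line: &str, contents: &str) -> String {
    // `str::len` counts bytes, which is the unit `Content-Length` expects.
    format!(
        "{status_line}\r\nContent-Length: {}\r\n\r\n{contents}",
        contents.len()
    )
}

fn main() {
    let body = "<h1>Hello!</h1>";
    let response = build_response("HTTP/1.1 200 OK", body);
    // Debug formatting makes the CRLF sequences visible as \r\n.
    println!("{response:?}");
}
```

The empty line produced by the second `\r\n` is what separates the headers from the body.
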
Run this code with `cargo run` and load *127.0.0.1:7878* in your browser; you
should see your HTML rendered!

Currently, we’re ignoring the request data in `buffer` and just sending back
the contents of the HTML file unconditionally. That means if you try requesting
*127.0.0.1:7878/something-else* in your browser, you’ll still get back this
same HTML response. Our server is very limited and doesn’t do what most web
servers do. We want to customize our responses depending on the request and
only send back the HTML file for a well-formed request to */*.

### Validating the Request and Selectively Responding

Right now, our web server will return the HTML in the file no matter what the
client requested. Let’s add functionality to check that the browser is
requesting */* before returning the HTML file and return an error if the
browser requests anything else. For this we need to modify `handle_connection`,
as shown in Listing 20-6. This new code checks the content of the request
received against what we know a request for */* looks like and adds `if` and
`else` blocks to treat requests differently.

<span class="filename">Filename: src/main.rs</span>

```rust,no_run
{{#rustdoc_include ../listings/ch20-web-server/listing-20-06/src/main.rs:here}}
```

<span class="caption">Listing 20-6: Matching the request and handling requests
to */* differently from other requests</span>

First, we hardcode the data corresponding to the */* request into the `get`
variable. Because we’re reading raw bytes into the buffer, we transform `get`
into a byte string by adding the `b""` byte string syntax at the start of the
content data. Then we check whether `buffer` starts with the bytes in `get`. If
it does, it means we’ve received a well-formed request to */*, which is the
success case we’ll handle in the `if` block that returns the contents of our
HTML file.

If `buffer` does *not* start with the bytes in `get`, it means we’ve received
some other request. We’ll add code to the `else` block in a moment to respond
to all other requests.

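
The prefix check can be tried in isolation. In the sketch below, the exact bytes stored in `get` are an assumption based on the request line printed earlier in this section, and `is_root_request` is a hypothetical helper rather than code from Listing 20-6.

```rust
/// Hypothetical helper: does the buffer begin with a `GET` request for `/`?
fn is_root_request(buffer: &[u8]) -> bool {
    // Assumed request line for `/`; the trailing CRLF keeps a line such as
    // `GET / HTTP/1.10` from matching by accident.
    let get = b"GET / HTTP/1.1\r\n";
    buffer.starts_with(get)
}

fn main() {
    // Simulate a 1024-byte buffer that is only partly filled by the request.
    let request = b"GET / HTTP/1.1\r\nHost: 127.0.0.1:7878\r\n\r\n";
    let mut buffer = [0u8; 1024];
    buffer[..request.len()].copy_from_slice(request);

    println!("root request: {}", is_root_request(&buffer));
    println!(
        "other request: {}",
        is_root_request(b"GET /something-else HTTP/1.1\r\n")
    );
}
```

Because only the start of the buffer is compared, the headers and the unfilled remainder of the buffer do not affect the result.
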
Run this code now and request *127.0.0.1:7878*; you should get the HTML in
*hello.html*. If you make any other request, such as
*127.0.0.1:7878/something-else*, you’ll get a connection error like those you
saw when running the code in Listing 20-1 and Listing 20-2.

Now let’s add the code in Listing 20-7 to the `else` block to return a response
with the status code 404, which signals that the content for the request was
not found. We’ll also return some HTML for a page to render in the browser
indicating the response to the end user.

<span class="filename">Filename: src/main.rs</span>

```rust,no_run
{{#rustdoc_include ../listings/ch20-web-server/listing-20-07/src/main.rs:here}}
```

<span class="caption">Listing 20-7: Responding with status code 404 and an
error page if anything other than */* was requested</span>

Here, our response has a status line with status code 404 and the reason
phrase `NOT FOUND`. We’re still not returning headers, and the body of the
response will be the HTML in the file *404.html*. You’ll need to create a
*404.html* file next to *hello.html* for the error page; again feel free to use
any HTML you want or use the example HTML in Listing 20-8.

<span class="filename">Filename: 404.html</span>

```html
{{#include ../listings/ch20-web-server/listing-20-08/404.html}}
```

<span class="caption">Listing 20-8: Sample content for the page to send back
with any 404 response</span>

With these changes, run your server again. Requesting *127.0.0.1:7878*
should return the contents of *hello.html*, and any other request, like
*127.0.0.1:7878/foo*, should return the error HTML from *404.html*.

### A Touch of Refactoring

At the moment the `if` and `else` blocks have a lot of repetition: they’re both
reading files and writing the contents of the files to the stream. The only
differences are the status line and the filename. Let’s make the code more
concise by pulling out those differences into separate `if` and `else` lines
that will assign the values of the status line and the filename to variables;
we can then use those variables unconditionally in the code to read the file
and write the response. Listing 20-9 shows the resulting code after replacing
the large `if` and `else` blocks.

<span class="filename">Filename: src/main.rs</span>

```rust,no_run
{{#rustdoc_include ../listings/ch20-web-server/listing-20-09/src/main.rs:here}}
```

<span class="caption">Listing 20-9: Refactoring the `if` and `else` blocks to
contain only the code that differs between the two cases</span>

Now the `if` and `else` blocks only return the appropriate values for the
status line and filename in a tuple; we then use destructuring to assign these
two values to `status_line` and `filename` using a pattern in the `let`
statement, as discussed in Chapter 18.

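
The tuple-and-destructuring pattern can be seen on its own in the following sketch. It is not the code from Listing 20-9: `choose_response` is a hypothetical function, the request bytes are an assumed form of the request line for */*, and the file is not actually read here.

```rust
/// Hypothetical helper: pick the status line and filename for a request.
fn choose_response(buffer: &[u8]) -> (&'static str, &'static str) {
    // Only the values that differ between the two cases live in the branches.
    if buffer.starts_with(b"GET / HTTP/1.1\r\n") {
        ("HTTP/1.1 200 OK", "hello.html")
    } else {
        ("HTTP/1.1 404 NOT FOUND", "404.html")
    }
}

fn main() {
    // Destructure the returned tuple with a pattern in the `let` statement.
    let (status_line, filename) = choose_response(b"GET /foo HTTP/1.1\r\n");
    println!("{status_line} -> {filename}");
}
```
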
The previously duplicated code is now outside the `if` and `else` blocks and
uses the `status_line` and `filename` variables. This makes it easier to see
the difference between the two cases, and it means we have only one place to
update the code if we want to change how the file reading and response writing
work. The behavior of the code in Listing 20-9 will be the same as that of the
code it replaces.

Awesome! We now have a simple web server in approximately 40 lines of Rust code
that responds to one request with a page of content and responds to all other
requests with a 404 response.

Currently, our server runs in a single thread, meaning it can only serve one
request at a time. Let’s examine how that can be a problem by simulating some
slow requests. Then we’ll fix it so our server can handle multiple requests at
once.