update source to Ceph Pacific 16.2.2

[ceph.git] / ceph / src / seastar / doc / tutorial.md
diff --git a/ceph/src/seastar/doc/tutorial.md b/ceph/src/seastar/doc/tutorial.md

index 8bb366a6219e7405c2a84e955af90de2970f640f..b9a2933a1d9bbc02d0f559348b13f82afceb99a3 100644 (file)
--- a/ceph/src/seastar/doc/tutorial.md
+++ b/ceph/src/seastar/doc/tutorial.md
@@ -333,7 +333,7 @@ seastar::future<> f() {
  }
  ```
  
-The function `slow()` deserves more explanation. As usual, this function returns a future<int> immediately, and doesn't wait for the sleep to complete, and the code in `f()` can chain a continuation to this future's completion. The future returned by `slow()` is itself a chain of futures: It will become ready once sleep's future becomes ready and then the value 3 is returned. We'll explain below in more details how `then()` returns a future, and how this allows *chaining* futures.
+The function `slow()` deserves more explanation. As usual, this function returns a `future<int>` immediately, and doesn't wait for the sleep to complete, and the code in `f()` can chain a continuation to this future's completion. The future returned by `slow()` is itself a chain of futures: It will become ready once sleep's future becomes ready and then the value 3 is returned. We'll explain below in more details how `then()` returns a future, and how this allows *chaining* futures.
  
  This example begins to show the convenience of the futures programming model, which allows the programmer to neatly encapsulate complex asynchronous operations. slow() might involve a complex asynchronous operation requiring multiple steps, but its user can use it just as easily as a simple sleep(), and Seastar's engine takes care of running the continuations whose futures have become ready at the right time.
  
@@ -398,7 +398,7 @@ this would have meant that the continuation would contain the address of `i`, no
  
  An exception to the capture-by-reference-is-usually-a-mistake rule is the `do_with()` idiom, which we will introduce later. This idiom ensures that an object lives throughout the life of the continuation, and makes capture-by-reference possible, and very convenient.
  
-Using capture-by-*move* in continuations is also very useful in Seastar applications. By **moving** an object into a continuation, we transfer ownership of this object to the continuation, and make it easy for the object to be automatically deleted when the continuation ends. For example, consider a traditional function taking a std::unique_ptr<T>.
+Using capture-by-*move* in continuations is also very useful in Seastar applications. By **moving** an object into a continuation, we transfer ownership of this object to the continuation, and make it easy for the object to be automatically deleted when the continuation ends. For example, consider a traditional function taking a `std::unique_ptr<T>`.
  ```cpp
  int do_something(std::unique_ptr<T> obj) {
       // do some computation based on the contents of obj, let's say the result is 17
@@ -430,6 +430,30 @@ The `[obj = ...]` capture syntax we used here is new to C++14. This is the main
  
  The extra `() mutable` syntax was needed here because by default when C++ captures a value (in this case, the value of std::move(obj)) into a lambda, it makes this value read-only, so our lambda cannot, in this example, move it again. Adding `mutable` removes this artificial restriction.
  
+## Evaluation order considerations (C++14 only)
+
+C++14 (and below) does *not* guarantee that lambda captures in continuations will be evaluated after the futures they relate to are evaluated
+(See https://en.cppreference.com/w/cpp/language/eval_order).
+
+Consequently, avoid the programming pattern below:
+```cpp
+    return do_something(obj).then([obj = std::move(obj)] () mutable {
+        return do_something_else(std::move(obj));
+    });
+```
+
+In the example above, `[obj = std::move(obj)]` might be evaluated before `do_something(obj)` is called, potentially leading to use-after-move of `obj`.
+
+To guarantee the desired evaluation order, the expression above may be broken into separate statments as follows:
+```cpp
+    auto fut = do_something(obj);
+    return fut.then([obj = std::move(obj)] () mutable {
+        return do_something_else(std::move(obj));
+    });
+```
+
+This was changed in C++17. The expression that creates the object the function `then` is called on (the future) is evaluated before all the arguments to the function, so this style is not required in C++17 and above.
+
  ## Chaining continuations
  TODO: We already saw chaining example in slow() above. talk about the return from then, and returning a future and chaining more thens.
  
@@ -521,14 +545,14 @@ Here, `fail()` catches the exception thrown by `inner()`, whatever it might be,
  > Therefore, unless a function --- including asynchronous functions --- is explicitly tagged "`noexcept`", the application should be prepared to handle exceptions thrown from it. In modern C++, code usually uses RAII to be exception-safe without sprinkling it with `try`/`catch`.  `seastar::defer()` is a RAII-based idiom that ensures that some cleanup code is run even if an exception is thrown.
  
  
-Seastar has a convenient generic function, `futurize_apply()`, which can be useful here. `futurize_apply(func, args...)` runs a function which may return either a future value or an immediate value, and in both cases convert the result into a future value. `futurize_apply()` also converts an immediate exception thrown by the function, if any, into a failed future, just like we did above. So using `futurize_apply()` we can make the above example work even if `fail()` did throw exceptions:
+Seastar has a convenient generic function, `futurize_invoke()`, which can be useful here. `futurize_invoke(func, args...)` runs a function which may return either a future value or an immediate value, and in both cases convert the result into a future value. `futurize_invoke()` also converts an immediate exception thrown by the function, if any, into a failed future, just like we did above. So using `futurize_invoke()` we can make the above example work even if `fail()` did throw exceptions:
  
  ```cpp
  seastar::future<> fail() {
      throw my_exception();
  }
  seastar::future<> f() {
-    return seastar::futurize_apply(fail).finally([] {
+    return seastar::futurize_invoke(fail).finally([] {
          std::cout << "cleaning up\n";
      });
  }
@@ -546,7 +570,7 @@ seastar::future<> f() {
  }
  ```
  
-Here, the lambda function of the first continuation does throw an exception instead of returning a failed future. However, we do _not_ have the same problem as before, which only happened because an asynchronous function threw an exception _before_ returning a valid future. Here, `f()` does return a valid future immediately - the failure will only be known later, after `sleep()` resolves. The message in `finally()` will be printed. Under the hood, the methods which attach continuations (such as `then()` and `finally()`) run the continuation functions using `futurize_apply()`, so continuation functions may return immediate values or, in this case, throw an immediate exception, and still work properly.
+Here, the lambda function of the first continuation does throw an exception instead of returning a failed future. However, we do _not_ have the same problem as before, which only happened because an asynchronous function threw an exception _before_ returning a valid future. Here, `f()` does return a valid future immediately - the failure will only be known later, after `sleep()` resolves. The message in `finally()` will be printed. The methods which attach continuations (such as `then()` and `finally()`) run the continuation the same way, so continuation functions may return immediate values or, in this case, throw an immediate exception, and still work properly.
  
  # Lifetime management
  An asynchronous function starts an operation which may continue long after the function returns: The function itself returns a `future<T>` almost immediately, but it may take a while until this future is resolved.
@@ -705,24 +729,13 @@ The fact that `file` objects can be copied very quickly and all copies actually
  seastar::future<uint64_t> slow_size(file f) {
      return seastar::sleep(10ms).then([f] {
          return f.size();
-    }); 
-    // NOTE: something is wrong here! This will be explained below!
+    });
  }
  ```
  
  Note how calling `slow_size` is as simple as calling `slow_size(f)`, passing a copy of `f`, without  needing to do anything special to ensure that `f` is only destroyed when no longer needed. That simply happens naturally when nothing refers to `f` any more.
  
-However, there is one complication. The above example is actually wrong, as the comment at the end of the function suggested. The problem is that the `f.size()` call started an asynchronous operation on `f` (the file's size may be stored on disk, so not immediately available) and yet at this point nothing is holding a copy of `f`... The method call does not increment the reference count of the object even if it an asynchronous method. (Perhaps this something we should rethink?)
-
-So we need to ensure that something does hold on to another copy of `f` until the asynchronous method call completes. This is how we typically do it:
-```cpp
-seastar::future<uint64_t> slow_size(file f) {
-    return seastar::sleep(10ms).then([f] {
-        return f.size().finally([f] {});
-    });
-}
-```
-What we see here is that `f` is copied not only to the continuation which runs `f.size()`, but also into a continuation (a `finally`) which will run after it. So as long as `f.size()` does not complete, that second continuation holds `f` alive. Note how the second continuation seems to have no code (just a {}).  But the important thing is that the compiler automatically adds to it code to destroy its copy of `f` (and potentially the entire file if this reference count went down to 0).
+You may wonder why `return f.size()` in the above example is safe:  Doesn't it start an asynchronous operation on `f` (the file's size may be stored on disk, so not immediately available), and `f` may be  immediately destroyed when we return and nothing keeps holding a copy of `f`? If `f` is really the last reference, that is indeed a bug, but there is another one: the file is never closed. The assumption that makes the code valid is that there is another reference to `f` that will be used to close it. The close member function holds on to the reference count of that object, so it continues to live even if nothing else keeps holding on to it. Since all futures produced by a file object complete before it is closed, all that is needed for correctness is to remember to always close files.
  
  The reference counting has a run-time cost, but it is usually very small; It is important to remember that Seastar objects are always used by a single CPU only, so the reference-count increment and decrement operations are not the slow atomic operations often used for reference counting, but just regular CPU-local integer operations. Moreover, judicious use of `std::move()` and the compiler's optimizer can reduce the number of unnecessary back-and-forth increment and decrement of the reference count.
  
@@ -752,8 +765,9 @@ We present `seastar::thread`, `seastar::async()` and `seastar::future::get()` in
  # Advanced futures
  ## Futures and interruption
  TODO: A future, e.g., sleep(10s) cannot be interrupted. So if we need to, the promise needs to have a mechanism to interrupt it. Mention pipe's close feature, semaphore stop feature, etc.
+
  ## Futures are single use
-TODO: Talk about if we have a future<int> variable, as soon as we get() or then() it, it becomes invalid - we need to store the value somewhere else. Think if there's an alternative we can suggest
+TODO: Talk about if we have a `future<int>` variable, as soon as we `get()` or `then()` it, it becomes invalid - we need to store the value somewhere else. Think if there's an alternative we can suggest
  
  # Fibers
  Seastar continuations are normally short, but often chained to one another, so that one continuation does a bit of work and then schedules another continuation for later. Such chains can be long, and often even involve loopings - see the following section, "Loops". We call such chains "fibers" of execution.
@@ -764,6 +778,7 @@ TODO: Mention fiber-related sections like loops, semaphores, gates, pipes, etc.
  
  # Loops
  A majority of time-consuming computations involve using loops. Seastar provides several primitives for expressing them in a way that composes nicely with the future/promise model. A very important aspect of Seastar loop primitives is that each iteration is followed by a preemption point, thus allowing other tasks to run inbetween iterations.
+
  ## repeat
  A loop created with `repeat` executes its body until it receives a `stop_iteration` object, which informs if the iteration should continue (`stop_iteration::no`) or stop (`stop_iteration::yes`). Next iteration will be launched only after the first one has finished. The loop body passed to `repeat` is expected to have a `future<stop_iteration>` return type.
  ```cpp
@@ -825,11 +840,11 @@ seastar::future<> do_for_all(std::vector<int> numbers) {
          });
      });
  }
-
  ```
  
  ## parallel_for_each
  Parallel for each is a high concurrency variant of `do_for_each`. When using `parallel_for_each`, all iterations are queued simultaneously - which means that there's no guarantee in which order they finish their operations.
+
  ```cpp
  seastar::future<> flush_all_files(seastar::lw_shared_ptr<std::vector<seastar::file>> files) {
      return seastar::parallel_for_each(files, [] (seastar::file f) {
@@ -839,11 +854,31 @@ seastar::future<> flush_all_files(seastar::lw_shared_ptr<std::vector<seastar::fi
  }
  ```
  `parallel_for_each` is a powerful tool, as it allows spawning many tasks in parallel. It can be a great performance gain, but there are also caveats. First of all, too high concurrency may be troublesome - the details can be found in chapter **Limiting parallelism of loops**.
+
+To restrict the concurrency of `parallel_for_each` by an integer number, use `max_concurrent_for_each` that is described below.
+More details about dealing with parallelism can be found in chapter **Limiting parallelism of loops**.
+
  Secondly, take note that the order in which iterations will be executed within a `parallel_for_each` loop is arbitrary - if a strict ordering is needed, consider using `do_for_each` instead.
  
  TODO: map_reduce, as a shortcut (?) for parallel_for_each which needs to produce some results (e.g., logical_or of boolean results), so we don't need to create a lw_shared_ptr explicitly (or do_with).
  
  TODO: See seastar commit "input_stream: Fix possible infinite recursion in consume()" for an example on why recursion is a possible, but bad, replacement for repeat(). See also my comment on https://groups.google.com/d/msg/seastar-dev/CUkLVBwva3Y/3DKGw-9aAQAJ on why Seastar's iteration primitives should be used over tail call optimization.
+
+## max_concurrent_for_each
+Max concurrent for each is a variant of `parallel_for_each` with restricted parallelism.
+It accepts an additional parameter - `max_concurrent` - with which, up to `max_concurrent` iterations are queued simultaneously, with no guarantee in which order they finish their operations.
+
+```cpp
+seastar::future<> flush_all_files(seastar::lw_shared_ptr<std::vector<seastar::file>> files, size_t max_concurrent) {
+    return seastar::max_concurrent_for_each(files, max_concurrent, [] (seastar::file f) {
+        return f.flush();
+    });
+}
+```
+
+Determining the maximum concurrency limit is out of the scope of this document.
+It should typically be derived from the actual capabilities of the system the software is running on, like the number of parallel execution units or I/O channels, so to optimize utilization of resources without overwhelming the system.
+
  # when_all: Waiting for multiple futures
  Above we've seen `parallel_for_each()`, which starts a number of asynchronous operations, and then waits for all to complete. Seastar has another idiom, `when_all()`, for waiting for several already-existing futures to complete.
  
@@ -985,7 +1020,7 @@ seastar::future<> g() {
  }
  ```
  
-But this version is **not** exception safe: Consider what happens if `slow()` throws an exception before returning a future (this is different from `slow()` returning an exceptional future - we discussed this difference in the section about exception handling). In this case, we decreased the counter, but the `finally()` will never be reached, and the counter will never be increased back. There is a way to fix this code, by replacing the call to `slow()` with `seastar::futurize_apply(slow)`. But the point we're trying to make here is not how to fix buggy code, but rather that by using the separate `semaphore::wait()` and `semaphore::signal()` functions, you can very easily get things wrong.
+But this version is **not** exception safe: Consider what happens if `slow()` throws an exception before returning a future (this is different from `slow()` returning an exceptional future - we discussed this difference in the section about exception handling). In this case, we decreased the counter, but the `finally()` will never be reached, and the counter will never be increased back. There is a way to fix this code, by replacing the call to `slow()` with `seastar::futurize_invoke(slow)`. But the point we're trying to make here is not how to fix buggy code, but rather that by using the separate `semaphore::wait()` and `semaphore::signal()` functions, you can very easily get things wrong.
  
  For exception safety, in C++ it is generally not recommended to have separate resource acquisition and release functions.  Instead, C++ offers safer mechanisms for acquiring a resource (in this case seamphore units) and later releasing it: lambda functions, and RAII ("resource acquisition is initialization"):
  
@@ -1072,7 +1107,7 @@ seastar::future<> f() {
      return seastar::do_with(seastar::semaphore(100), [] (auto& limit) {
          return seastar::repeat([&limit] {
              return limit.wait(1).then([&limit] {
-                seastar::futurize_apply(slow).finally([&limit] {
+                seastar::futurize_invoke(slow).finally([&limit] {
                      limit.signal(1); 
                  });
                  return seastar::stop_iteration::no;
@@ -1083,8 +1118,9 @@ seastar::future<> f() {
  ```
  
  Note how this code differs from the code we saw above for limiting the number of parallel invocations of a function `g()`:
+
  1. Here we cannot use a single `thread_local` semaphore. Each call to `f()` has its loop with parallelism of 100, so needs its own semaphore "`limit`", kept alive during the loop with `do_with()`.
-2. Here we do not wait for `slow()` to complete before continuing the loop, i.e., we do not `return` the future chain starting at `futurize_apply(slow)`. The loop continues to the next iteration when a semaphore unit becomes available, while (in our example) 99 other operations might be ongoing in the background and we do not wait for them.
+2. Here we do not wait for `slow()` to complete before continuing the loop, i.e., we do not `return` the future chain starting at `futurize_invoke(slow)`. The loop continues to the next iteration when a semaphore unit becomes available, while (in our example) 99 other operations might be ongoing in the background and we do not wait for them.
  
  In the examples in this section, we cannot use the `with_semaphore()` shortcut. `with_semaphore()` returns a future which only resolves after the lambda's returned future resolves. But in the above example, the loop needs to know when just the semaphore units are available, to start the next iteration --- and not wait for the previous iteration to complete. We could not achieve that with `with_semaphore()`. But the more general exception-safe idiom, `seastar::get_units()`, can be used in this case, and is recommended:
  
@@ -1131,7 +1167,7 @@ seastar::future<> f() {
                  boost::counting_iterator<int>(456), [&gate] (int i) {
              return seastar::get_units(limit, 1).then([&gate] (auto units) {
                  gate.enter();
-                seastar::futurize_apply(slow).finally([&gate, units = std::move(units)] {
+                seastar::futurize_invoke(slow).finally([&gate, units = std::move(units)] {
                      gate.leave();
                  });
                 });
@@ -1249,6 +1285,7 @@ Now, just one second after gate is closed (after the "starting 5" message is pri
  # Introducing shared-nothing programming
  
  TODO: Explain in more detail Seastar's shared-nothing approach where the entire memory is divided up-front to cores, malloc/free and pointers only work on one core.
+
  TODO: Introduce our shared_ptr (and lw_shared_ptr) and sstring and say the standard ones use locked instructions which are unnecessary when we assume these objects (like all others) are for a single thread. Our futures and continuations do the same.
  
  
@@ -1284,25 +1321,26 @@ We begin with a simple example of a TCP network server written in Seastar. This
  #include <seastar/core/seastar.hh>
  #include <seastar/core/reactor.hh>
  #include <seastar/core/future-util.hh>
-#include <iostream>
+#include <seastar/net/api.hh>
  
  seastar::future<> service_loop() {
      return seastar::do_with(seastar::listen(seastar::make_ipv4_address({1234})),
              [] (auto& listener) {
          return seastar::keep_doing([&listener] () {
              return listener.accept().then(
-                [] (seastar::connected_socket s, seastar::socket_address a) {
-                    std::cout << "Accepted connection from " << a << "\n";
-                });
+                [] (seastar::accept_result res) {
+                    std::cout << "Accepted connection from " << res.remote_address << "\n";
+            });
          });
      });
  }
  ```
  
  This code works as follows:
+
  1. The ```listen()``` call creates a ```server_socket``` object, ```listener```, which listens on TCP port 1234 (on any network interface).
  2. We use ```do_with()``` to ensure that the listener socket lives throughout the loop.
-3. To handle one connection, we call ```listener```'s  ```accept()``` method. This method returns a ```future<connected_socket, socket_address>```, i.e., is eventually resolved with an incoming TCP connection from a client (```connected_socket```) and the client's IP address and port (```socket_address```).
+3. To handle one connection, we call ```listener```'s  ```accept()``` method. This method returns a ```future<accept_result>```, i.e., is eventually resolved with an incoming TCP connection from a client (```accept_result.connection```) and the client's IP address and port (```accept_result.remote_address```).
  4. To repeatedly accept new connections, we use the ```keep_doing()``` loop idiom. ```keep_doing()``` runs its lambda parameter over and over, starting the next iteration as soon as the future returned by the previous iteration completes. The iterations only stop if an exception is encountered. The future returned by ```keep_doing()``` itself completes only when the iteration stops (i.e., only on exception).
  
  Output from this server looks like the following example:
@@ -1337,7 +1375,7 @@ Let's advance our example server by outputting some canned response to each conn
  #include <seastar/core/seastar.hh>
  #include <seastar/core/reactor.hh>
  #include <seastar/core/future-util.hh>
-#include <iostream>
+#include <seastar/net/api.hh>
  
  const char* canned_response = "Seastar is the future!\n";
  
@@ -1348,15 +1386,16 @@ seastar::future<> service_loop() {
              [] (auto& listener) {
          return seastar::keep_doing([&listener] () {
              return listener.accept().then(
-                [] (seastar::connected_socket s, seastar::socket_address a) {
-                    auto out = s.output();
-                    return seastar::do_with(std::move(s), std::move(out),
+                    [] (seastar::accept_result res) {
+                auto s = std::move(res.connection);
+                auto out = s.output();
+                return seastar::do_with(std::move(s), std::move(out),
                          [] (auto& s, auto& out) {
-                            return out.write(canned_response).then([&out] {
-                                return out.close();
-                           });
-                   });
-               });
+                    return out.write(canned_response).then([&out] {
+                        return out.close();
+                    });
+                });
+            });
          });
      });
  }
@@ -1385,49 +1424,52 @@ Let's look at a simple example server involving both reads an writes. This is a
  #include <seastar/core/seastar.hh>
  #include <seastar/core/reactor.hh>
  #include <seastar/core/future-util.hh>
+#include <seastar/net/api.hh>
  
  seastar::future<> handle_connection(seastar::connected_socket s,
                                      seastar::socket_address a) {
      auto out = s.output();
      auto in = s.input();
      return do_with(std::move(s), std::move(out), std::move(in),
-        [] (auto& s, auto& out, auto& in) {
-            return seastar::repeat([&out, &in] {
-                return in.read().then([&out] (auto buf) {
-                    if (buf) {
-                        return out.write(std::move(buf)).then([&out] {
-                            return out.flush();
-                        }).then([] {
-                            return seastar::stop_iteration::no;
-                        });
-                    } else {
-                        return seastar::make_ready_future<seastar::stop_iteration>(
+            [] (auto& s, auto& out, auto& in) {
+        return seastar::repeat([&out, &in] {
+            return in.read().then([&out] (auto buf) {
+                if (buf) {
+                    return out.write(std::move(buf)).then([&out] {
+                        return out.flush();
+                    }).then([] {
+                        return seastar::stop_iteration::no;
+                    });
+                } else {
+                    return seastar::make_ready_future<seastar::stop_iteration>(
                              seastar::stop_iteration::yes);
-                    }
-                });
-            }).then([&out] {
-                return out.close();
+                }
              });
+        }).then([&out] {
+            return out.close();
          });
+    });
  }
  
-seastar::future<> service_loop() {
+seastar::future<> service_loop_3() {
      seastar::listen_options lo;
      lo.reuse_address = true;
      return seastar::do_with(seastar::listen(seastar::make_ipv4_address({1234}), lo),
              [] (auto& listener) {
          return seastar::keep_doing([&listener] () {
              return listener.accept().then(
-                [] (seastar::connected_socket s, seastar::socket_address a) {
-                    // Note we ignore, not return, the future returned by
-                    // handle_connection(), so we do not wait for one
-                    // connection to be handled before accepting the next one.
-                    handle_connection(std::move(s), std::move(a));
+                    [] (seastar::accept_result res) {
+                // Note we ignore, not return, the future returned by
+                // handle_connection(), so we do not wait for one
+                // connection to be handled before accepting the next one.
+                (void)handle_connection(std::move(res.connection), std::move(res.remote_address)).handle_exception(
+                        [] (std::exception_ptr ep) {
+                    fmt::print(stderr, "Could not handle connection: {}\n", ep);
                  });
+            });
          });
      });
  }
-
  ```
  
  The main function ```service_loop()``` loops accepting new connections, and for each connection calls ```handle_connection()``` to handle this connection. Our ```handle_connection()``` returns a future saying when handling this connection completed, but importantly, we do ***not*** wait for this future: Remember that ```keep_doing``` will only start the next iteration when the future returned by the previous iteration is resolved. Because we want to allow parallel ongoing connections, we don't want the next ```accept()``` to wait until the previously accepted connection was closed. So we call ```handle_connection()``` to start the handling of the connection, but return nothing from the continuation, which resolves that future immediately, so ```keep_doing``` will continue to the next ```accept()```.
@@ -1438,9 +1480,74 @@ It is often a mistake to silently ignore an exception, so if the future we're ig
  
  The ```handle_connection()``` function itself is straightforward --- it repeatedly calls ```read()``` read on the input stream, to receive a ```temporary_buffer``` with some data, and then moves this temporary buffer into a ```write()``` call on the output stream. The buffer will eventually be freed, automatically, when the ```write()``` is done with it. When ```read()``` eventually returns an empty buffer signifying the end of input, we stop ```repeat```'s iteration by returning a ```stop_iteration::yes```.
  
-# Sharded servers
+# Sharded services
+
+In the previous section we saw that a Seastar application usually needs to run its code on all available CPU cores. We saw that the `seastar::smp::submit_to()` function allows the main function, which initially runs only on the first core, to start the server's code on all `seastar::smp::count` cores.
+
+However, usually one needs not just to run code on each core, but also to have an object that contains the state of this code. Additionally, one may like to interact with those different objects, and also have a mechanism to stop the service running on the different cores.
+
+The `seastar::sharded<T>` template provides a structured way create such a _sharded service_. It creates a separate object of type `T` in each core, and provides mechanisms to interact with those copies, to start some code on each, and finally to cleanly stop the service.
+
+To use `seastar::sharded`, first create a class for the object holding the state of the service on a single core. For example:
+
+```cpp
+#include <seastar/core/future.hh>
+#include <iostream>
+
+class my_service {
+public:
+    std::string _str;
+    my_service(const std::string& str) : _str(str) { }
+    seastar::future<> run() {
+        std::cerr << "running on " << seastar::engine().cpu_id() <<
+            ", _str = " << _str << \n";
+        return seastar::make_ready_future<>();
+    }
+    seastar::future<> stop() {
+        return seastar::make_ready_future<>();
+    }
+};
+```
+
+The only mandatory method in this object is `stop()`, which will be called in each core when we want to stop the sharded service and want to wait until it stops on all cores.
+
+Now let's see how to use it:
+
+```cpp
+#include <seastar/core/sharded.hh>
+
+seastar::sharded<my_service> s;
+
+seastar::future<> f() {
+    return s.start(std::string("hello")).then([] {
+        return s.invoke_on_all([] (my_service& local_service) {
+            return local_service.run();
+        });
+    }).then([] {
+        return s.stop();
+    });
+}
+```
+
+The `s.start()` starts the service by creating a `my_service` object on each of the cores. The arguments to `s.start()`, if any (in this example, `std::string("hello")`), are passed to `my_service`'s constructor.
+
+But `s.start()` did not start running any code yet (besides the object's constructor). For that, we have the `s.invoke_on_all()` which runs the given lambda on all the cores - giving each lambda the local `my_service` object on that core. In this example, we have a `run()` method on each object, so we run that.
+
+Finally, at the end of the run we want to give the service on all cores a chance to shut down cleanly, so we call `s.stop()`. This will call the `stop()` method on each core's object, and wait for all of them to finish. Calling `s.stop()` before destroying `s` is mandatory - Seastar will warn you if you forget to do it.
+
+In addition to `invoke_on_all()` which runs the same code on all shards, another feature a sharded service often needs is for one shard to invoke code another specific shard. This is done by calling the sharded service's `invoke_on()` method. For example:
+
+```cpp
+seastar::sharded<my_service> s;
+...
+return s.invoke_on(0, [] (my_service& local_service) {
+    std::cerr << "invoked on " << seastar::engine().cpu_id() <<
+        ", _str = " << local_service._str << "\n";
+});
+```
+
+This runs the lambda function on shard 0, with a reference to the local `my_service` object on that shard. 
  
-Discuss `sharded<>`, its benefits over directly using smp::submit_to() as above, etc.
  
  # Shutting down cleanly
  
@@ -1455,6 +1562,7 @@ All Seastar applications accept a standard set of command-line arguments, such a
  TODO: list and explain more of these options.
  
  Every Seastar application also accepts the `-h` (or `--help`) option, which lists and explains all the available options --- the standard Seastar ones, and the user-defined ones as explained below.
+
  ## User-defined command-line options
  Seastar parses the command line options (`argv[]`) when it is passed to `app_template::run()`, looking for its own standard options. Therefore, it is not recommended that the application tries to parse `argv[]` on its own because the application might not understand some of the standard Seastar options and not be able to correctly skip them.
  
@@ -1524,6 +1632,7 @@ For example, consider this code:
  ```cpp
  #include <seastar/core/future.hh>
  #include <seastar/core/sleep.hh>
+#include <seastar/core/app-template.hh>
  
  class myexception {};
  
@@ -1535,26 +1644,53 @@ seastar::future<> f() {
      g();
      return seastar::sleep(std::chrono::seconds(1));
  }
+
+int main(int argc, char** argv) {
+    seastar::app_template app;
+    app.run(argc, argv, f);
+}
  ```
  
  Here, the main function `f()` calls `g()`, but doesn't do anything with the future it returns. But this future resolves with an exception, and this exception is silently ignored. So Seastar prints this warning message about the ignored exception:
  ```
-WARN  2018-01-11 13:23:17,976 [shard 0] seastar - Exceptional future ignored:
-myexception, backtrace: 0x41ce24
-  0x63729e
-  0x636cd5
-  0x4fa6ec
-  0x4fb0e8
-  0x5010e0
-  0x41bf96
-  0x41c26c
-  0x46a734
-  0x4fd661
-  0x4fe1e2
-  0x4fe332
-  0x417a0b
-  /lib64/libc.so.6+0x21009
-  0x417b99
+WARN  2020-03-31 11:08:09,208 [shard 0] seastar - Exceptional future ignored: myexception, backtrace:   /lib64/libasan.so.5+0x6ce7f
+  0x1a64193
+  0x1a6265f
+  0xf326cc
+  0xeaf1a0
+  0xeaffe4
+  0xead7be
+  0xeb5917
+  0xee2477
+  0xec312e
+  0xec8fcf
+  0xeec765
+  0xee1b29
+  0xed9fab
+  0xec27c8
+  0xec867f
+  0xf00acc
+  0xef179d
+  0xef1824
+  0xef18b5
+  0xee4827
+  0xee470f
+  0xf00f81
+  0xebac29
+  0xeb9095
+  0xeb9174
+  0xeb925a
+  0xeb9964
+  0xebef89
+  0x10f74c3
+  0x10fb439
+  0x11005fd
+  0xec4f08
+  0xec2f43
+  0xec3461
+  0xeadabe
+  /lib64/libc.so.6+0x271a2
+  0xead52d
  ```
  
  This message says that an exceptional future was ignored, and that the type of the exception was "`myexception`". The type of the exception is usually not enough to pinpoint where the problem happened, so the warning message also includes the backtrace - the call chain - leading to where the exceptional future was destroyed. The backtrace is given as a list of addresses, where code in other shared libraries is written as a shared library plus offset (when ASLR is enabled, the shared libraries are mapped in a different address each time).
@@ -1568,43 +1704,54 @@ seastar-addr2line -e a.out
  And then paste the list of addresses in the warning message, and conclude with a `control-D` (it's also possible, if you want, to put the list of addresses in the `seastar-addr2line` command line). The result looks like this:
  
  ```
-seastar::report_failed_future(std::__exception_ptr::exception_ptr) at /home/nyh/seastar/core/reactor.cc:4201
-seastar::future<>::~future() at /home/nyh/seastar/core/future.hh:828
- (inlined by) f() at /home/nyh/seastar/doc/code/26.cc:11
-std::_Function_handler<seastar::future<> (), seastar::future<> (*)()>::_M_invoke(std::_Any_data const&) at /usr/include/c++/7/bits/std_function.h:302
-std::function<seastar::future<> ()>::operator()() const at /usr/include/c++/7/bits/std_function.h:706
- (inlined by) operator() at /home/nyh/seastar/core/app-template.cc:119
- (inlined by) _M_invoke at /usr/include/c++/7/bits/std_function.h:302
-std::function<seastar::future<int> ()>::operator()() const at /usr/include/c++/7/bits/std_function.h:706
- (inlined by) seastar::future<int> seastar::futurize<seastar::future<int> >::apply<std::function<seastar::future<int> ()>&>(std::function<seastar::future<int> ()>&) at /home/nyh/seastar/core/future.hh:1362
- (inlined by) auto seastar::futurize_apply<std::function<seastar::future<int> ()>&>(std::function<seastar::future<int> ()>&) at /home/nyh/seastar/core/future.hh:1420
- (inlined by) operator() at /home/nyh/seastar/core/app-template.cc:108
- (inlined by) _M_invoke at /usr/include/c++/7/bits/std_function.h:316
-std::function<void ()>::operator()() const at /usr/include/c++/7/bits/std_function.h:706
- (inlined by) seastar::apply_helper<std::function<void ()>, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply(std::function<void ()>&&, std::tuple<>&&) at /home/nyh/seastar/core/apply.hh:36
- (inlined by) auto seastar::apply<std::function<void ()>>(std::function<void ()>&&, std::tuple<>&&) at /home/nyh/seastar/core/apply.hh:44
- (inlined by) std::enable_if<!seastar::is_future<std::result_of<std::function<void ()> ()>::type>::value, seastar::future<> >::type seastar::do_void_futurize_apply_tuple<std::function<void ()>>(std::function<void ()>&&, std::tuple<>&&) at /home/nyh/seastar/core/future.hh:1320
- (inlined by) seastar::future<> seastar::futurize<void>::apply<std::function<void ()>>(std::function<void ()>&&, std::tuple<>&&) at /home/nyh/seastar/core/future.hh:1340
- (inlined by) seastar::future<> seastar::future<>::then<std::function<void ()>, seastar::future<> >(std::function<void ()>&&)::{lambda(auto:1&&)#1}::operator()<seastar::future_state<> >(auto, std::function<void ()>&&) at /home/nyh/seastar/core/future.hh:933
- (inlined by) seastar::continuation<seastar::future<> seastar::future<>::then<std::function<void ()>, seastar::future<> >(std::function<void ()>&&)::{lambda(auto:1&&)#1}>::run_and_dispose() at /home/nyh/seastar/core/future.hh:413
-seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /home/nyh/seastar/core/reactor.cc:2487
-seastar::reactor::run_some_tasks(std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >&) at /home/nyh/seastar/core/reactor.cc:2884
-seastar::reactor::run_some_tasks(std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >&) at /usr/include/c++/7/chrono:377
- (inlined by) seastar::reactor::run() at /home/nyh/seastar/core/reactor.cc:3028
-seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /home/nyh/seastar/core/app-template.cc:180
-seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at /home/nyh/seastar/core/app-template.cc:113
-seastar::app_template::run(int, char**, std::function<seastar::future<> ()>&&) at /home/nyh/seastar/core/app-template.cc:122
-main at /home/nyh/seastar/doc/code/main.cc:11
-__libc_start_main at ??:?
-``` 
+void seastar::backtrace<seastar::current_backtrace()::{lambda(seastar::frame)#1}>(seastar::current_backtrace()::{lambda(seastar::frame)#1}&&) at include/seastar/util/backtrace.hh:56
+seastar::current_backtrace() at src/util/backtrace.cc:84
+seastar::report_failed_future(std::__exception_ptr::exception_ptr const&) at src/core/future.cc:116
+seastar::future_state_base::~future_state_base() at include/seastar/core/future.hh:335
+seastar::future_state<>::~future_state() at include/seastar/core/future.hh:414
+ (inlined by) seastar::future<>::~future() at include/seastar/core/future.hh:990
+f() at test.cc:12
+std::_Function_handler<seastar::future<> (), seastar::future<> (*)()>::_M_invoke(std::_Any_data const&) at /usr/include/c++/9/bits/std_function.h:286
+std::function<seastar::future<> ()>::operator()() const at /usr/include/c++/9/bits/std_function.h:690
+seastar::app_template::run(int, char**, std::function<seastar::future<> ()>&&)::{lambda()#1}::operator()() const at src/core/app-template.cc:131
+std::_Function_handler<seastar::future<int> (), seastar::app_template::run(int, char**, std::function<seastar::future<> ()>&&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /usr/include/c++/9/bits/std_function.h:286
+std::function<seastar::future<int> ()>::operator()() const at /usr/include/c++/9/bits/std_function.h:690
+seastar::future<int> seastar::futurize<seastar::future<int> >::invoke<std::function<seastar::future<int> ()>&>(std::function<seastar::future<int> ()>&) at include/seastar/core/future.hh:1670
+auto seastar::futurize_invoke<std::function<seastar::future<int> ()>&>(std::function<seastar::future<int> ()>&) at include/seastar/core/future.hh:1754
+seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&)::{lambda()#1}::operator()() at src/core/app-template.cc:120 (discriminator 4)
+std::_Function_handler<void (), seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /usr/include/c++/9/bits/std_function.h:300
+std::function<void ()>::operator()() const at /usr/include/c++/9/bits/std_function.h:690
+seastar::apply_helper<std::function<void ()>&, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply(std::function<void ()>&, std::tuple<>&&) at include/seastar/core/apply.hh:36
+auto seastar::apply<std::function<void ()>&>(std::function<void ()>&, std::tuple<>&&) at include/seastar/core/apply.hh:44
+seastar::future<> seastar::futurize<void>::apply<std::function<void ()>&>(std::function<void ()>&, std::tuple<>&&) at include/seastar/core/future.hh:1634
+auto seastar::futurize_apply<std::function<void ()>&>(std::function<void ()>&, std::tuple<>&&) at include/seastar/core/future.hh:1766
+seastar::future<>::then<std::function<void ()>, seastar::future<> >(std::function<void ()>&&)::{lambda()#1}::operator()() at include/seastar/core/future.hh:1191
+seastar::noncopyable_function<seastar::future<> ()>::direct_vtable_for<seastar::future<>::then<std::function<void ()>, seastar::future<> >(std::function<void ()>&&)::{lambda()#1}>::call(seastar::noncopyable_function<seastar::future<> ()> const*) at include/seastar/util/noncopyable_function.hh:101
+seastar::noncopyable_function<seastar::future<> ()>::operator()() const at include/seastar/util/noncopyable_function.hh:184
+seastar::apply_helper<seastar::noncopyable_function<seastar::future<> ()>, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply(seastar::noncopyable_function<seastar::future<> ()>&&, std::tuple<>&&) at include/seastar/core/apply.hh:36
+auto seastar::apply<seastar::noncopyable_function<seastar::future<> ()>>(seastar::noncopyable_function<seastar::future<> ()>&&, std::tuple<>&&) at include/seastar/core/apply.hh:44
+seastar::future<> seastar::futurize<seastar::future<> >::apply<seastar::noncopyable_function<seastar::future<> ()>>(seastar::noncopyable_function<seastar::future<> ()>&&, std::tuple<>&&) at include/seastar/core/future.hh:1660
+seastar::future<>::then_impl_nrvo<seastar::noncopyable_function<seastar::future<> ()>, seastar::future<> >(seastar::noncopyable_function<seastar::future<> ()>&&)::{lambda()#1}::operator()() const::{lambda(seastar::internal::promise_base_with_type<>&, seastar::future_state<>&&)#1}::operator()(seastar::internal::promise_base_with_type<>, seastar::future_state<>) at include/seastar/core/future.hh:1213
+seastar::continuation<seastar::internal::promise_base_with_type<>, seastar::future<>::then_impl_nrvo<seastar::noncopyable_function<seastar::future<> ()>, seastar::future<> >(seastar::noncopyable_function<seastar::future<> ()>&&)::{lambda()#1}::operator()() const::{lambda(seastar::internal::promise_base_with_type<>&, seastar::future_state<>&&)#1}>::run_and_dispose() at include/seastar/core/future.hh:509
+seastar::reactor::run_tasks(seastar::reactor::task_queue&) at src/core/reactor.cc:2124
+seastar::reactor::run_some_tasks() at src/core/reactor.cc:2539 (discriminator 2)
+seastar::reactor::run() at src/core/reactor.cc:2694
+seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at src/core/app-template.cc:199 (discriminator 1)
+seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at src/core/app-template.cc:115 (discriminator 2)
+seastar::app_template::run(int, char**, std::function<seastar::future<> ()>&&) at src/core/app-template.cc:130 (discriminator 2)
+main at test.cc:19 (discriminator 1)
+__libc_start_main at /usr/src/debug/glibc-2.30-34-g994e529a37/csu/../csu/libc-start.c:308
+_start at ??:?
+```
  
  Most of the lines at the bottom of this backtrace are not interesting, and just showing the internal details of how Seastar ended up running the main function `f()`. The only interesting part is the _first_ few lines:
  
  ```
-seastar::report_failed_future(std::__exception_ptr::exception_ptr) at
- /home/nyh/seastar/core/reactor.cc:4201
-seastar::future<>::~future() at /home/nyh/seastar/core/future.hh:828
- (inlined by) f() at /home/nyh/seastar/doc/code/26.cc:11
+seastar::report_failed_future(std::__exception_ptr::exception_ptr const&) at src/core/future.cc:116
+seastar::future_state_base::~future_state_base() at include/seastar/core/future.hh:335
+seastar::future_state<>::~future_state() at include/seastar/core/future.hh:414
+ (inlined by) seastar::future<>::~future() at include/seastar/core/future.hh:990
+f() at test.cc:12
  ```
  
  Here we see that the warning message was printed by the `seastar::report_failed_future()` function which was called when destroying a future (`future<>::~future`) that had not been handled. The future's destructor was called in line 11 of our test code (`26.cc`), which is indeed the line where we called `g()` and ignored its result.  
@@ -1666,8 +1813,11 @@ In the current implementation, both `make_exception_future_with_backtrace` and `
  
  
  ## Debugging with gdb
+
+```
  handle SIGUSR1 pass noprint
  handle SIGALRM pass noprint
+```
  
  # Promise objects
  
@@ -1710,6 +1860,7 @@ So `seastar::foreign_ptr<>` not only has functional benefits (namely, to run the
  Above, we discussed the case of transferring ownership of an object to a another shard, via `seastar::foreign_ptr<std::unique_ptr<T>>`. However, sometimes the sender does not want to relinquish ownership of the object. Sometimes, it wants the remote thread to operate on its object and return with the object intact. Sometimes, it wants to send the same object to multiple shards. In such cases, `seastar::foreign_ptr<seastar::lw_shared_ptr<T>> is useful. The user needs to watch out, of course, not to operate on the same object from multiple threads concurrently. If this cannot be ensured by program logic alone, some methods of serialization must be used - such as running the operations on the home shard with `submit_to()` as described above.
  
  Normally, a `seastar::foreign_ptr` cannot not be copied - only moved. However, when it holds a smart pointer that can be copied (namely, a `shared_ptr`), one may want to make an additional copy of that pointer and create a second `foreign_ptr`. Doing this is inefficient and asynchronous (it requires communicating with the original owner of the object to create the copies), so a method `future<foreign_ptr> copy()` needs to be explicitly used instead of the normal copy constructor.
+
  # Seastar::thread
  Seastar's programming model, using futures and continuations, is very powerful and efficient.  However, as we've already seen in examples above, it is also relatively verbose: Every time that we need to wait before proceeding with a computation, we need to write another continuation. We also need to worry about passing the data between the different continuations (using techniques like those described in the [Lifetime management] section). Simple flow-control constructs such as loops also become more involved using continuations. For example, consider this simple classical synchronous code:
  ```cpp
@@ -1922,11 +2073,15 @@ Counters: 2'273'783'385, 4'549'995'716
  ```
  ## Latency
  TODO: Task quota, preempt, loops with built-in preemption check, etc.
+
  ## Disk I/O scheduler
  TODO
+
  ## Network scheduler
  TODO: Say that not yet available. Give example of potential problem - e.g., sharing a slow WAN link.
+
  ## Controllers
  TODO: Talk about how to dynamically change the number of shares, and why.
+
  ## Multi-tenancy
  TODO