---
title: Portable Services Introduction
category: Concepts
layout: default
---

# Portable Services Introduction

This systemd version includes a preview of the "portable service"
concept. "Portable Services" are supposed to be an incremental improvement over
traditional system services, making two specific facets of container management
available to system services more readily. Specifically:

1. The bundling of applications, i.e. packing up multiple services, their
   binaries and all their dependencies in a single image, and running them
   directly from it.

2. Stricter default security policies, i.e. sandboxing of applications.

The primary tool for interfacing with "portable services" is the new
"portablectl" program. It's currently shipped in `/usr/lib/systemd/portablectl`
(i.e. not in the `$PATH`), since it's not yet considered part of the officially
supported systemd interfaces; it is still a preview, after all.

Portable services don't bring anything inherently new to the table. All they do
is put together known concepts in a slightly nicer way to cover a specific set
of use-cases.

## So, what *is* a "Portable Service"?

A portable service is ultimately just an OS tree, either inside of a directory
tree, or inside a raw disk image containing a Linux file system. This tree is
called the "image". It can be "attached" to or "detached" from the system. When
"attached", specific systemd units from the image are made available on the host
system, and then behave pretty much exactly like locally installed system
services. When "detached", these units are removed again from the host, leaving
no artifacts around (except maybe messages they might have logged).

The OS tree/image can be created with any tool of your choice. For example, you
can use `dnf --installroot=` if you like, or `debootstrap`. The image format is
entirely generic, and doesn't have to carry any specific metadata beyond what
distribution images carry anyway. Or to say this differently: the image format
doesn't define any new metadata, as unit files and OS tree directories or disk
images are already sufficient, and pretty universally available these days. One
particularly nice tool for creating suitable images is
[mkosi](https://github.com/systemd/mkosi), but many other existing tools will
do too.

If you will, "Portable Services" are a nicer way to manage chroot()
environments, with better security, tooling and behavior.

## Where's the difference to a "Container"?

"Container" is a very vague term. After all, it is used for
systemd-nspawn/LXC-type OS containers, for Docker/rkt-like micro service
containers, and even certain 'lightweight' VM runtimes.

The "portable service" concept ultimately will not provide a fully isolated
environment to the payload, like containers mostly intend to. Instead, portable
services are from the beginning more like regular system services: they can be
controlled with the same tools, are exposed the same way in all infrastructure,
and so on. Their main difference is that they use a different root directory
than the rest of the system. Hence, the intention is not to run code in a
different, isolated world from the host (as most containers would), but to run
it in the same world, with stricter access controls on what the service can see
and do.

As one point of differentiation: since programs run as "portable services" are
pretty much regular system services, they won't run as PID 1 (as they would
under Docker), but as normal processes. A corollary of that is that they aren't
supposed to manage anything in their own environment (such as the network), as
the execution environment is mostly shared with the rest of the system.

The primary use-case of "portable services" is to extend the host system
with encapsulated extensions, but provide almost full integration with the rest
of the system, though possibly restricted by effective security knobs. This
focus includes system extensions otherwise sometimes called "super-privileged
containers".

Note that portable services are only available for system services, not for
user services (i.e. the functionality cannot be used for the stuff
bubblewrap/flatpak is focusing on).

## Mode of Operation

If you have a portable service image, maybe in a raw disk image called
`foobar_0.7.23.raw`, then attaching the services to the host is as easy as:

```
# /usr/lib/systemd/portablectl attach foobar_0.7.23.raw
```

This command does the following:

1. It dissects the image, checks and validates the `/etc/os-release`
   (or `/usr/lib/os-release`, see below) data of the image, and looks for
   all included unit files.

2. It copies out all unit files with a suffix of `.service`, `.socket`,
   `.target`, `.timer` and `.path` whose name begins with the image's name
   (with the `.raw` removed), truncated at the first underscore (if there is
   one). This prefix name generated from the image name must be followed by a
   ".", "-" or "@" character in the unit name. Or in other words, given the
   image name of `foobar_0.7.23.raw` all unit files matching
   `foobar-*.{service|socket|target|timer|path}`,
   `foobar@.{service|socket|target|timer|path}` as well as
   `foobar.*.{service|socket|target|timer|path}` and
   `foobar.{service|socket|target|timer|path}` are copied out. These unit files
   are placed in `/etc/systemd/system.attached/` (which is part of the normal
   unit file search path of PID 1, and thus loaded exactly like regular unit
   files). Within the images the unit files are looked for at the usual
   locations, i.e. in `/usr/lib/systemd/system/` and `/etc/systemd/system/` and
   so on, relative to the image's root.

3. For each such unit file a drop-in file is created. Let's say
   `foobar-waldo.service` was one of the unit files copied to
   `/etc/systemd/system.attached/`, then a drop-in file
   `/etc/systemd/system.attached/foobar-waldo.service.d/20-portable.conf` is
   created, containing a few lines of additional configuration:

   ```
   [Service]
   RootImage=/path/to/foobar.raw
   Environment=PORTABLE=foobar
   LogExtraFields=PORTABLE=foobar
   ```

4. For each such unit a "profile" drop-in is linked in. This "profile" drop-in
   generally contains security options that lock down the service. By default
   the `default` profile is used, which provides a medium level of
   security. There's also `trusted`, which runs the service at the highest
   privileges, i.e. host's root and everything. The `strict` profile comes with
   the toughest security restrictions. Finally, `nonetwork` is like `default`
   but without network access. Users may define their own profiles too (or
   modify the existing ones).

And that's already it.
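
The name matching described in step 2 can be sketched as a small shell
function. This is only an illustration of the rule as stated above, not the
code `portablectl` actually runs; the helper names `image_prefix` and
`unit_matches_image` are made up for this example.

```sh
#!/bin/sh
# Illustrative only: image_prefix and unit_matches_image are hypothetical
# helpers that mirror the matching rule from step 2 above.

# Derive the unit name prefix from an image file or directory name.
image_prefix() {
    name=${1##*/}                # drop any leading directories
    name=${name%.raw}            # drop the .raw suffix, if there is one
    printf '%s\n' "${name%%_*}"  # truncate at the first underscore
}

# Succeed if the unit name in $2 would be copied out of the image in $1.
unit_matches_image() {
    prefix=$(image_prefix "$1")
    case $2 in
        *.service|*.socket|*.target|*.timer|*.path) ;;  # supported suffixes
        *) return 1 ;;
    esac
    case $2 in
        # the prefix must be followed by ".", "-" or "@"
        "$prefix".*|"$prefix"-*|"$prefix"@*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With these definitions, `unit_matches_image foobar_0.7.23.raw foobar-waldo.service`
succeeds, while `foobarbaz.service` and `foobar-waldo.mount` are rejected.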

Note that the images need to stay around (and in the same location) as long as the
portable service is attached. If an image is moved, the `RootImage=` line
written to the unit drop-in would point to a non-existent place, and break the
logic.

The `portablectl detach` command executes the reverse operation: it looks for
the drop-ins and the unit files associated with the image, and removes them
again.

Note that `portablectl attach` won't enable or start any of the units it copies
out. This still has to take place in a second, separate step. (That said, we
might add options to do this automatically later on.)

## Requirements on Images

Note that portable services don't introduce any new image format; most OS
images should just work the way they are. Specifically, the following
requirements are made for an image that can be attached/detached with
`portablectl`.

1. It must contain an executable that shall be invoked, along with all its
   dependencies. If binary code, the code needs to be compiled for an
   architecture compatible with the host.

2. The image must either be a plain sub-directory (or btrfs subvolume)
   containing the binaries and their dependencies in a classic Linux OS tree, or
   must be a raw disk image either containing only one, naked file system, or
   an image with a partition table understood by the Linux kernel with only a
   single partition defined, or alternatively, a GPT partition table with a set
   of properly marked partitions following the [Discoverable Partitions
   Specification](https://systemd.io/DISCOVERABLE_PARTITIONS).

3. The image must contain at least one matching unit file, with the right name
   prefix and suffix (see above). The unit file is searched in the usual paths,
   i.e. primarily `/etc/systemd/system/` and `/usr/lib/systemd/system/` within the
   image. (The implementation will check a couple of other paths too, but it's
   recommended to use these two paths.)

4. The image must contain an os-release file, either in `/etc/os-release` or
   `/usr/lib/os-release`. The file should follow the standard format.

5. The image must contain the files `/etc/resolv.conf` and `/etc/machine-id`
   (empty files are ok); they will be bind mounted from the host at runtime.

6. The image must contain directories `/proc/`, `/sys/`, `/dev/`, `/run/`,
   `/tmp/`, `/var/tmp/` that can be mounted over with the corresponding version
   from the host.

7. The OS might require other files or directories to be in place. For example,
   if the image is built based on glibc, the dynamic loader needs to be
   available in `/lib/ld-linux.so.2` or `/lib64/ld-linux-x86-64.so.2` (or
   similar, depending on architecture), and if the distribution implements a
   merged `/usr/` tree, this means `/lib` and/or `/lib64` need to be symlinks
   to their respective counterparts below `/usr/`. For details see your
   distribution's documentation.
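
For a directory-tree image, requirements 4 to 6 lend themselves to a quick
pre-flight check. The following is a rough sketch under that assumption; the
function name `check_image_tree` is hypothetical, and the check deliberately
ignores unit naming, partition tables and the dynamic loader.

```sh
#!/bin/sh
# Hypothetical pre-flight check for an image that is a plain directory tree.
# It only covers the os-release file, the two /etc files and the mount point
# directories listed above.
check_image_tree() {
    root=$1
    status=0

    # Requirement 4: an os-release file in either location
    if [ ! -e "$root/etc/os-release" ] && [ ! -e "$root/usr/lib/os-release" ]; then
        echo "missing: os-release" >&2
        status=1
    fi

    # Requirement 5: files that get bind mounted from the host
    for f in etc/resolv.conf etc/machine-id; do
        [ -f "$root/$f" ] || { echo "missing file: /$f" >&2; status=1; }
    done

    # Requirement 6: directories that get mounted over
    for d in proc sys dev run tmp var/tmp; do
        [ -d "$root/$d" ] || { echo "missing directory: /$d/" >&2; status=1; }
    done

    return "$status"
}
```

Calling `check_image_tree /path/to/tree` returns non-zero and lists what is
missing on standard error if the tree falls short of these requirements.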

Note that images created by tools such as `debootstrap`, `dnf --installroot=`
or `mkosi` generally qualify for all of the above in one way or another. If you
wonder what the most minimal image would be that complies with the requirements
above, it could consist of this:

```
/usr/bin/minimald                            # a statically compiled binary
/usr/lib/systemd/system/minimal-test.service # the unit file for the service, with ExecStart=/usr/bin/minimald
/usr/lib/os-release                          # an os-release file explaining what this is
/etc/resolv.conf                             # empty file to mount over with host's version
/etc/machine-id                              # ditto
/proc/                                       # empty directory to use as mount point for host's API fs
/sys/                                        # ditto
/dev/                                        # ditto
/run/                                        # ditto
/tmp/                                        # ditto
/var/tmp/                                    # ditto
```

And that's it.
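
As a rough sketch, the skeleton of such a minimal image could be laid out as a
plain directory tree with a few shell commands. The `IMAGE_ROOT` variable, the
directory name `minimal_1` and the os-release contents are assumptions made
for this example; the `minimald` binary itself has to be supplied separately.

```sh
#!/bin/sh
# Sketch: create the skeleton of a minimal portable service image as a plain
# directory tree. IMAGE_ROOT and the os-release values are example choices.
set -eu

IMAGE_ROOT=${IMAGE_ROOT:-${TMPDIR:-/tmp}/minimal_1}

# Directories for the binary, the unit file and the files mounted from the host
mkdir -p "$IMAGE_ROOT/usr/bin" \
         "$IMAGE_ROOT/usr/lib/systemd/system" \
         "$IMAGE_ROOT/etc"

# Empty mount points for the host's API file systems and temporary directories
for d in proc sys dev run tmp var/tmp; do
    mkdir -p "$IMAGE_ROOT/$d"
done

# Empty files to be mounted over with the host's versions
: > "$IMAGE_ROOT/etc/resolv.conf"
: > "$IMAGE_ROOT/etc/machine-id"

# An os-release file explaining what this is (example values)
cat > "$IMAGE_ROOT/usr/lib/os-release" <<'EOF'
ID=minimal
NAME="Minimal Test Image"
EOF

# The unit file for the service
cat > "$IMAGE_ROOT/usr/lib/systemd/system/minimal-test.service" <<'EOF'
[Service]
ExecStart=/usr/bin/minimald
EOF

# Not done here: copy a statically compiled binary to $IMAGE_ROOT/usr/bin/minimald
```

The directory name `minimal_1` yields the prefix `minimal`, so
`minimal-test.service` has the right name to be picked up on attach.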

Note that qualifying images do not have to contain an init system of their
own. If they do, it's fine, it will be ignored by the portable service logic,
but they generally don't have to, and it might make sense to avoid any, to keep
images minimal.

If the image is writable, and some of the files or directories that are
overmounted from the host do not exist yet, they are automatically created. On
read-only, immutable images (e.g. squashfs images) all files and directories to
over-mount must exist already.

Note that as no new image format or metadata is defined, it's very
straightforward to define images that can be made use of in a number of
different ways. For example, by using `mkosi -b` you can trivially build a
single, unified image that:

1. Can be attached as portable service, to run any container services natively
   on the host.

2. Can be run as OS container, using `systemd-nspawn`, by booting the image
   with `systemd-nspawn -i -b`.

3. Can be booted directly as VM image, using a generic VM executor such as
   `virtualbox`/`qemu`/`kvm`.

4. Can be booted directly on bare-metal systems.

Of course, to facilitate 2, 3 and 4 you need to include an init system in the
image. To facilitate 3 and 4 you also need to include a boot loader in the
image. As mentioned, `mkosi -b` takes care of all of that for you, but any other
image generator should work too.

## Execution Environment

Note that the code in portable service images is run exactly like regular
services. Hence there's no new execution environment to consider. Unlike
under Docker, these are regular system services, so they aren't run as PID 1,
but with regular PID values.

## Access to host resources

If services shipped with this mechanism shall be able to access host resources
(such as files or AF_UNIX sockets for IPC), use the normal `BindPaths=` and
`BindReadOnlyPaths=` settings in unit files to mount them in. In fact the
`default` profile mentioned above makes use of this to ensure
`/etc/resolv.conf`, the D-Bus system bus socket or write access to the logging
subsystem are available to the service.
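
As an illustration, a unit file shipped in an image might pull in a host
configuration directory read-only and a host socket read-write. The sketch
below writes such a unit into an image tree; every path and name in it
(`IMAGE_ROOT`, `/etc/foobar`, `/run/foobar-host.sock`, `foobar-waldo`) is
hypothetical and only serves as an example.

```sh
#!/bin/sh
# Sketch: write a unit file into an image tree that makes two host resources
# visible to the service. All paths and names below are hypothetical.
set -eu

IMAGE_ROOT=${IMAGE_ROOT:-${TMPDIR:-/tmp}/foobar_0.7.23}
UNIT_DIR=$IMAGE_ROOT/usr/lib/systemd/system

mkdir -p "$UNIT_DIR"
cat > "$UNIT_DIR/foobar-waldo.service" <<'EOF'
[Service]
ExecStart=/usr/bin/foobar-waldo
# Host configuration directory, visible read-only inside the service
BindReadOnlyPaths=/etc/foobar
# Host AF_UNIX socket the service talks to
BindPaths=/run/foobar-host.sock
EOF
```

Both settings take paths on the host; without an explicit destination the
resource shows up at the same path inside the service's root.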

## Instantiation

Sometimes it makes sense to instantiate the same set of services multiple
times. The portable service concept does not introduce a new logic for this. It
is recommended to use the regular unit templating of systemd for this, i.e. to
include template units such as `foobar@.service`, so that instantiation is as
simple as:

```
# /usr/lib/systemd/portablectl attach foobar_0.7.23.raw
# systemctl enable --now foobar@instancea.service
# systemctl enable --now foobar@instanceb.service
```

The benefit of this approach is that templating works exactly the same for
units shipped with the OS itself as for attached portable services.

## Immutable images with local data

It's a good idea to keep portable service images read-only during normal
operation. In fact all but the `trusted` profile will default to this kind of
behaviour, by setting the `ProtectSystem=strict` option. In this case writable
service data may be placed on the host file system. Use `StateDirectory=` in
the unit files to enable such behaviour and add a local data directory to the
services copied onto the host.