1 [[chapter_pct]]
2 ifdef::manvolnum[]
3 pct(1)
4 ======
5 :pve-toplevel:
6
7 NAME
8 ----
9
10 pct - Tool to manage Linux Containers (LXC) on Proxmox VE
11
12
13 SYNOPSIS
14 --------
15
16 include::pct.1-synopsis.adoc[]
17
18 DESCRIPTION
19 -----------
20 endif::manvolnum[]
21
22 ifndef::manvolnum[]
23 Proxmox Container Toolkit
24 =========================
25 :pve-toplevel:
26 endif::manvolnum[]
27 ifdef::wiki[]
28 :title: Linux Container
29 endif::wiki[]
30
31 Containers are a lightweight alternative to fully virtualized machines (VMs).
32 They use the kernel of the host system that they run on, instead of emulating a
33 full operating system (OS). This means that containers can access resources on
34 the host system directly.
35
The runtime costs for containers are low, usually negligible. However, there are
some drawbacks that need to be considered:
38
39 * Only Linux distributions can be run in Proxmox Containers. It is not possible to run
40 other operating systems like, for example, FreeBSD or Microsoft Windows
41 inside a container.
42
* For security reasons, access to host resources needs to be restricted.
Therefore, containers run in their own separate namespaces. Additionally, some
syscalls (user space requests to the Linux kernel) are not allowed within
containers.
46
47 {pve} uses https://linuxcontainers.org/lxc/introduction/[Linux Containers (LXC)] as its underlying
48 container technology. The ``Proxmox Container Toolkit'' (`pct`) simplifies the
49 usage and management of LXC, by providing an interface that abstracts
50 complex tasks.
51
52 Containers are tightly integrated with {pve}. This means that they are aware of
53 the cluster setup, and they can use the same network and storage resources as
54 virtual machines. You can also use the {pve} firewall, or manage containers
55 using the HA framework.
56
57 Our primary goal is to offer an environment that provides the benefits of using a
58 VM, but without the additional overhead. This means that Proxmox Containers can
59 be categorized as ``System Containers'', rather than ``Application Containers''.
60
61 NOTE: If you want to run application containers, for example, 'Docker' images, it
62 is recommended that you run them inside a Proxmox Qemu VM. This will give you
63 all the advantages of application containerization, while also providing the
64 benefits that VMs offer, such as strong isolation from the host and the ability
65 to live-migrate, which otherwise isn't possible with containers.
66
67
68 Technology Overview
69 -------------------
70
71 * LXC (https://linuxcontainers.org/)
72
73 * Integrated into {pve} graphical web user interface (GUI)
74
75 * Easy to use command line tool `pct`
76
77 * Access via {pve} REST API
78
79 * 'lxcfs' to provide containerized /proc file system
80
81 * Control groups ('cgroups') for resource isolation and limitation
82
83 * 'AppArmor' and 'seccomp' to improve security
84
85 * Modern Linux kernels
86
87 * Image based deployment (xref:pct_supported_distributions[templates])
88
89 * Uses {pve} xref:chapter_storage[storage library]
90
91 * Container setup from host (network, DNS, storage, etc.)
92
93
94 [[pct_supported_distributions]]
95 Supported Distributions
96 -----------------------
97
A list of officially supported distributions can be found below.

Templates for the following distributions are available through our
repositories. You can use the xref:pct_container_images[pveam] tool or the
graphical user interface to download them.
103
104 Alpine Linux
105 ~~~~~~~~~~~~
106
107 [quote, 'https://alpinelinux.org']
108 ____
109 Alpine Linux is a security-oriented, lightweight Linux distribution based on
110 musl libc and busybox.
111 ____
112
113 For currently supported releases see: https://alpinelinux.org/releases/
114
115 Arch Linux
116 ~~~~~~~~~~
117
118 [quote, 'https://archlinux.org/']
119 ____
120 Arch Linux, a lightweight and flexible Linux® distribution that tries to Keep It Simple.
121 ____
122
Arch Linux uses a rolling-release model; see its wiki for more details:
124
125 https://wiki.archlinux.org/title/Arch_Linux
126
CentOS, AlmaLinux, Rocky Linux
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
129
130 CentOS / CentOS Stream
131 ^^^^^^^^^^^^^^^^^^^^^^
132
133 [quote, 'https://centos.org']
134 ____
135 The CentOS Linux distribution is a stable, predictable, manageable and
136 reproducible platform derived from the sources of Red Hat Enterprise Linux
137 (RHEL)
138 ____
139
140 For currently supported releases see:
141
142 https://wiki.centos.org/About/Product
143
AlmaLinux
^^^^^^^^^
146
147 [quote, 'https://almalinux.org']
148 ____
149 An Open Source, community owned and governed, forever-free enterprise Linux
150 distribution, focused on long-term stability, providing a robust
151 production-grade platform. AlmaLinux OS is 1:1 binary compatible with RHEL® and
152 pre-Stream CentOS.
153 ____
154
155
156 For currently supported releases see:
157
158 https://en.wikipedia.org/wiki/AlmaLinux#Releases
159
160 Rocky Linux
161 ^^^^^^^^^^^
162
163 [quote, 'https://rockylinux.org']
164 ____
165 Rocky Linux is a community enterprise operating system designed to be 100%
166 bug-for-bug compatible with America's top enterprise Linux distribution now
167 that its downstream partner has shifted direction.
168 ____
169
170 For currently supported releases see:
171
172 https://en.wikipedia.org/wiki/Rocky_Linux#Releases
173
174 Debian
175 ~~~~~~
176
177 [quote, 'https://www.debian.org/intro/index#software']
178 ____
179 Debian is a free operating system, developed and maintained by the Debian
180 project. A free Linux distribution with thousands of applications to meet our
181 users' needs.
182 ____
183
184 For currently supported releases see:
185
186 https://www.debian.org/releases/stable/releasenotes
187
188 Devuan
189 ~~~~~~
190
191 [quote, 'https://www.devuan.org']
192 ____
193 Devuan GNU+Linux is a fork of Debian without systemd that allows users to
194 reclaim control over their system by avoiding unnecessary entanglements and
195 ensuring Init Freedom.
196 ____
197
198 For currently supported releases see:
199
200 https://www.devuan.org/os/releases
201
202 Fedora
203 ~~~~~~
204
205 [quote, 'https://getfedora.org']
206 ____
207 Fedora creates an innovative, free, and open source platform for hardware,
208 clouds, and containers that enables software developers and community members
209 to build tailored solutions for their users.
210 ____
211
212 For currently supported releases see:
213
214 https://fedoraproject.org/wiki/Releases
215
216 Gentoo
217 ~~~~~~
218
219 [quote, 'https://www.gentoo.org']
220 ____
221 a highly flexible, source-based Linux distribution.
222 ____
223
Gentoo uses a rolling-release model.
225
226 OpenSUSE
227 ~~~~~~~~
228
229 [quote, 'https://www.opensuse.org']
230 ____
231 The makers' choice for sysadmins, developers and desktop users.
232 ____
233
234 For currently supported releases see:
235
236 https://get.opensuse.org/leap/
237
238 Ubuntu
239 ~~~~~~
240
241 [quote, 'https://ubuntu.com/']
242 ____
243 Ubuntu is the modern, open source operating system on Linux for the enterprise
244 server, desktop, cloud, and IoT.
245 ____
246
247 For currently supported releases see:
248
249 https://wiki.ubuntu.com/Releases
250
251 [[pct_container_images]]
252 Container Images
253 ----------------
254
Container images, sometimes also referred to as ``templates'' or
``appliances'', are `tar` archives which contain everything needed to run a
container.
257
258 {pve} itself provides a variety of basic templates for the
259 xref:pct_supported_distributions[most common Linux distributions]. They can be
260 downloaded using the GUI or the `pveam` (short for {pve} Appliance Manager)
261 command line utility. Additionally, https://www.turnkeylinux.org/[TurnKey
262 Linux] container templates are also available to download.
263
264 The list of available templates is updated daily through the 'pve-daily-update'
265 timer. You can also trigger an update manually by executing:
266
267 ----
268 # pveam update
269 ----
270
271 To view the list of available images run:
272
273 ----
274 # pveam available
275 ----
276
277 You can restrict this large list by specifying the `section` you are
278 interested in, for example basic `system` images:
279
280 .List available system images
281 ----
282 # pveam available --section system
283 system alpine-3.12-default_20200823_amd64.tar.xz
284 system alpine-3.13-default_20210419_amd64.tar.xz
285 system alpine-3.14-default_20210623_amd64.tar.xz
286 system archlinux-base_20210420-1_amd64.tar.gz
287 system centos-7-default_20190926_amd64.tar.xz
288 system centos-8-default_20201210_amd64.tar.xz
289 system debian-9.0-standard_9.7-1_amd64.tar.gz
290 system debian-10-standard_10.7-1_amd64.tar.gz
291 system devuan-3.0-standard_3.0_amd64.tar.gz
292 system fedora-33-default_20201115_amd64.tar.xz
293 system fedora-34-default_20210427_amd64.tar.xz
294 system gentoo-current-default_20200310_amd64.tar.xz
295 system opensuse-15.2-default_20200824_amd64.tar.xz
296 system ubuntu-16.04-standard_16.04.5-1_amd64.tar.gz
297 system ubuntu-18.04-standard_18.04.1-1_amd64.tar.gz
298 system ubuntu-20.04-standard_20.04-1_amd64.tar.gz
299 system ubuntu-20.10-standard_20.10-1_amd64.tar.gz
300 system ubuntu-21.04-standard_21.04-1_amd64.tar.gz
301 ----
302
Before you can use such a template, you need to download it into one of your
storages. If you are unsure which one to choose, you can simply use the `local`
named storage for that purpose. For clustered installations, it is preferable
to use a shared storage so that all nodes can access those images.
307
308 ----
309 # pveam download local debian-10.0-standard_10.0-1_amd64.tar.gz
310 ----
311
312 You are now ready to create containers using that image, and you can list all
313 downloaded images on storage `local` with:
314
315 ----
316 # pveam list local
317 local:vztmpl/debian-10.0-standard_10.0-1_amd64.tar.gz 219.95MB
318 ----
319
TIP: You can also use the {pve} web interface to download, list and delete
container templates.
322
323 `pct` uses them to create a new container, for example:
324
325 ----
326 # pct create 999 local:vztmpl/debian-10.0-standard_10.0-1_amd64.tar.gz
327 ----
328
The above command shows you the full {pve} volume identifiers. They include the
storage name, and most other {pve} commands can use them. For example, you can
delete that image later with:
332
333 ----
334 # pveam remove local:vztmpl/debian-10.0-standard_10.0-1_amd64.tar.gz
335 ----
336
337
338 [[pct_settings]]
339 Container Settings
340 ------------------
341
342 [[pct_general]]
343 General Settings
344 ~~~~~~~~~~~~~~~~
345
346 [thumbnail="screenshot/gui-create-ct-general.png"]
347
348 General settings of a container include
349
350 * the *Node* : the physical server on which the container will run
351 * the *CT ID*: a unique number in this {pve} installation used to identify your
352 container
353 * *Hostname*: the hostname of the container
354 * *Resource Pool*: a logical group of containers and VMs
355 * *Password*: the root password of the container
356 * *SSH Public Key*: a public key for connecting to the root account over SSH
* *Unprivileged container*: this option allows you to choose at creation time
whether to create a privileged or unprivileged container.
359
360 Unprivileged Containers
361 ^^^^^^^^^^^^^^^^^^^^^^^
362
363 Unprivileged containers use a new kernel feature called user namespaces.
364 The root UID 0 inside the container is mapped to an unprivileged user outside
365 the container. This means that most security issues (container escape, resource
366 abuse, etc.) in these containers will affect a random unprivileged user, and
367 would be a generic kernel security bug rather than an LXC issue. The LXC team
368 thinks unprivileged containers are safe by design.
369
370 This is the default option when creating a new container.
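
On the command line, the choice can be made explicitly at creation time. A
minimal sketch, reusing the Debian template used elsewhere in this chapter
(CTID and template path are placeholders):

----
# pct create 200 local:vztmpl/debian-10.0-standard_10.0-1_amd64.tar.gz --unprivileged 1
----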
371
372 NOTE: If the container uses systemd as an init system, please be aware the
373 systemd version running inside the container should be equal to or greater than
374 220.
375
376
377 Privileged Containers
378 ^^^^^^^^^^^^^^^^^^^^^
379
380 Security in containers is achieved by using mandatory access control 'AppArmor'
381 restrictions, 'seccomp' filters and Linux kernel namespaces. The LXC team
382 considers this kind of container as unsafe, and they will not consider new
383 container escape exploits to be security issues worthy of a CVE and quick fix.
384 That's why privileged containers should only be used in trusted environments.
385
386
387 [[pct_cpu]]
388 CPU
389 ~~~
390
391 [thumbnail="screenshot/gui-create-ct-cpu.png"]
392
393 You can restrict the number of visible CPUs inside the container using the
394 `cores` option. This is implemented using the Linux 'cpuset' cgroup
395 (**c**ontrol *group*).
396 A special task inside `pvestatd` tries to distribute running containers among
397 available CPUs periodically.
398 To view the assigned CPUs run the following command:
399
400 ----
401 # pct cpusets
402 ---------------------
403 102: 6 7
404 105: 2 3 4 5
405 108: 0 1
406 ---------------------
407 ----
408
409 Containers use the host kernel directly. All tasks inside a container are
410 handled by the host CPU scheduler. {pve} uses the Linux 'CFS' (**C**ompletely
411 **F**air **S**cheduler) scheduler by default, which has additional bandwidth
412 control options.
413
414 [horizontal]
415
416 `cpulimit`: :: You can use this option to further limit assigned CPU time.
417 Please note that this is a floating point number, so it is perfectly valid to
418 assign two cores to a container, but restrict overall CPU consumption to half a
419 core.
420 +
421 ----
422 cores: 2
423 cpulimit: 0.5
424 ----
425
`cpuunits`: :: This is a relative weight passed to the kernel scheduler. The
larger the number is, the more CPU time this container gets. The number is
relative to the weights of all the other running containers. The default is
1024. You can use this setting to prioritize some containers.
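
For example, under CPU contention a container with twice the `cpuunits` weight
of another receives roughly twice the CPU time. A minimal sketch with two
hypothetical containers:

----
# pct set 101 -cpuunits 2048
# pct set 102 -cpuunits 1024
----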
430
431
432 [[pct_memory]]
433 Memory
434 ~~~~~~
435
436 [thumbnail="screenshot/gui-create-ct-memory.png"]
437
438 Container memory is controlled using the cgroup memory controller.
439
440 [horizontal]
441
442 `memory`: :: Limit overall memory usage. This corresponds to the
443 `memory.limit_in_bytes` cgroup setting.
444
`swap`: :: Allows the container to use additional swap memory from the host
swap space. This corresponds to the `memory.memsw.limit_in_bytes` cgroup
setting, which is set to the sum of both values (`memory + swap`).
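
For example, to give a hypothetical container 1024 MB of RAM plus 512 MB of
swap, both values can be set in one call:

----
# pct set 100 -memory 1024 -swap 512
----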
448
449
450 [[pct_mount_points]]
451 Mount Points
452 ~~~~~~~~~~~~
453
454 [thumbnail="screenshot/gui-create-ct-root-disk.png"]
455
456 The root mount point is configured with the `rootfs` property. You can
457 configure up to 256 additional mount points. The corresponding options are
458 called `mp0` to `mp255`. They can contain the following settings:
459
460 include::pct-mountpoint-opts.adoc[]
461
462 Currently there are three types of mount points: storage backed mount points,
463 bind mounts, and device mounts.
464
465 .Typical container `rootfs` configuration
466 ----
467 rootfs: thin1:base-100-disk-1,size=8G
468 ----
469
470
471 Storage Backed Mount Points
472 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
473
474 Storage backed mount points are managed by the {pve} storage subsystem and come
475 in three different flavors:
476
477 - Image based: these are raw images containing a single ext4 formatted file
478 system.
479 - ZFS subvolumes: these are technically bind mounts, but with managed storage,
480 and thus allow resizing and snapshotting.
481 - Directories: passing `size=0` triggers a special case where instead of a raw
482 image a directory is created.
483
484 NOTE: The special option syntax `STORAGE_ID:SIZE_IN_GB` for storage backed
485 mount point volumes will automatically allocate a volume of the specified size
486 on the specified storage. For example, calling
487
488 ----
489 pct set 100 -mp0 thin1:10,mp=/path/in/container
490 ----
491
will allocate a 10GB volume on the storage `thin1`, replace the volume ID
placeholder `10` with the allocated volume ID, and set up the mount point in
the container at `/path/in/container`.
495
496
497 Bind Mount Points
498 ^^^^^^^^^^^^^^^^^
499
500 Bind mounts allow you to access arbitrary directories from your Proxmox VE host
501 inside a container. Some potential use cases are:
502
503 - Accessing your home directory in the guest
- Accessing a USB device directory in the guest
505 - Accessing an NFS mount from the host in the guest
506
Bind mounts are not managed by the storage subsystem, so you cannot make
snapshots or deal with quotas from inside the container. With unprivileged
containers you might run into permission problems caused by the user mapping,
and you cannot use ACLs.
511
512 NOTE: The contents of bind mount points are not backed up when using `vzdump`.
513
514 WARNING: For security reasons, bind mounts should only be established using
515 source directories especially reserved for this purpose, e.g., a directory
516 hierarchy under `/mnt/bindmounts`. Never bind mount system directories like
517 `/`, `/var` or `/etc` into a container - this poses a great security risk.
518
519 NOTE: The bind mount source path must not contain any symlinks.
520
521 For example, to make the directory `/mnt/bindmounts/shared` accessible in the
522 container with ID `100` under the path `/shared`, use a configuration line like
523 `mp0: /mnt/bindmounts/shared,mp=/shared` in `/etc/pve/lxc/100.conf`.
524 Alternatively, use `pct set 100 -mp0 /mnt/bindmounts/shared,mp=/shared` to
525 achieve the same result.
526
527
528 Device Mount Points
529 ^^^^^^^^^^^^^^^^^^^
530
Device mount points allow you to mount block devices of the host directly into
the container. Similar to bind mounts, device mounts are not managed by
{PVE}'s storage subsystem, but the `quota` and `acl` options will be honored.
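
A minimal sketch of a device mount point, assuming the host block device
`/dev/sdb1` should show up at `/mnt/device` inside container `100` (device and
mount paths are placeholders):

----
# pct set 100 -mp0 /dev/sdb1,mp=/mnt/device
----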
534
535 NOTE: Device mount points should only be used under special circumstances. In
536 most cases a storage backed mount point offers the same performance and a lot
537 more features.
538
539 NOTE: The contents of device mount points are not backed up when using
540 `vzdump`.
541
542
543 [[pct_container_network]]
544 Network
545 ~~~~~~~
546
547 [thumbnail="screenshot/gui-create-ct-network.png"]
548
549 You can configure up to 10 network interfaces for a single container.
550 The corresponding options are called `net0` to `net9`, and they can contain the
551 following setting:
552
553 include::pct-network-opts.adoc[]
554
555
556 [[pct_startup_and_shutdown]]
557 Automatic Start and Shutdown of Containers
558 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
559
560 To automatically start a container when the host system boots, select the
561 option 'Start at boot' in the 'Options' panel of the container in the web
562 interface or run the following command:
563
564 ----
565 # pct set CTID -onboot 1
566 ----
567
568 .Start and Shutdown Order
569 // use the screenshot from qemu - its the same
570 [thumbnail="screenshot/gui-qemu-edit-start-order.png"]
571
572 If you want to fine tune the boot order of your containers, you can use the
573 following parameters:
574
575 * *Start/Shutdown order*: Defines the start order priority. For example, set it
576 to 1 if you want the CT to be the first to be started. (We use the reverse
577 startup order for shutdown, so a container with a start order of 1 would be
578 the last to be shut down)
* *Startup delay*: Defines the interval between this container's start and
the start of subsequent containers. For example, set it to 240 if you want to
wait 240 seconds before starting other containers.
* *Shutdown timeout*: Defines the duration in seconds {pve} should wait
for the container to be offline after issuing a shutdown command.
By default this value is set to 60, which means that {pve} will issue a
shutdown request, wait 60s for the machine to be offline, and if the machine
is still online after 60s, report that the shutdown action failed.
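
All three options are stored in the container's `startup` property. As a
sketch, the following sets a start order of 1, a startup delay of 240 seconds
and a shutdown timeout of 60 seconds (values are illustrative):

----
# pct set CTID -startup order=1,up=240,down=60
----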
587
Please note that containers without a Start/Shutdown order parameter will
always start after those where the parameter is set. Furthermore, this
parameter can only be enforced between machines running locally on a host, and
not cluster-wide.
592
593 If you require a delay between the host boot and the booting of the first
594 container, see the section on
595 xref:first_guest_boot_delay[Proxmox VE Node Management].
596
597
598 Hookscripts
599 ~~~~~~~~~~~
600
601 You can add a hook script to CTs with the config property `hookscript`.
602
603 ----
604 # pct set 100 -hookscript local:snippets/hookscript.pl
605 ----
606
It will be called during various phases of the guest's lifetime. For an example
and documentation, see the example script under
`/usr/share/pve-docs/examples/guest-example-hookscript.pl`.
610
611 Security Considerations
612 -----------------------
613
614 Containers use the kernel of the host system. This exposes an attack surface
615 for malicious users. In general, full virtual machines provide better
616 isolation. This should be considered if containers are provided to unknown or
617 untrusted people.
618
619 To reduce the attack surface, LXC uses many security features like AppArmor,
620 CGroups and kernel namespaces.
621
622 AppArmor
623 ~~~~~~~~
624
AppArmor profiles are used to restrict access to possibly dangerous actions.
Some system calls, for example `mount`, are prohibited from execution.
627
628 To trace AppArmor activity, use:
629
630 ----
631 # dmesg | grep apparmor
632 ----
633
Although it is not recommended, AppArmor can be disabled for a container. This
brings security risks with it. Some syscalls can lead to privilege escalation
when executed within a container if the system is misconfigured or if an LXC
or Linux kernel vulnerability exists.
638
639 To disable AppArmor for a container, add the following line to the container
640 configuration file located at `/etc/pve/lxc/CTID.conf`:
641
642 ----
643 lxc.apparmor.profile = unconfined
644 ----
645
646 WARNING: Please note that this is not recommended for production use.
647
648
649 [[pct_cgroup]]
650 Control Groups ('cgroup')
651 ~~~~~~~~~~~~~~~~~~~~~~~~~
652
653 'cgroup' is a kernel
654 mechanism used to hierarchically organize processes and distribute system
655 resources.
656
657 The main resources controlled via 'cgroups' are CPU time, memory and swap
658 limits, and access to device nodes. 'cgroups' are also used to "freeze" a
659 container before taking snapshots.
660
661 There are 2 versions of 'cgroups' currently available,
662 https://www.kernel.org/doc/html/v5.11/admin-guide/cgroup-v1/index.html[legacy]
663 and
664 https://www.kernel.org/doc/html/v5.11/admin-guide/cgroup-v2.html['cgroupv2'].
665
666 Since {pve} 7.0, the default is a pure 'cgroupv2' environment. Previously a
667 "hybrid" setup was used, where resource control was mainly done in 'cgroupv1'
668 with an additional 'cgroupv2' controller which could take over some subsystems
669 via the 'cgroup_no_v1' kernel command line parameter. (See the
670 https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html[kernel
671 parameter documentation] for details.)
672
673 [[pct_cgroup_compat]]
674 CGroup Version Compatibility
675 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The main difference between pure 'cgroupv2' and the old hybrid environments
regarding {pve} is that with 'cgroupv2' memory and swap are now controlled
independently. The memory and swap settings for containers can map directly to
these values, whereas previously only the memory limit and the limit of the
*sum* of memory and swap could be configured.
681
682 Another important difference is that the 'devices' controller is configured in a
683 completely different way. Because of this, file system quotas are currently not
684 supported in a pure 'cgroupv2' environment.
685
686 'cgroupv2' support by the container's OS is needed to run in a pure 'cgroupv2'
687 environment. Containers running 'systemd' version 231 or newer support
688 'cgroupv2' footnote:[this includes all newest major versions of container
689 templates shipped by {pve}], as do containers not using 'systemd' as init
690 system footnote:[for example Alpine Linux].
691
692 [NOTE]
693 ====
CentOS 7 and Ubuntu 16.10 are two prominent Linux distribution releases whose
'systemd' version is too old to run in a 'cgroupv2' environment. You can
either:

* Upgrade the whole distribution to a newer release. For the examples above,
that could be Ubuntu 18.04 or 20.04, or CentOS 8 (or RHEL/CentOS derivatives
like AlmaLinux or Rocky Linux). This has the benefit of getting the newest bug
and security fixes, often also new features, and of moving the EOL date into
the future.

* Upgrade the container's systemd version. If the distribution provides a
backports repository, this can be an easy and quick stop-gap measure.

* Move the container, or its services, to a virtual machine. Virtual machines
have much less interaction with the host, which is why decades-old OS versions
can be run there just fine.

* Switch back to the legacy 'cgroup' controller. Note that while it can be a
valid solution, it is not a permanent one. There is a high likelihood that a
future {pve} major release, for example 8.0, will no longer be able to support
the legacy controller.
714 ====
715
716 [[pct_cgroup_change_version]]
717 Changing CGroup Version
718 ^^^^^^^^^^^^^^^^^^^^^^^
719
720 TIP: If file system quotas are not required and all containers support 'cgroupv2',
721 it is recommended to stick to the new default.
722
723 To switch back to the previous version the following kernel command line
724 parameter can be used:
725
726 ----
727 systemd.unified_cgroup_hierarchy=0
728 ----
729
See xref:sysboot_edit_kernel_cmdline[this section] on editing the kernel boot
command line for where to add the parameter.
732
733 // TODO: seccomp a bit more.
734 // TODO: pve-lxc-syscalld
735
736
737 Guest Operating System Configuration
738 ------------------------------------
739
740 {pve} tries to detect the Linux distribution in the container, and modifies
741 some files. Here is a short list of things done at container startup:
742
743 set /etc/hostname:: to set the container name
744
745 modify /etc/hosts:: to allow lookup of the local hostname
746
747 network setup:: pass the complete network setup to the container
748
749 configure DNS:: pass information about DNS servers
750
751 adapt the init system:: for example, fix the number of spawned getty processes
752
753 set the root password:: when creating a new container
754
755 rewrite ssh_host_keys:: so that each container has unique keys
756
757 randomize crontab:: so that cron does not start at the same time on all containers
758
759 Changes made by {PVE} are enclosed by comment markers:
760
761 ----
762 # --- BEGIN PVE ---
763 <data>
764 # --- END PVE ---
765 ----
766
767 Those markers will be inserted at a reasonable location in the file. If such a
768 section already exists, it will be updated in place and will not be moved.
769
770 Modification of a file can be prevented by adding a `.pve-ignore.` file for it.
771 For instance, if the file `/etc/.pve-ignore.hosts` exists then the `/etc/hosts`
772 file will not be touched. This can be a simple empty file created via:
773
774 ----
775 # touch /etc/.pve-ignore.hosts
776 ----
777
778 Most modifications are OS dependent, so they differ between different
779 distributions and versions. You can completely disable modifications by
780 manually setting the `ostype` to `unmanaged`.
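
For example, a minimal sketch of disabling all such modifications for a
container (replace `CTID` with the container's ID):

----
# pct set CTID -ostype unmanaged
----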
781
782 OS type detection is done by testing for certain files inside the
783 container. {pve} first checks the `/etc/os-release` file
784 footnote:[/etc/os-release replaces the multitude of per-distribution
785 release files https://manpages.debian.org/stable/systemd/os-release.5.en.html].
If that file is not present, or it does not contain a clearly recognizable
distribution identifier, the following distribution-specific release files are
checked.
789
790 Ubuntu:: inspect /etc/lsb-release (`DISTRIB_ID=Ubuntu`)
791
792 Debian:: test /etc/debian_version
793
794 Fedora:: test /etc/fedora-release
795
796 RedHat or CentOS:: test /etc/redhat-release
797
798 ArchLinux:: test /etc/arch-release
799
800 Alpine:: test /etc/alpine-release
801
802 Gentoo:: test /etc/gentoo-release
803
804 NOTE: Container start fails if the configured `ostype` differs from the auto
805 detected type.
806
807
808 [[pct_container_storage]]
809 Container Storage
810 -----------------
811
812 The {pve} LXC container storage model is more flexible than traditional
813 container storage models. A container can have multiple mount points. This
814 makes it possible to use the best suited storage for each application.
815
816 For example the root file system of the container can be on slow and cheap
817 storage while the database can be on fast and distributed storage via a second
818 mount point. See section <<pct_mount_points, Mount Points>> for further
819 details.
820
821 Any storage type supported by the {pve} storage library can be used. This means
822 that containers can be stored on local (for example `lvm`, `zfs` or directory),
823 shared external (like `iSCSI`, `NFS`) or even distributed storage systems like
824 Ceph. Advanced storage features like snapshots or clones can be used if the
825 underlying storage supports them. The `vzdump` backup tool can use snapshots to
826 provide consistent container backups.
827
828 Furthermore, local devices or local directories can be mounted directly using
829 'bind mounts'. This gives access to local resources inside a container with
830 practically zero overhead. Bind mounts can be used as an easy way to share data
831 between containers.
832
833
834 FUSE Mounts
835 ~~~~~~~~~~~
836
837 WARNING: Because of existing issues in the Linux kernel's freezer subsystem the
838 usage of FUSE mounts inside a container is strongly advised against, as
839 containers need to be frozen for suspend or snapshot mode backups.
840
841 If FUSE mounts cannot be replaced by other mounting mechanisms or storage
842 technologies, it is possible to establish the FUSE mount on the Proxmox host
843 and use a bind mount point to make it accessible inside the container.
844
845
846 Using Quotas Inside Containers
847 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
848
Quotas allow you to set limits inside a container on the amount of disk space
that each user can use.
851
852 NOTE: This currently requires the use of legacy 'cgroups'.
853
854 NOTE: This only works on ext4 image based storage types and currently only
855 works with privileged containers.
856
857 Activating the `quota` option causes the following mount options to be used for
858 a mount point:
859 `usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0`
860
861 This allows quotas to be used like on any other system. You can initialize the
862 `/aquota.user` and `/aquota.group` files by running:
863
864 ----
865 # quotacheck -cmug /
866 # quotaon /
867 ----
868
869 Then edit the quotas using the `edquota` command. Refer to the documentation of
870 the distribution running inside the container for details.
871
872 NOTE: You need to run the above commands for every mount point by passing the
873 mount point's path instead of just `/`.
874
875
876 Using ACLs Inside Containers
877 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
878
The standard POSIX **A**ccess **C**ontrol **L**ists are also available inside
containers. ACLs allow you to set more detailed file ownership than the
traditional user/group/others model.
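
As a sketch, ACL support can be enabled per mount point with the `acl` option,
after which standard tools such as `setfacl` work inside the container
(storage name, size, user and paths are placeholders):

----
# pct set 100 -mp0 local-lvm:8,mp=/srv/data,acl=1
# pct exec 100 -- setfacl -m u:www-data:rwx /srv/data
----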
882
883
Backup of Container Mount Points
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
886
887 To include a mount point in backups, enable the `backup` option for it in the
888 container configuration. For an existing mount point `mp0`
889
890 ----
891 mp0: guests:subvol-100-disk-1,mp=/root/files,size=8G
892 ----
893
894 add `backup=1` to enable it.
895
896 ----
897 mp0: guests:subvol-100-disk-1,mp=/root/files,size=8G,backup=1
898 ----
899
900 NOTE: When creating a new mount point in the GUI, this option is enabled by
901 default.
902
903 To disable backups for a mount point, add `backup=0` in the way described
904 above, or uncheck the *Backup* checkbox on the GUI.
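
The same can also be done on the command line by re-setting the mount point
with the flag appended, as a sketch (volume taken from the configuration
above):

----
# pct set 100 -mp0 guests:subvol-100-disk-1,mp=/root/files,size=8G,backup=0
----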
905
Replication of Container Mount Points
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
908
909 By default, additional mount points are replicated when the Root Disk is
910 replicated. If you want the {pve} storage replication mechanism to skip a mount
911 point, you can set the *Skip replication* option for that mount point.
As of {pve} 5.0, replication requires a storage of type `zfspool`. Adding a
mount point to a different type of storage when the container has replication
configured requires *Skip replication* to be enabled for that mount point.
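
In the configuration file, skipping replication corresponds to the
`replicate=0` mount point option. A sketch with illustrative storage and
volume names:

----
mp0: local-lvm:vm-100-disk-1,mp=/data,replicate=0
----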
915
916
917 Backup and Restore
918 ------------------
919
920
921 Container Backup
922 ~~~~~~~~~~~~~~~~
923
924 It is possible to use the `vzdump` tool for container backup. Please refer to
925 the `vzdump` manual page for details.
926
927
928 Restoring Container Backups
929 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
930
931 Restoring container backups made with `vzdump` is possible using the `pct
932 restore` command. By default, `pct restore` will attempt to restore as much of
933 the backed up container configuration as possible. It is possible to override
934 the backed up configuration by manually setting container options on the
935 command line (see the `pct` manual page for details).
936
937 NOTE: `pvesm extractconfig` can be used to view the backed up configuration
938 contained in a vzdump archive.
939
940 There are two basic restore modes, only differing by their handling of mount
941 points:
942
943
944 ``Simple'' Restore Mode
945 ^^^^^^^^^^^^^^^^^^^^^^^
946
947 If neither the `rootfs` parameter nor any of the optional `mpX` parameters are
948 explicitly set, the mount point configuration from the backed up configuration
949 file is restored using the following steps:
950
951 . Extract mount points and their options from backup
952 . Create volumes for storage backed mount points on the storage provided with
953 the `storage` parameter (default: `local`).
954 . Extract files from backup archive
955 . Add bind and device mount points to restored configuration (limited to root
956 user)
957
958 NOTE: Since bind and device mount points are never backed up, no files are
959 restored in the last step, but only the configuration options. The assumption
960 is that such mount points are either backed up with another mechanism (e.g.,
961 NFS space that is bind mounted into many containers), or not intended to be
962 backed up at all.
963
964 This simple mode is also used by the container restore operations in the web
965 interface.
966
967
968 ``Advanced'' Restore Mode
969 ^^^^^^^^^^^^^^^^^^^^^^^^^
970
971 By setting the `rootfs` parameter (and optionally, any combination of `mpX`
972 parameters), the `pct restore` command is automatically switched into an
973 advanced mode. This advanced mode completely ignores the `rootfs` and `mpX`
974 configuration options contained in the backup archive, and instead only uses
975 the options explicitly provided as parameters.
976
977 This mode allows flexible configuration of mount point settings at restore
978 time, for example:
979
980 * Set target storages, volume sizes and other options for each mount point
981 individually
982 * Redistribute backed up files according to new mount point scheme
983 * Restore to device and/or bind mount points (limited to root user)
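
A sketch of such an advanced restore, placing the root file system and one
mount point on explicitly chosen storages (archive name, storages and sizes
are placeholders):

----
# pct restore 601 local:backup/<archive>.tar.zst -rootfs local-lvm:8 -mp0 local-zfs:16,mp=/srv/data
----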
984
985
986 Managing Containers with `pct`
987 ------------------------------
988
989 The ``Proxmox Container Toolkit'' (`pct`) is the command line tool to manage
990 {pve} containers. It enables you to create or destroy containers, as well as
991 control the container execution (start, stop, reboot, migrate, etc.). It can be
992 used to set parameters in the config file of a container, for example the
993 network configuration or memory limits.
994
995 CLI Usage Examples
996 ~~~~~~~~~~~~~~~~~~
997
998 Create a container based on a Debian template (provided you have already
999 downloaded the template via the web interface)
1000
1001 ----
1002 # pct create 100 /var/lib/vz/template/cache/debian-10.0-standard_10.0-1_amd64.tar.gz
1003 ----
1004
1005 Start container 100
1006
1007 ----
1008 # pct start 100
1009 ----
1010
1011 Start a login session via getty
1012
1013 ----
1014 # pct console 100
1015 ----
1016
1017 Enter the LXC namespace and run a shell as root user
1018
1019 ----
1020 # pct enter 100
1021 ----
1022
1023 Display the configuration
1024
1025 ----
1026 # pct config 100
1027 ----
1028
1029 Add a network interface called `eth0`, bridged to the host bridge `vmbr0`, set
1030 the address and gateway, while it's running
1031
1032 ----
1033 # pct set 100 -net0 name=eth0,bridge=vmbr0,ip=192.168.15.147/24,gw=192.168.15.1
1034 ----
1035
1036 Reduce the memory of the container to 512MB
1037
1038 ----
1039 # pct set 100 -memory 512
1040 ----
1041
Destroying a container always removes it from Access Control Lists and always
removes the firewall configuration of the container. You have to activate
'--purge' if you want to additionally remove the container from replication
jobs, backup jobs and HA resource configurations.
1046
1047 ----
1048 # pct destroy 100 --purge
1049 ----
1050
1051
1052
1053 Obtaining Debugging Logs
1054 ~~~~~~~~~~~~~~~~~~~~~~~~
1055
1056 In case `pct start` is unable to start a specific container, it might be
1057 helpful to collect debugging output by passing the `--debug` flag (replace `CTID` with
1058 the container's CTID):
1059
1060 ----
1061 # pct start CTID --debug
1062 ----
1063
1064 Alternatively, you can use the following `lxc-start` command, which will save
1065 the debug log to the file specified by the `-o` output option:
1066
1067 ----
1068 # lxc-start -n CTID -F -l DEBUG -o /tmp/lxc-CTID.log
1069 ----
1070
This command will attempt to start the container in foreground mode. To stop
the container, run `pct shutdown CTID` or `pct stop CTID` in a second terminal.
1073
1074 The collected debug log is written to `/tmp/lxc-CTID.log`.
1075
1076 NOTE: If you have changed the container's configuration since the last start
1077 attempt with `pct start`, you need to run `pct start` at least once to also
1078 update the configuration used by `lxc-start`.
1079
1080 [[pct_migration]]
1081 Migration
1082 ---------
1083
1084 If you have a cluster, you can migrate your Containers with
1085
1086 ----
1087 # pct migrate <ctid> <target>
1088 ----
1089
1090 This works as long as your Container is offline. If it has local volumes or
1091 mount points defined, the migration will copy the content over the network to
1092 the target host if the same storage is defined there.
1093
Running containers cannot be live-migrated due to technical limitations. You
can do a restart migration, which shuts down, moves and then starts a container
again on the target node. As containers are very lightweight, this normally
results in a downtime of only a few hundred milliseconds.
1098
1099 A restart migration can be done through the web interface or by using the
1100 `--restart` flag with the `pct migrate` command.
1101
1102 A restart migration will shut down the Container and kill it after the
1103 specified timeout (the default is 180 seconds). Then it will migrate the
1104 Container like an offline migration and when finished, it starts the Container
1105 on the target node.
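
A sketch of a restart migration from the current node to a node named
`targetnode` (node name and timeout are illustrative; see `pct help migrate`
for the available options):

----
# pct migrate 100 targetnode --restart --timeout 120
----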
1106
1107 [[pct_configuration]]
1108 Configuration
1109 -------------
1110
1111 The `/etc/pve/lxc/<CTID>.conf` file stores container configuration, where
1112 `<CTID>` is the numeric ID of the given container. Like all other files stored
1113 inside `/etc/pve/`, they get automatically replicated to all other cluster
1114 nodes.
1115
1116 NOTE: CTIDs < 100 are reserved for internal purposes, and CTIDs need to be
1117 unique cluster wide.
1118
1119 .Example Container Configuration
1120 ----
1121 ostype: debian
1122 arch: amd64
1123 hostname: www
1124 memory: 512
1125 swap: 512
1126 net0: bridge=vmbr0,hwaddr=66:64:66:64:64:36,ip=dhcp,name=eth0,type=veth
1127 rootfs: local:107/vm-107-disk-1.raw,size=7G
1128 ----
1129
1130 The configuration files are simple text files. You can edit them using a normal
1131 text editor, for example, `vi` or `nano`.
1132 This is sometimes useful to do small corrections, but keep in mind that you
1133 need to restart the container to apply such changes.
1134
1135 For that reason, it is usually better to use the `pct` command to generate and
1136 modify those files, or do the whole thing using the GUI.
1137 Our toolkit is smart enough to instantaneously apply most changes to running
1138 containers. This feature is called ``hot plug'', and there is no need to restart
1139 the container in that case.
1140
In cases where a change cannot be hot-plugged, it will be registered as a
pending change (shown in red in the GUI). Such changes will only be applied
after the container is rebooted.
1144
1145
1146 File Format
1147 ~~~~~~~~~~~
1148
1149 The container configuration file uses a simple colon separated key/value
1150 format. Each line has the following format:
1151
1152 -----
1153 # this is a comment
1154 OPTION: value
1155 -----
1156
1157 Blank lines in those files are ignored, and lines starting with a `#` character
1158 are treated as comments and are also ignored.
1159
1160 It is possible to add low-level, LXC style configuration directly, for example:
1161
1162 ----
1163 lxc.init_cmd: /sbin/my_own_init
1164 ----
1165
1166 or
1167
1168 ----
1169 lxc.init_cmd = /sbin/my_own_init
1170 ----
1171
1172 The settings are passed directly to the LXC low-level tools.
1173
1174
1175 [[pct_snapshots]]
1176 Snapshots
1177 ~~~~~~~~~
1178
1179 When you create a snapshot, `pct` stores the configuration at snapshot time
1180 into a separate snapshot section within the same configuration file. For
1181 example, after creating a snapshot called ``testsnapshot'', your configuration
1182 file will look like this:
1183
1184 .Container configuration with snapshot
1185 ----
1186 memory: 512
1187 swap: 512
parent: testsnapshot
...

[testsnapshot]
1192 memory: 512
1193 swap: 512
1194 snaptime: 1457170803
1195 ...
1196 ----
1197
1198 There are a few snapshot related properties like `parent` and `snaptime`. The
1199 `parent` property is used to store the parent/child relationship between
1200 snapshots. `snaptime` is the snapshot creation time stamp (Unix epoch).
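
Snapshots are usually managed through `pct` rather than by editing the file
directly. A short sketch of the typical life cycle (CTID and snapshot name are
examples):

----
# pct snapshot 100 testsnapshot
# pct listsnapshot 100
# pct rollback 100 testsnapshot
# pct delsnapshot 100 testsnapshot
----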
1201
1202
1203 [[pct_options]]
1204 Options
1205 ~~~~~~~
1206
1207 include::pct.conf.5-opts.adoc[]
1208
1209
1210 Locks
1211 -----
1212
1213 Container migrations, snapshots and backups (`vzdump`) set a lock to prevent
1214 incompatible concurrent actions on the affected container. Sometimes you need
1215 to remove such a lock manually (e.g., after a power failure).
1216
1217 ----
1218 # pct unlock <CTID>
1219 ----
1220
1221 CAUTION: Only do this if you are sure the action which set the lock is no
1222 longer running.
1223
1224
1225 ifdef::manvolnum[]
1226
1227 Files
1228 ------
1229
1230 `/etc/pve/lxc/<CTID>.conf`::
1231
1232 Configuration file for the container '<CTID>'.
1233
1234
1235 include::pve-copyright.adoc[]
1236 endif::manvolnum[]