X-Git-Url: https://git.proxmox.com/?a=blobdiff_plain;f=pct.adoc;h=b585b3069fe6bd0300f8ea9539199dbd0cb91382;hb=69ab602f6a2fe5b328c89393fbad8ea7b34b97b7;hp=eb911e0416140937eb98fbfaf7fdff52073d37fc;hpb=53e3cd6f30f6f1c59adce8b131a9ad0232624359;p=pve-docs.git diff --git a/pct.adoc b/pct.adoc index eb911e0..b585b30 100644 --- a/pct.adoc +++ b/pct.adoc @@ -2,7 +2,6 @@ ifdef::manvolnum[] pct(1) ====== -include::attributes.txt[] :pve-toplevel: NAME @@ -23,95 +22,95 @@ endif::manvolnum[] ifndef::manvolnum[] Proxmox Container Toolkit ========================= -include::attributes.txt[] :pve-toplevel: endif::manvolnum[] ifdef::wiki[] :title: Linux Container endif::wiki[] -Containers are a lightweight alternative to fully virtualized -VMs. Instead of emulating a complete Operating System (OS), containers -simply use the OS of the host they run on. This implies that all -containers use the same kernel, and that they can access resources -from the host directly. - -This is great because containers do not waste CPU power nor memory due -to kernel emulation. Container run-time costs are close to zero and -usually negligible. But there are also some drawbacks you need to -consider: - -* You can only run Linux based OS inside containers, i.e. it is not - possible to run FreeBSD or MS Windows inside. - -* For security reasons, access to host resources needs to be - restricted. This is done with AppArmor, SecComp filters and other - kernel features. Be prepared that some syscalls are not allowed - inside containers. - -{pve} uses https://linuxcontainers.org/[LXC] as underlying container -technology. We consider LXC as low-level library, which provides -countless options. It would be too difficult to use those tools -directly. Instead, we provide a small wrapper called `pct`, the -"Proxmox Container Toolkit". - -The toolkit is tightly coupled with {pve}. That means that it is aware -of the cluster setup, and it can use the same network and storage -resources as fully virtualized VMs. 
You can even use the {pve} -firewall, or manage containers using the HA framework. - -Our primary goal is to offer an environment as one would get from a -VM, but without the additional overhead. We call this "System -Containers". - -NOTE: If you want to run micro-containers (with docker, rkt, ...), it +Containers are a lightweight alternative to fully virtualized machines (VMs). +They use the kernel of the host system that they run on, instead of emulating a +full operating system (OS). This means that containers can access resources on +the host system directly. + +The runtime costs for containers are low, usually negligible. However, there are +some drawbacks that need to be considered: + +* Only Linux distributions can be run in containers. It is not possible to run + other operating systems like, for example, FreeBSD or Microsoft Windows + inside a container. + +* For security reasons, access to host resources needs to be restricted. + Containers run in their own separate namespaces. Additionally, some syscalls + are not allowed within containers. + +{pve} uses https://linuxcontainers.org/[Linux Containers (LXC)] as underlying +container technology. The ``Proxmox Container Toolkit'' (`pct`) simplifies the +usage and management of LXC containers. + +Containers are tightly integrated with {pve}. This means that they are aware of +the cluster setup, and they can use the same network and storage resources as +virtual machines. You can also use the {pve} firewall, or manage containers +using the HA framework. + +Our primary goal is to offer an environment as one would get from a VM, but +without the additional overhead. We call this ``System Containers''. + +NOTE: If you want to run micro-containers, for example, 'Docker' or 'rkt', it is best to run them inside a VM. -Security Considerations ------------------------ +Technology Overview +------------------- -Containers use the same kernel as the host, so there is a big attack -surface for malicious users. 
You should consider this fact if you -provide containers to totally untrusted people. In general, fully -virtualized VMs provide better isolation. +* LXC (https://linuxcontainers.org/) -The good news is that LXC uses many kernel security features like -AppArmor, CGroups and PID and user namespaces, which makes containers -usage quite secure. We distinguish two types of containers: +* Integrated into {pve} graphical web user interface (GUI) +* Easy to use command line tool `pct` -Privileged Containers -~~~~~~~~~~~~~~~~~~~~~ +* Access via {pve} REST API -Security is done by dropping capabilities, using mandatory access -control (AppArmor), SecComp filters and namespaces. The LXC team -considers this kind of container as unsafe, and they will not consider -new container escape exploits to be security issues worthy of a CVE -and quick fix. So you should use this kind of containers only inside a -trusted environment, or when no untrusted task is running as root in -the container. +* 'lxcfs' to provide containerized /proc file system +* Control groups ('cgroups') for resource isolation and limitation -Unprivileged Containers -~~~~~~~~~~~~~~~~~~~~~~~ +* 'AppArmor' and 'seccomp' to improve security + +* Modern Linux kernels + +* Image based deployment (templates) + +* Uses {pve} xref:chapter_storage[storage library] -This kind of containers use a new kernel feature called user -namespaces. The root UID 0 inside the container is mapped to an -unprivileged user outside the container. This means that most security -issues (container escape, resource abuse, ...) in those containers -will affect a random unprivileged user, and so would be a generic -kernel security bug rather than an LXC issue. The LXC team thinks -unprivileged containers are safe by design. +* Container setup from host (network, DNS, storage, etc.) +Security Considerations +----------------------- + +Containers use the kernel of the host system. This creates a big attack surface +for malicious users. 
This should be considered if containers are provided to +untrustworthy people. In general, full virtual machines provide better +isolation. + +However, LXC uses many security features like AppArmor, CGroups and kernel +namespaces to reduce the attack surface. + +AppArmor profiles are used to restrict access to possibly dangerous actions. +Some system calls, i.e. `mount`, are prohibited from execution. + +To trace AppArmor activity, use: + +---- +# dmesg | grep apparmor +---- + Guest Operating System Configuration ------------------------------------ -We normally try to detect the operating system type inside the -container, and then modify some files inside the container to make -them work as expected. Here is a short list of things we do at -container startup: +{pve} tries to detect the Linux distribution in the container, and modifies +some files. Here is a short list of things done at container startup: set /etc/hostname:: to set the container name @@ -137,20 +136,20 @@ Changes made by {PVE} are enclosed by comment markers: # --- END PVE --- ---- -Those markers will be inserted at a reasonable location in the -file. If such a section already exists, it will be updated in place -and will not be moved. +Those markers will be inserted at a reasonable location in the file. If such a +section already exists, it will be updated in place and will not be moved. -Modification of a file can be prevented by adding a `.pve-ignore.` -file for it. For instance, if the file `/etc/.pve-ignore.hosts` -exists then the `/etc/hosts` file will not be touched. This can be a -simple empty file creatd via: +Modification of a file can be prevented by adding a `.pve-ignore.` file for it. +For instance, if the file `/etc/.pve-ignore.hosts` exists then the `/etc/hosts` +file will not be touched. 
This can be a simple empty file created via: - # touch /etc/.pve-ignore.hosts +---- +# touch /etc/.pve-ignore.hosts +---- Most modifications are OS dependent, so they differ between different -distributions and versions. You can completely disable modifications -by manually setting the `ostype` to `unmanaged`. +distributions and versions. You can completely disable modifications by +manually setting the `ostype` to `unmanaged`. OS type detection is done by testing for certain files inside the container: @@ -173,213 +172,334 @@ NOTE: Container start fails if the configured `ostype` differs from the auto detected type. -[[pct_configuration]] -Configuration -------------- +[[pct_container_images]] +Container Images +---------------- -The `/etc/pve/lxc/.conf` file stores container configuration, -where `` is the numeric ID of the given container. Like all -other files stored inside `/etc/pve/`, they get automatically -replicated to all other cluster nodes. +Container images, sometimes also referred to as ``templates'' or +``appliances'', are `tar` archives which contain everything to run a container. +`pct` uses them to create a new container, for example: -NOTE: CTIDs < 100 are reserved for internal purposes, and CTIDs need to be -unique cluster wide. +---- +# pct create 999 local:vztmpl/debian-10.0-standard_10.0-1_amd64.tar.gz +---- + +{pve} itself provides a variety of basic templates for the most common Linux +distributions. They can be downloaded using the GUI or the `pveam` (short for +{pve} Appliance Manager) command line utility. +Additionally, https://www.turnkeylinux.org/[TurnKey Linux] container templates +are also available to download. + +The list of available templates is updated daily via cron. 
To trigger it +manually: -.Example Container Configuration ---- -ostype: debian -arch: amd64 -hostname: www -memory: 512 -swap: 512 -net0: bridge=vmbr0,hwaddr=66:64:66:64:64:36,ip=dhcp,name=eth0,type=veth -rootfs: local:107/vm-107-disk-1.raw,size=7G +# pveam update ---- -Those configuration files are simple text files, and you can edit them -using a normal text editor (`vi`, `nano`, ...). This is sometimes -useful to do small corrections, but keep in mind that you need to -restart the container to apply such changes. +To view the list of available images run: -For that reason, it is usually better to use the `pct` command to -generate and modify those files, or do the whole thing using the GUI. -Our toolkit is smart enough to instantaneously apply most changes to -running containers. This feature is called "hot plug", and there is no -need to restart the container in that case. +---- +# pveam available +---- +You can restrict this large list by specifying the `section` you are +interested in, for example basic `system` images: -File Format +.List available system images +---- +# pveam available --section system +system alpine-3.10-default_20190626_amd64.tar.xz +system alpine-3.9-default_20190224_amd64.tar.xz +system archlinux-base_20190924-1_amd64.tar.gz +system centos-6-default_20191016_amd64.tar.xz +system centos-7-default_20190926_amd64.tar.xz +system centos-8-default_20191016_amd64.tar.xz +system debian-10.0-standard_10.0-1_amd64.tar.gz +system debian-8.0-standard_8.11-1_amd64.tar.gz +system debian-9.0-standard_9.7-1_amd64.tar.gz +system fedora-30-default_20190718_amd64.tar.xz +system fedora-31-default_20191029_amd64.tar.xz +system gentoo-current-default_20190718_amd64.tar.xz +system opensuse-15.0-default_20180907_amd64.tar.xz +system opensuse-15.1-default_20190719_amd64.tar.xz +system ubuntu-16.04-standard_16.04.5-1_amd64.tar.gz +system ubuntu-18.04-standard_18.04.1-1_amd64.tar.gz +system ubuntu-19.04-standard_19.04-1_amd64.tar.gz +system 
ubuntu-19.10-standard_19.10-1_amd64.tar.gz +---- + +Before you can use such a template, you need to download them into one of your +storages. You can simply use storage `local` for that purpose. For clustered +installations, it is preferred to use a shared storage so that all nodes can +access those images. + +---- +# pveam download local debian-10.0-standard_10.0-1_amd64.tar.gz +---- + +You are now ready to create containers using that image, and you can list all +downloaded images on storage `local` with: + +---- +# pveam list local +local:vztmpl/debian-10.0-standard_10.0-1_amd64.tar.gz 219.95MB +---- + +The above command shows you the full {pve} volume identifiers. They include the +storage name, and most other {pve} commands can use them. For example you can +delete that image later with: + +---- +# pveam remove local:vztmpl/debian-10.0-standard_10.0-1_amd64.tar.gz +---- + +[[pct_container_storage]] +Container Storage +----------------- + +The {pve} LXC container storage model is more flexible than traditional +container storage models. A container can have multiple mount points. This +makes it possible to use the best suited storage for each application. + +For example the root file system of the container can be on slow and cheap +storage while the database can be on fast and distributed storage via a second +mount point. See section <> for further +details. + +Any storage type supported by the {pve} storage library can be used. This means +that containers can be stored on local (for example `lvm`, `zfs` or directory), +shared external (like `iSCSI`, `NFS`) or even distributed storage systems like +Ceph. Advanced storage features like snapshots or clones can be used if the +underlying storage supports them. The `vzdump` backup tool can use snapshots to +provide consistent container backups. + +Furthermore, local devices or local directories can be mounted directly using +'bind mounts'. 
This gives access to local resources inside a container with +practically zero overhead. Bind mounts can be used as an easy way to share data +between containers. + + +FUSE Mounts ~~~~~~~~~~~ -Container configuration files use a simple colon separated key/value -format. Each line has the following format: +WARNING: Because of existing issues in the Linux kernel's freezer subsystem the +usage of FUSE mounts inside a container is strongly advised against, as +containers need to be frozen for suspend or snapshot mode backups. ------ -# this is a comment -OPTION: value ------ +If FUSE mounts cannot be replaced by other mounting mechanisms or storage +technologies, it is possible to establish the FUSE mount on the Proxmox host +and use a bind mount point to make it accessible inside the container. -Blank lines in those files are ignored, and lines starting with a `#` -character are treated as comments and are also ignored. -It is possible to add low-level, LXC style configuration directly, for -example: +Using Quotas Inside Containers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - lxc.init_cmd: /sbin/my_own_init +Quotas allow to set limits inside a container for the amount of disk space that +each user can use. -or +NOTE: This only works on ext4 image based storage types and currently only +works with privileged containers. + +Activating the `quota` option causes the following mount options to be used for +a mount point: +`usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0` + +This allows quotas to be used like on any other system. You can initialize the +`/aquota.user` and `/aquota.group` files by running: - lxc.init_cmd = /sbin/my_own_init +---- +# quotacheck -cmug / +# quotaon / +---- -Those settings are directly passed to the LXC low-level tools. +Then edit the quotas using the `edquota` command. Refer to the documentation of +the distribution running inside the container for details. 
+NOTE: You need to run the above commands for every mount point by passing the +mount point's path instead of just `/`. -[[pct_snapshots]] -Snapshots -~~~~~~~~~ -When you create a snapshot, `pct` stores the configuration at snapshot -time into a separate snapshot section within the same configuration -file. For example, after creating a snapshot called ``testsnapshot'', -your configuration file will look like this: +Using ACLs Inside Containers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The standard Posix **A**ccess **C**ontrol **L**ists are also available inside +containers. ACLs allow you to set more detailed file ownership than the +traditional user/group/others model. + + +Backup of Container mount points +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To include a mount point in backups, enable the `backup` option for it in the +container configuration. For an existing mount point `mp0` -.Container configuration with snapshot ---- -memory: 512 -swap: 512 -parent: testsnaphot -... +mp0: guests:subvol-100-disk-1,mp=/root/files,size=8G +---- + +add `backup=1` to enable it. -[testsnaphot] -memory: 512 -swap: 512 -snaptime: 1457170803 -... +---- +mp0: guests:subvol-100-disk-1,mp=/root/files,size=8G,backup=1 ---- -There are a few snapshot related properties like `parent` and -`snaptime`. The `parent` property is used to store the parent/child -relationship between snapshots. `snaptime` is the snapshot creation -time stamp (Unix epoch). +NOTE: When creating a new mount point in the GUI, this option is enabled by +default. +To disable backups for a mount point, add `backup=0` in the way described +above, or uncheck the *Backup* checkbox on the GUI. -[[pct_options]] -Options -~~~~~~~ +Replication of Containers mount points +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -include::pct.conf.5-opts.adoc[] +By default, additional mount points are replicated when the Root Disk is +replicated. 
If you want the {pve} storage replication mechanism to skip a mount +point, you can set the *Skip replication* option for that mount point. +As of {pve} 5.0, replication requires a storage of type `zfspool`. Adding a +mount point to a different type of storage when the container has replication +configured requires *Skip replication* to be enabled for that mount point. +[[pct_settings]] +Container Settings +------------------ -[[pct_container_images]] -Container Images ----------------- +[[pct_general]] +General Settings +~~~~~~~~~~~~~~~~ -Container images, sometimes also referred to as ``templates'' or -``appliances'', are `tar` archives which contain everything to run a -container. You can think of it as a tidy container backup. Like most -modern container toolkits, `pct` uses those images when you create a -new container, for example: +[thumbnail="screenshot/gui-create-ct-general.png"] - pct create 999 local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz +General settings of a container include -{pve} itself ships a set of basic templates for most common -operating systems, and you can download them using the `pveam` (short -for {pve} Appliance Manager) command line utility. You can also -download https://www.turnkeylinux.org/[TurnKey Linux] containers using -that tool (or the graphical user interface). +* the *Node*: the physical server on which the container will run +* the *CT ID*: a unique number in this {pve} installation used to identify your + container +* *Hostname*: the hostname of the container +* *Resource Pool*: a logical group of containers and VMs +* *Password*: the root password of the container +* *SSH Public Key*: a public key for connecting to the root account over SSH +* *Unprivileged container*: this option allows you to choose at creation time + if you want to create a privileged or unprivileged container. -Our image repositories contain a list of available images, and there -is a cron job run each day to download that list. 
You can trigger that -update manually with: +Unprivileged Containers +^^^^^^^^^^^^^^^^^^^^^^^ - pveam update +Unprivileged containers use a new kernel feature called user namespaces. +The root UID 0 inside the container is mapped to an unprivileged user outside +the container. This means that most security issues (container escape, resource +abuse, etc.) in these containers will affect a random unprivileged user, and +would be a generic kernel security bug rather than an LXC issue. The LXC team +thinks unprivileged containers are safe by design. -After that you can view the list of available images using: +This is the default option when creating a new container. - pveam available +NOTE: If the container uses systemd as an init system, please be aware the +systemd version running inside the container should be equal to or greater than +220. -You can restrict this large list by specifying the `section` you are -interested in, for example basic `system` images: -.List available system images +Privileged Containers +^^^^^^^^^^^^^^^^^^^^^ + +Security in containers is achieved by using mandatory access control +('AppArmor'), 'seccomp' filters and namespaces. The LXC team considers this +kind of container as unsafe, and they will not consider new container escape +exploits to be security issues worthy of a CVE and quick fix. That's why +privileged containers should only be used in trusted environments. + +Although it is not recommended, AppArmor can be disabled for a container. This +brings security risks with it. Some syscalls can lead to privilege escalation +when executed within a container if the system is misconfigured or if a LXC or +Linux Kernel vulnerability exists. 
+ +To disable AppArmor for a container, add the following line to the container +configuration file located at `/etc/pve/lxc/CTID.conf`: + ---- -# pveam available --section system -system archlinux-base_2015-24-29-1_x86_64.tar.gz -system centos-7-default_20160205_amd64.tar.xz -system debian-6.0-standard_6.0-7_amd64.tar.gz -system debian-7.0-standard_7.0-3_amd64.tar.gz -system debian-8.0-standard_8.0-1_amd64.tar.gz -system ubuntu-12.04-standard_12.04-1_amd64.tar.gz -system ubuntu-14.04-standard_14.04-1_amd64.tar.gz -system ubuntu-15.04-standard_15.04-1_amd64.tar.gz -system ubuntu-15.10-standard_15.10-1_amd64.tar.gz +lxc.apparmor_profile = unconfined ---- -Before you can use such a template, you need to download them into one -of your storages. You can simply use storage `local` for that -purpose. For clustered installations, it is preferred to use a shared -storage so that all nodes can access those images. +WARNING: Please note that this is not recommended for production use. + + +[[pct_cpu]] +CPU +~~~ - pveam download local debian-8.0-standard_8.0-1_amd64.tar.gz +[thumbnail="screenshot/gui-create-ct-cpu.png"] -You are now ready to create containers using that image, and you can -list all downloaded images on storage `local` with: +You can restrict the number of visible CPUs inside the container using the +`cores` option. This is implemented using the Linux 'cpuset' cgroup +(**c**ontrol *group*). +A special task inside `pvestatd` tries to distribute running containers among +available CPUs periodically. +To view the assigned CPUs run the following command: ---- -# pveam list local -local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz 190.20MB +# pct cpusets + --------------------- + 102: 6 7 + 105: 2 3 4 5 + 108: 0 1 + --------------------- ---- -The above command shows you the full {pve} volume identifiers. They include -the storage name, and most other {pve} commands can use them. 
For -example you can delete that image later with: +Containers use the host kernel directly. All tasks inside a container are +handled by the host CPU scheduler. {pve} uses the Linux 'CFS' (**C**ompletely +**F**air **S**cheduler) scheduler by default, which has additional bandwidth +control options. - pveam remove local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz +[horizontal] +`cpulimit`: :: You can use this option to further limit assigned CPU time. +Please note that this is a floating point number, so it is perfectly valid to +assign two cores to a container, but restrict overall CPU consumption to half a +core. ++ +---- +cores: 2 +cpulimit: 0.5 +---- -[[pct_container_storage]] -Container Storage ------------------ +`cpuunits`: :: This is a relative weight passed to the kernel scheduler. The +larger the number is, the more CPU time this container gets. Number is relative +to the weights of all the other running containers. The default is 1024. You +can use this setting to prioritize some containers. -Traditional containers use a very simple storage model, only allowing -a single mount point, the root file system. This was further -restricted to specific file system types like `ext4` and `nfs`. -Additional mounts are often done by user provided scripts. This turned -out to be complex and error prone, so we try to avoid that now. - -Our new LXC based container model is more flexible regarding -storage. First, you can have more than a single mount point. This -allows you to choose a suitable storage for each application. For -example, you can use a relatively slow (and thus cheap) storage for -the container root file system. Then you can use a second mount point -to mount a very fast, distributed storage for your database -application. - -The second big improvement is that you can use any storage type -supported by the {pve} storage library. 
That means that you can store -your containers on local `lvmthin` or `zfs`, shared `iSCSI` storage, -or even on distributed storage systems like `ceph`. It also enables us -to use advanced storage features like snapshots and clones. `vzdump` -can also use the snapshot feature to provide consistent container -backups. - -Last but not least, you can also mount local devices directly, or -mount local directories using bind mounts. That way you can access -local storage inside containers with zero overhead. Such bind mounts -also provide an easy way to share data between different containers. +[[pct_memory]] +Memory +~~~~~~ +[thumbnail="screenshot/gui-create-ct-memory.png"] + +Container memory is controlled using the cgroup memory controller. + +[horizontal] + +`memory`: :: Limit overall memory usage. This corresponds to the +`memory.limit_in_bytes` cgroup setting. + +`swap`: :: Allows the container to use additional swap memory from the host +swap space. This corresponds to the `memory.memsw.limit_in_bytes` cgroup +setting, which is set to the sum of both values (`memory + swap`). + + +[[pct_mount_points]] Mount Points ~~~~~~~~~~~~ -The root mount point is configured with the `rootfs` property, and you can -configure up to 10 additional mount points. The corresponding options -are called `mp0` to `mp9`, and they can contain the following setting: +[thumbnail="screenshot/gui-create-ct-root-disk.png"] + +The root mount point is configured with the `rootfs` property. You can +configure up to 256 additional mount points. The corresponding options are +called `mp0` to `mp255`. They can contain the following settings: include::pct-mountpoint-opts.adoc[] -Currently there are basically three types of mount points: storage backed -mount points, bind mounts and device mounts. +Currently there are three types of mount points: storage backed mount points, +bind mounts, and device mounts. 
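As a rough illustration of the three mount point types, the following sketch guesses the type from the volume part of an `mpX` or `rootfs` value. It rests on an assumption that is not stated in the text above: a path under `/dev/` is treated as a device mount, any other absolute path as a bind mount, and everything else (for example `STORAGE_ID:volume`) as a storage backed mount point. The function name `classify_mountpoint` and the sample values are hypothetical.

```shell
#!/bin/sh
# Sketch only: guess the mount point type from the volume part of an
# mpX/rootfs value (the text before the first comma).
# Assumption: /dev/* = device mount, other absolute path = bind mount,
# anything else (e.g. STORAGE_ID:volume) = storage backed mount point.
classify_mountpoint() {
    volume=${1%%,*}    # strip options such as ",mp=/path,size=8G"
    case $volume in
        /dev/*) echo device ;;
        /*)     echo bind ;;
        *)      echo volume ;;
    esac
}

# Hypothetical sample values, one per type.
classify_mountpoint "guests:subvol-100-disk-1,mp=/root/files,size=8G"
classify_mountpoint "/mnt/bindmounts/shared,mp=/shared"
classify_mountpoint "/dev/sdb1,mp=/mnt/data"
```

The order of the `case` patterns matters: `/dev/*` has to be tested before the more general `/*`, otherwise device paths would be reported as bind mounts.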
.Typical container `rootfs` configuration ---- @@ -400,6 +520,18 @@ in three different flavors: - Directories: passing `size=0` triggers a special case where instead of a raw image a directory is created. +NOTE: The special option syntax `STORAGE_ID:SIZE_IN_GB` for storage backed +mount point volumes will automatically allocate a volume of the specified size +on the specified storage. For example, calling + +---- +pct set 100 -mp0 thin1:10,mp=/path/in/container +---- + +will allocate a 10GB volume on the storage `thin1`, replace the volume ID +placeholder `10` with the allocated volume ID, and set up the mount point in the +container at `/path/in/container`. + Bind Mount Points ^^^^^^^^^^^^^^^^^ @@ -418,11 +550,10 @@ user mapping and cannot use ACLs. NOTE: The contents of bind mount points are not backed up when using `vzdump`. -WARNING: For security reasons, bind mounts should only be established -using source directories especially reserved for this purpose, e.g., a -directory hierarchy under `/mnt/bindmounts`. Never bind mount system -directories like `/`, `/var` or `/etc` into a container - this poses a -great security risk. +WARNING: For security reasons, bind mounts should only be established using +source directories especially reserved for this purpose, e.g., a directory +hierarchy under `/mnt/bindmounts`. Never bind mount system directories like +`/`, `/var` or `/etc` into a container - this poses a great security risk. NOTE: The bind mount source path must not contain any symlinks. @@ -444,67 +575,72 @@ NOTE: Device mount points should only be used under special circumstances. In most cases a storage backed mount point offers the same performance and a lot more features. -NOTE: The contents of device mount points are not backed up when using `vzdump`. +NOTE: The contents of device mount points are not backed up when using +`vzdump`. 
-FUSE Mounts -~~~~~~~~~~~ - -WARNING: Because of existing issues in the Linux kernel's freezer -subsystem the usage of FUSE mounts inside a container is strongly -advised against, as containers need to be frozen for suspend or -snapshot mode backups. +[[pct_container_network]] +Network +~~~~~~~ -If FUSE mounts cannot be replaced by other mounting mechanisms or storage -technologies, it is possible to establish the FUSE mount on the Proxmox host -and use a bind mount point to make it accessible inside the container. +[thumbnail="screenshot/gui-create-ct-network.png"] +You can configure up to 10 network interfaces for a single container. +The corresponding options are called `net0` to `net9`, and they can contain the +following setting: -Using Quotas Inside Containers -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +include::pct-network-opts.adoc[] -Quotas allow to set limits inside a container for the amount of disk -space that each user can use. This only works on ext4 image based -storage types and currently does not work with unprivileged -containers. -Activating the `quota` option causes the following mount options to be -used for a mount point: -`usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0` +[[pct_startup_and_shutdown]] +Automatic Start and Shutdown of Containers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This allows quotas to be used like you would on any other system. You -can initialize the `/aquota.user` and `/aquota.group` files by running +To automatically start a container when the host system boots, select the +option 'Start at boot' in the 'Options' panel of the container in the web +interface or run the following command: ---- -quotacheck -cmug / -quotaon / +# pct set CTID -onboot 1 ---- -and edit the quotas via the `edquota` command. Refer to the documentation -of the distribution running inside the container for details. - -NOTE: You need to run the above commands for every mount point by passing -the mount point's path instead of just `/`. 
- - -Using ACLs Inside Containers -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The standard Posix **A**ccess **C**ontrol **L**ists are also available inside containers. -ACLs allow you to set more detailed file ownership than the traditional user/ -group/others model. - - -[[pct_container_network]] -Container Network ------------------ +.Start and Shutdown Order +// use the screenshot from qemu - its the same +[thumbnail="screenshot/gui-qemu-edit-start-order.png"] + +If you want to fine tune the boot order of your containers, you can use the +following parameters: + +* *Start/Shutdown order*: Defines the start order priority. For example, set it + to 1 if you want the CT to be the first to be started. (We use the reverse + startup order for shutdown, so a container with a start order of 1 would be + the last to be shut down) +* *Startup delay*: Defines the interval between this container start and + subsequent containers starts. For example, set it to 240 if you want to wait + 240 seconds before starting other containers. +* *Shutdown timeout*: Defines the duration in seconds {pve} should wait + for the container to be offline after issuing a shutdown command. + By default this value is set to 60, which means that {pve} will issue a + shutdown request, wait 60s for the machine to be offline, and if after 60s + the machine is still online will notify that the shutdown action failed. + +Please note that containers without a Start/Shutdown order parameter will +always start after those where the parameter is set, and this parameter only +makes sense between the machines running locally on a host, and not +cluster-wide. + +Hookscripts +~~~~~~~~~~~ -You can configure up to 10 network interfaces for a single -container. The corresponding options are called `net0` to `net9`, and -they can contain the following setting: +You can add a hook script to CTs with the config property `hookscript`. 
-include::pct-network-opts.adoc[] +---- +# pct set 100 -hookscript local:snippets/hookscript.pl +---- +It will be called during various phases of the guest's lifetime. For an example +and documentation see the example script under +`/usr/share/pve-docs/examples/guest-example-hookscript.pl`. Backup and Restore ------------------ @@ -513,18 +649,18 @@ Backup and Restore Container Backup ~~~~~~~~~~~~~~~~ -It is possible to use the `vzdump` tool for container backup. Please -refer to the `vzdump` manual page for details. +It is possible to use the `vzdump` tool for container backup. Please refer to +the `vzdump` manual page for details. Restoring Container Backups ~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Restoring container backups made with `vzdump` is possible using the -`pct restore` command. By default, `pct restore` will attempt to restore as much -of the backed up container configuration as possible. It is possible to override -the backed up configuration by manually setting container options on the command -line (see the `pct` manual page for details). +Restoring container backups made with `vzdump` is possible using the `pct +restore` command. By default, `pct restore` will attempt to restore as much of +the backed up container configuration as possible. It is possible to override +the backed up configuration by manually setting container options on the +command line (see the `pct` manual page for details). NOTE: `pvesm extractconfig` can be used to view the backed up configuration contained in a vzdump archive.
@@ -536,15 +672,16 @@ points: ``Simple'' Restore Mode ^^^^^^^^^^^^^^^^^^^^^^^ -If neither the `rootfs` parameter nor any of the optional `mpX` parameters -are explicitly set, the mount point configuration from the backed up -configuration file is restored using the following steps: +If neither the `rootfs` parameter nor any of the optional `mpX` parameters are +explicitly set, the mount point configuration from the backed up configuration +file is restored using the following steps: . Extract mount points and their options from backup . Create volumes for storage backed mount points (on storage provided with the -`storage` parameter, or default local storage if unset) + `storage` parameter, or default local storage if unset) . Extract files from backup archive -. Add bind and device mount points to restored configuration (limited to root user) +. Add bind and device mount points to restored configuration (limited to root + user) NOTE: Since bind and device mount points are never backed up, no files are restored in the last step, but only the configuration options. The assumption @@ -562,14 +699,14 @@ interface. By setting the `rootfs` parameter (and optionally, any combination of `mpX` parameters), the `pct restore` command is automatically switched into an advanced mode. This advanced mode completely ignores the `rootfs` and `mpX` -configuration options contained in the backup archive, and instead only -uses the options explicitly provided as parameters. +configuration options contained in the backup archive, and instead only uses +the options explicitly provided as parameters. 
-This mode allows flexible configuration of mount point settings at restore time, -for example: +This mode allows flexible configuration of mount point settings at restore +time, for example: * Set target storages, volume sizes and other options for each mount point -individually + individually * Redistribute backed up files according to new mount point scheme * Restore to device and/or bind mount points (limited to root user) @@ -577,44 +714,58 @@ individually Managing Containers with `pct` ------------------------------ -`pct` is the tool to manage Linux Containers on {pve}. You can create -and destroy containers, and control execution (start, stop, migrate, -...). You can use pct to set parameters in the associated config file, -like network configuration or memory limits. - +The ``Proxmox Container Toolkit'' (`pct`) is the command line tool to manage +{pve} containers. It enables you to create or destroy containers, as well as +control the container execution (start, stop, reboot, migrate, etc.). It can be +used to set parameters in the config file of a container, for example the +network configuration or memory limits. 
CLI Usage Examples ~~~~~~~~~~~~~~~~~~ -Create a container based on a Debian template (provided you have -already downloaded the template via the web interface) +Create a container based on a Debian template (provided you have already +downloaded the template via the web interface) - pct create 100 /var/lib/vz/template/cache/debian-8.0-standard_8.0-1_amd64.tar.gz +---- +# pct create 100 /var/lib/vz/template/cache/debian-10.0-standard_10.0-1_amd64.tar.gz +---- Start container 100 - pct start 100 +---- +# pct start 100 +---- Start a login session via getty - pct console 100 +---- +# pct console 100 +---- Enter the LXC namespace and run a shell as root user - pct enter 100 +---- +# pct enter 100 +---- Display the configuration - pct config 100 +---- +# pct config 100 +---- -Add a network interface called `eth0`, bridged to the host bridge `vmbr0`, -set the address and gateway, while it's running +Add a network interface called `eth0`, bridged to the host bridge `vmbr0`, set +the address and gateway, while it's running - pct set 100 -net0 name=eth0,bridge=vmbr0,ip=192.168.15.147/24,gw=192.168.15.1 +---- +# pct set 100 -net0 name=eth0,bridge=vmbr0,ip=192.168.15.147/24,gw=192.168.15.1 +---- Reduce the memory of the container to 512MB - pct set 100 -memory 512 +---- +# pct set 100 -memory 512 +---- Obtaining Debugging Logs @@ -624,9 +775,12 @@ In case `pct start` is unable to start a specific container, it might be helpful to collect debugging output by running `lxc-start` (replace `ID` with the container's ID): - lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log +---- +# lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log +---- -This command will attempt to start the container in foreground mode, to stop the container run `pct shutdown ID` or `pct stop ID` in a second terminal. +This command will attempt to start the container in foreground mode, to stop +the container run `pct shutdown ID` or `pct stop ID` in a second terminal. 
The collected debug log is written to `/tmp/lxc-ID.log`. @@ -634,78 +788,156 @@ NOTE: If you have changed the container's configuration since the last start attempt with `pct start`, you need to run `pct start` at least once to also update the configuration used by `lxc-start`. -Locks ------ +[[pct_migration]] +Migration +--------- -Container migrations, snapshots and backups (`vzdump`) set a lock to -prevent incompatible concurrent actions on the affected container. Sometimes -you need to remove such a lock manually (e.g., after a power failure). +If you have a cluster, you can migrate your Containers with - pct unlock +---- +# pct migrate +---- -CAUTION: Only do that if you are sure the action which set the lock is -no longer running. +This works as long as your Container is offline. If it has local volumes or +mount points defined, the migration will copy the content over the network to +the target host if the same storage is defined there. +If you want to migrate online Containers, the only way is to use restart +migration. This can be initiated with the -restart flag and the optional +-timeout parameter. -Container Advantages --------------------- +A restart migration will shut down the Container and kill it after the +specified timeout (the default is 180 seconds). Then it will migrate the +Container like an offline migration and when finished, it starts the Container +on the target node. -* Simple, and fully integrated into {pve}. Setup looks similar to a normal - VM setup. +[[pct_configuration]] +Configuration +------------- -** Storage (ZFS, LVM, NFS, Ceph, ...) +The `/etc/pve/lxc/.conf` file stores container configuration, where +`` is the numeric ID of the given container. Like all other files stored +inside `/etc/pve/`, they get automatically replicated to all other cluster +nodes. -** Network +NOTE: CTIDs < 100 are reserved for internal purposes, and CTIDs need to be +unique cluster wide. 
-** Authentication +.Example Container Configuration +---- +ostype: debian +arch: amd64 +hostname: www +memory: 512 +swap: 512 +net0: bridge=vmbr0,hwaddr=66:64:66:64:64:36,ip=dhcp,name=eth0,type=veth +rootfs: local:107/vm-107-disk-1.raw,size=7G +---- -** Cluster +The configuration files are simple text files. You can edit them using a normal +text editor (`vi`, `nano`, etc). +This is sometimes useful to do small corrections, but keep in mind that you +need to restart the container to apply such changes. -* Fast: minimal overhead, as fast as bare metal +For that reason, it is usually better to use the `pct` command to generate and +modify those files, or do the whole thing using the GUI. +Our toolkit is smart enough to instantaneously apply most changes to running +containers. This feature is called "hot plug", and there is no need to restart +the container in that case. -* High density (perfect for idle workloads) +In cases where a change cannot be hot plugged, it will be registered as a +pending change (shown in red color in the GUI). +They will only be applied after rebooting the container. -* REST API -* Direct hardware access +File Format +~~~~~~~~~~~ +The container configuration file uses a simple colon separated key/value +format. Each line has the following format: -Technology Overview -------------------- +----- +# this is a comment +OPTION: value +----- -* Integrated into {pve} graphical user interface (GUI) +Blank lines in those files are ignored, and lines starting with a `#` character +are treated as comments and are also ignored. -* LXC (https://linuxcontainers.org/) +It is possible to add low-level, LXC style configuration directly, for example: -* lxcfs to provide containerized /proc file system +---- +lxc.init_cmd: /sbin/my_own_init +---- -* AppArmor +or -* CRIU: for live migration (planned) +---- +lxc.init_cmd = /sbin/my_own_init +---- -* We use latest available kernels (4.4.X) +The settings are passed directly to the LXC low-level tools. 
-* Image based deployment (templates) -* Container setup from host (network, DNS, storage, ...) +[[pct_snapshots]] +Snapshots +~~~~~~~~~ +When you create a snapshot, `pct` stores the configuration at snapshot time +into a separate snapshot section within the same configuration file. For +example, after creating a snapshot called ``testsnapshot'', your configuration +file will look like this: -ifdef::manvolnum[] +.Container configuration with snapshot +---- +memory: 512 +swap: 512 +parent: testsnapshot +... -Files ------- +[testsnapshot] +memory: 512 +swap: 512 +snaptime: 1457170803 +... +---- -`/etc/pve/lxc/.conf`:: +There are a few snapshot related properties like `parent` and `snaptime`. The +`parent` property is used to store the parent/child relationship between +snapshots. `snaptime` is the snapshot creation time stamp (Unix epoch). -Configuration file for the container ''. +[[pct_options]] +Options +~~~~~~~ -include::pve-copyright.adoc[] -endif::manvolnum[] +include::pct.conf.5-opts.adoc[] +Locks +----- +Container migrations, snapshots and backups (`vzdump`) set a lock to prevent +incompatible concurrent actions on the affected container. Sometimes you need +to remove such a lock manually (e.g., after a power failure). +---- +# pct unlock +---- +CAUTION: Only do this if you are sure the action which set the lock is no +longer running. +ifdef::manvolnum[] + +Files +------ + +`/etc/pve/lxc/.conf`:: + +Configuration file for the container ''. + + +include::pve-copyright.adoc[] +endif::manvolnum[]