X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=pct.adoc;h=2cb4bbe8c4eacf0da871ef030ecce624a672415a;hp=b55ce1d2c0f828e8031f68175c44be6c8870e4a3;hb=856993e4166495537f42e0b9c3a51c966227feab;hpb=99f6ae1a7fd9ec8bb911deea84614439403bfb02 diff --git a/pct.adoc b/pct.adoc index b55ce1d..2cb4bbe 100644 --- a/pct.adoc +++ b/pct.adoc @@ -2,7 +2,6 @@ ifdef::manvolnum[] pct(1) ====== -include::attributes.txt[] :pve-toplevel: NAME @@ -23,7 +22,6 @@ endif::manvolnum[] ifndef::manvolnum[] Proxmox Container Toolkit ========================= -include::attributes.txt[] :pve-toplevel: endif::manvolnum[] ifdef::wiki[] @@ -85,7 +83,7 @@ Technology Overview * CRIU: for live migration (planned) -* Use latest available kernels (4.4.X) +* Runs on modern Linux kernels * Image based deployment (templates) @@ -104,32 +102,7 @@ virtualized VMs provide better isolation. The good news is that LXC uses many kernel security features like AppArmor, CGroups and PID and user namespaces, which makes containers -usage quite secure. We distinguish two types of containers: - - -Privileged Containers -~~~~~~~~~~~~~~~~~~~~~ - -Security is done by dropping capabilities, using mandatory access -control (AppArmor), SecComp filters and namespaces. The LXC team -considers this kind of container as unsafe, and they will not consider -new container escape exploits to be security issues worthy of a CVE -and quick fix. So you should use this kind of containers only inside a -trusted environment, or when no untrusted task is running as root in -the container. - - -Unprivileged Containers -~~~~~~~~~~~~~~~~~~~~~~~ - -This kind of containers use a new kernel feature called user -namespaces. The root UID 0 inside the container is mapped to an -unprivileged user outside the container. This means that most security -issues (container escape, resource abuse, ...) in those containers -will affect a random unprivileged user, and so would be a generic -kernel security bug rather than an LXC issue. 
The LXC team thinks -unprivileged containers are safe by design. - +usage quite secure. Guest Operating System Configuration ------------------------------------ @@ -170,7 +143,7 @@ and will not be moved. Modification of a file can be prevented by adding a `.pve-ignore.` file for it. For instance, if the file `/etc/.pve-ignore.hosts` exists then the `/etc/hosts` file will not be touched. This can be a -simple empty file creatd via: +simple empty file created via: # touch /etc/.pve-ignore.hosts @@ -282,7 +255,8 @@ allows you to choose a suitable storage for each application. For example, you can use a relatively slow (and thus cheap) storage for the container root file system. Then you can use a second mount point to mount a very fast, distributed storage for your database -application. +application. See section <> for further +details. The second big improvement is that you can use any storage type supported by the {pve} storage library. That means that you can store @@ -298,9 +272,193 @@ local storage inside containers with zero overhead. Such bind mounts also provide an easy way to share data between different containers. +FUSE Mounts +~~~~~~~~~~~ + +WARNING: Because of existing issues in the Linux kernel's freezer +subsystem, the usage of FUSE mounts inside a container is strongly +advised against, as containers need to be frozen for suspend or +snapshot mode backups. + +If FUSE mounts cannot be replaced by other mounting mechanisms or storage +technologies, it is possible to establish the FUSE mount on the Proxmox host +and use a bind mount point to make it accessible inside the container. + + +Using Quotas Inside Containers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Quotas allow you to set limits inside a container for the amount of disk +space that each user can use. This only works on ext4 image-based +storage types and currently does not work with unprivileged +containers. 
+ +Activating the `quota` option causes the following mount options to be +used for a mount point: +`usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0` + +This allows quotas to be used like you would on any other system. You +can initialize the `/aquota.user` and `/aquota.group` files by running + +---- +quotacheck -cmug / +quotaon / +---- + +and edit the quotas via the `edquota` command. Refer to the documentation +of the distribution running inside the container for details. + +NOTE: You need to run the above commands for every mount point by passing +the mount point's path instead of just `/`. + + +Using ACLs Inside Containers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The standard POSIX **A**ccess **C**ontrol **L**ists are also available inside containers. +ACLs allow you to set more detailed file ownership than the traditional user/ +group/others model. + + +Backup of Containers mount points +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +By default, additional mount points besides the Root Disk mount point are not +included in backups. You can reverse this default behavior by setting the +*Backup* option on a mount point. +// see PVE::VZDump::LXC::prepare() + +Replication of Containers mount points +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +By default, additional mount points are replicated when the Root Disk +is replicated. If you want the {pve} storage replication mechanism to skip a + mount point when starting a replication job, you can set the +*Skip replication* option on that mount point. + +As of {pve} 5.0, replication requires a storage of type `zfspool`, so adding a + mount point to a different type of storage when the container has replication + configured requires setting *Skip replication* for that mount point. 
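The storage type rule for replication can be sketched in a few lines of Python. This is only an illustration and not part of {pve}: the function name `mount_points_to_skip` and the input mapping (mount point key to storage type) are hypothetical, and the storage types in the usage example are placeholders.

```python
# Storage type that replication requires, as stated in the text above.
REPLICABLE_STORAGE_TYPE = "zfspool"


def mount_points_to_skip(mount_points: dict) -> list:
    """Return the mount point keys that need *Skip replication*.

    `mount_points` is a hypothetical mapping from a mount point key
    (for example "rootfs" or "mp0") to the type of its storage.
    """
    return sorted(
        key
        for key, storage_type in mount_points.items()
        if storage_type != REPLICABLE_STORAGE_TYPE
    )


if __name__ == "__main__":
    # Illustrative container: one mount point lives on a non-ZFS storage.
    example = {"rootfs": "zfspool", "mp0": "zfspool", "mp1": "dir"}
    print(mount_points_to_skip(example))
```

In this example only `mp1` would need the *Skip replication* option, because it is the only mount point that is not on a `zfspool` storage.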
+ + +[[pct_settings]] +Container Settings +------------------ + +[[pct_general]] +General Settings +~~~~~~~~~~~~~~~~ + +[thumbnail="screenshot/gui-create-ct-general.png"] + +General settings of a container include + +* the *Node*: the physical server on which the container will run +* the *CT ID*: a unique number in this {pve} installation used to identify your container +* *Hostname*: the hostname of the container +* *Resource Pool*: a logical group of containers and VMs +* *Password*: the root password of the container +* *SSH Public Key*: a public key for connecting to the root account over SSH +* *Unprivileged container*: this option lets you choose at creation time +whether to create a privileged or unprivileged container. + + +Privileged Containers +^^^^^^^^^^^^^^^^^^^^^ + +Security is enforced by dropping capabilities, using mandatory access +control (AppArmor), SecComp filters and namespaces. The LXC team +considers this kind of container unsafe, and they will not consider +new container escape exploits to be security issues worthy of a CVE +and quick fix. So you should use this kind of container only inside a +trusted environment, or when no untrusted task is running as root in +the container. + + +Unprivileged Containers +^^^^^^^^^^^^^^^^^^^^^^^ + +This kind of container uses a new kernel feature called user +namespaces. The root UID 0 inside the container is mapped to an +unprivileged user outside the container. This means that most security +issues (container escape, resource abuse, ...) in those containers +will affect a random unprivileged user, and so would be a generic +kernel security bug rather than an LXC issue. The LXC team thinks +unprivileged containers are safe by design. + +NOTE: If the container uses systemd as an init system, please be +aware that the systemd version running inside the container should be equal +to or greater than 220. 
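The version requirement from the note can be checked with a small helper. The following Python sketch is an illustration only: it assumes that the first line of the output of `systemd --version`, run inside the container, starts with `systemd` followed by the version number, and the helper names `systemd_version` and `is_supported` are hypothetical.

```python
import re

# Minimum systemd version named in the note above.
MIN_SYSTEMD_VERSION = 220


def systemd_version(version_output: str) -> int:
    """Extract the version number from `systemd --version` style output.

    Assumes the output begins with "systemd <number>".
    """
    match = re.match(r"\s*systemd\s+(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized systemd version output")
    return int(match.group(1))


def is_supported(version_output: str) -> bool:
    """Return True if the reported version meets the minimum."""
    return systemd_version(version_output) >= MIN_SYSTEMD_VERSION


if __name__ == "__main__":
    # Sample strings, not captured from a real container.
    for sample in ("systemd 215", "systemd 229"):
        print(sample, "->", is_supported(sample))
```

The helper only parses text that is passed to it. How the output is collected from the container is deliberately left out of the sketch.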
+ +[[pct_cpu]] +CPU +~~~ + +[thumbnail="screenshot/gui-create-ct-cpu.png"] + +You can restrict the number of visible CPUs inside the container using +the `cores` option. This is implemented using the Linux 'cpuset' +cgroup (**c**ontrol **group**). A special task inside `pvestatd` tries +to distribute running containers among available CPUs. You can view +the assigned CPUs using the following command: + +---- +# pct cpusets + --------------------- + 102: 6 7 + 105: 2 3 4 5 + 108: 0 1 + --------------------- +---- + +Containers use the host kernel directly, so all tasks inside a +container are handled by the host CPU scheduler. {pve} uses the Linux +'CFS' (**C**ompletely **F**air **S**cheduler) scheduler by default, +which has additional bandwidth control options. + +[horizontal] + +`cpulimit`: :: You can use this option to further limit assigned CPU +time. Please note that this is a floating point number, so it is +perfectly valid to assign two cores to a container, but restrict +overall CPU consumption to half a core. ++ +---- +cores: 2 +cpulimit: 0.5 +---- + +`cpuunits`: :: This is a relative weight passed to the kernel +scheduler. The larger the number is, the more CPU time this container +gets. The number is relative to the weights of all the other running +containers. The default is 1024. You can use this setting to +prioritize some containers. + + +[[pct_memory]] +Memory +~~~~~~ + +[thumbnail="screenshot/gui-create-ct-memory.png"] + +Container memory is controlled using the cgroup memory controller. + +[horizontal] + +`memory`: :: Limit overall memory usage. This corresponds +to the `memory.limit_in_bytes` cgroup setting. + +`swap`: :: Allows the container to use additional swap memory from the +host swap space. This corresponds to the `memory.memsw.limit_in_bytes` +cgroup setting, which is set to the sum of both values (`memory + +swap`). 
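The relation between the two memory settings and the cgroup values can be written down as a short Python sketch. It is an illustration under stated assumptions, not {pve} code: it assumes that `memory` and `swap` are given in MiB, which the text above does not state, and the function name `cgroup_memory_limits` is hypothetical.

```python
MIB = 1024 * 1024  # bytes per MiB (assumed unit of `memory` and `swap`)


def cgroup_memory_limits(memory_mib: int, swap_mib: int) -> dict:
    """Return the cgroup values implied by the `memory` and `swap` settings."""
    if memory_mib < 0 or swap_mib < 0:
        raise ValueError("memory and swap must be non-negative")
    return {
        # `memory` limits overall memory usage.
        "memory.limit_in_bytes": memory_mib * MIB,
        # The memsw limit covers memory plus swap, so it is the sum of both.
        "memory.memsw.limit_in_bytes": (memory_mib + swap_mib) * MIB,
    }


if __name__ == "__main__":
    limits = cgroup_memory_limits(memory_mib=512, swap_mib=256)
    for name, value in limits.items():
        print(f"{name} = {value}")
```

Setting `swap` to 0 makes both values equal, which matches the description of `swap` as memory that is available in addition to `memory`.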
+ + +[[pct_mount_points]] Mount Points ~~~~~~~~~~~~ +[thumbnail="screenshot/gui-create-ct-root-disk.png"] + The root mount point is configured with the `rootfs` property, and you can configure up to 10 additional mount points. The corresponding options are called `mp0` to `mp9`, and they can contain the following setting: @@ -329,6 +487,13 @@ in three different flavors: - Directories: passing `size=0` triggers a special case where instead of a raw image a directory is created. +NOTE: The special option syntax `STORAGE_ID:SIZE_IN_GB` for storage backed +mount point volumes will automatically allocate a volume of the specified size +on the specified storage. E.g., calling +`pct set 100 -mp0 thin1:10,mp=/path/in/container` will allocate a 10GB volume +on the storage `thin1` and replace the volume ID place holder `10` with the +allocated volume ID. + Bind Mount Points ^^^^^^^^^^^^^^^^^ @@ -376,64 +541,65 @@ more features. NOTE: The contents of device mount points are not backed up when using `vzdump`. -FUSE Mounts -~~~~~~~~~~~ - -WARNING: Because of existing issues in the Linux kernel's freezer -subsystem the usage of FUSE mounts inside a container is strongly -advised against, as containers need to be frozen for suspend or -snapshot mode backups. - -If FUSE mounts cannot be replaced by other mounting mechanisms or storage -technologies, it is possible to establish the FUSE mount on the Proxmox host -and use a bind mount point to make it accessible inside the container. +[[pct_container_network]] +Network +~~~~~~~ +[thumbnail="screenshot/gui-create-ct-network.png"] -Using Quotas Inside Containers -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +You can configure up to 10 network interfaces for a single +container. The corresponding options are called `net0` to `net9`, and +they can contain the following setting: -Quotas allow to set limits inside a container for the amount of disk -space that each user can use. 
This only works on ext4 image based -storage types and currently does not work with unprivileged -containers. +include::pct-network-opts.adoc[] -Activating the `quota` option causes the following mount options to be -used for a mount point: -`usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0` -This allows quotas to be used like you would on any other system. You -can initialize the `/aquota.user` and `/aquota.group` files by running +[[pct_startup_and_shutdown]] +Automatic Start and Shutdown of Containers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ----- -quotacheck -cmug / -quotaon / ----- +After creating your containers, you probably want them to start automatically +when the host system boots. For this you need to select the option 'Start at +boot' from the 'Options' tab of your container in the web interface, or set it with +the following command: -and edit the quotas via the `edquota` command. Refer to the documentation -of the distribution running inside the container for details. + pct set -onboot 1 -NOTE: You need to run the above commands for every mount point by passing -the mount point's path instead of just `/`. +.Start and Shutdown Order +// use the screenshot from qemu - it's the same +[thumbnail="screenshot/gui-qemu-edit-start-order.png"] +If you want to fine tune the boot order of your containers, you can use the following +parameters: -Using ACLs Inside Containers -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* *Start/Shutdown order*: Defines the start order priority. E.g. set it to 1 if +you want the CT to be the first to be started. (We use the reverse startup +order for shutdown, so a container with a start order of 1 would be the last to +be shut down.) +* *Startup delay*: Defines the interval between this container's start and the start of subsequent +containers. E.g. set it to 240 if you want to wait 240 seconds before starting +other containers. 
+* *Shutdown timeout*: Defines the duration in seconds {pve} should wait +for the container to be offline after issuing a shutdown command. +By default this value is set to 60, which means that {pve} will issue a +shutdown request, wait 60s for the machine to be offline, and if after 60s +the machine is still online, notify that the shutdown action failed. -The standard Posix **A**ccess **C**ontrol **L**ists are also available inside containers. -ACLs allow you to set more detailed file ownership than the traditional user/ -group/others model. +Please note that containers without a Start/Shutdown order parameter will always +start after those where the parameter is set, and this parameter only +makes sense between the machines running locally on a host, and not +cluster-wide. +Hookscripts +~~~~~~~~~~~ -[[pct_container_network]] -Container Network ------------------ - -You can configure up to 10 network interfaces for a single -container. The corresponding options are called `net0` to `net9`, and -they can contain the following setting: +You can add a hook script to CTs with the config property `hookscript`. -include::pct-network-opts.adoc[] + pct set 100 -hookscript local:snippets/hookscript.pl +It will be called during various phases of the guest's lifetime. +For an example and documentation see the example script under +`/usr/share/pve-docs/examples/guest-example-hookscript.pl`. Backup and Restore ------------------ @@ -563,6 +729,26 @@ NOTE: If you have changed the container's configuration since the last start attempt with `pct start`, you need to run `pct start` at least once to also update the configuration used by `lxc-start`. +[[pct_migration]] +Migration +--------- + +If you have a cluster, you can migrate your Containers with + + pct migrate + +This works as long as your Container is offline. If it has local volumes or +mount points defined, the migration will copy the content over the network to +the target host, provided the same storage is defined there. 
+ +If you want to migrate online Containers, the only way is to use +restart migration. This can be initiated with the -restart flag and the optional +-timeout parameter. + +A restart migration will shut down the Container and kill it after the specified +timeout (the default is 180 seconds). Then it will migrate the Container +like an offline migration and when finished, it starts the Container on the +target node. [[pct_configuration]] Configuration