X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=pct.adoc;h=8c8db744f2c96e68e74db6d841f4123115b4af09;hp=611ff484b9c7bf0272e2dcdc97373cbf788340ed;hb=3f13c1c31b8bb60378b715975cc4eda7283fe308;hpb=38fd0958719a329859b3d0d719c37d5df15a2d8d

diff --git a/pct.adoc b/pct.adoc
index 611ff48..8c8db74 100644
--- a/pct.adoc
+++ b/pct.adoc
@@ -24,11 +24,208 @@ Proxmox Container Toolkit
 include::attributes.txt[]
 endif::manvolnum[]
 
-'pct' is a tool to manages Linux Containers (LXC). You can create and
-destroy containers, and control execution
-(start/stop/suspend/resume). Besides that, you can use pct to set
-parameters in the associated config file, like network configuration
-or memory.
+
+Containers are a lightweight alternative to fully virtualized
+VMs. Instead of emulating a complete Operating System (OS), containers
+simply use the OS of the host they run on. This implies that all
+containers use the same kernel, and that they can access resources
+from the host directly.
+
+This is great because containers do not waste CPU power or memory due
+to kernel emulation. Container run-time costs are close to zero and
+usually negligible. But there are also some drawbacks you need to
+consider:
+
+* You can only run Linux-based OSes inside containers, i.e. it is not
+  possible to run FreeBSD or MS Windows inside.
+
+* For security reasons, access to host resources needs to be
+  restricted. This is done with AppArmor, SecComp filters and other
+  kernel features. Be prepared that some syscalls are not allowed
+  inside containers.
+
+{pve} uses https://linuxcontainers.org/[LXC] as the underlying container
+technology. We consider LXC a low-level library, which provides
+countless options. It would be too difficult to use those tools
+directly. Instead, we provide a small wrapper called `pct`, the
+"Proxmox Container Toolkit".
+
+The toolkit is tightly coupled with {pve}.
That means that it is aware
+of the cluster setup, and it can use the same network and storage
+resources as fully virtualized VMs. You can even use the {pve}
+firewall, or manage containers using the HA framework.
+
+Our primary goal is to offer an environment as one would get from a
+VM, but without the additional overhead. We call this "System
+Containers".
+
+NOTE: If you want to run micro-containers (with Docker, rkt, ...), it
+is best to run them inside a VM.
+
+
+Security Considerations
+-----------------------
+
+Containers use the same kernel as the host, so there is a big attack
+surface for malicious users. You should consider this fact if you
+provide containers to totally untrusted people. In general, fully
+virtualized VMs provide better isolation.
+
+The good news is that LXC uses many kernel security features like
+AppArmor, CGroups and PID and user namespaces, which make container
+usage quite secure. We distinguish two types of containers:
+
+Privileged containers
+~~~~~~~~~~~~~~~~~~~~~
+
+Security is provided by dropping capabilities, using mandatory access
+control (AppArmor), SecComp filters and namespaces. The LXC team
+considers this kind of container unsafe, and they will not consider
+new container escape exploits to be security issues worthy of a CVE
+and quick fix. So you should use this kind of container only inside a
+trusted environment, or when no untrusted task is running as root in
+the container.
+
+Unprivileged containers
+~~~~~~~~~~~~~~~~~~~~~~~
+
+This kind of container uses a new kernel feature called user
+namespaces. The root uid 0 inside the container is mapped to an
+unprivileged user outside the container. This means that most security
+issues (container escape, resource abuse, ...) in those containers
+will affect a random unprivileged user, and so would be a generic
+kernel security bug rather than an LXC issue. The LXC team considers
+unprivileged containers safe by design.
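
Editorial sketch (not part of the patch): the uid mapping described for unprivileged containers can be illustrated with a few lines of Python. This assumes a mapping in the three-column form the Linux kernel exposes in `/proc/<pid>/uid_map` (first ID inside the namespace, first ID outside, range length). The base of 100000 and the range of 65536 are example values only and are not stated in this document, and `map_uid` is a hypothetical helper name.

```python
from typing import List, Optional, Tuple

# Each entry: (first id inside the container, first id on the host, length).
# Example values only; the real mapping depends on the host configuration.
EXAMPLE_UID_MAP: List[Tuple[int, int, int]] = [(0, 100000, 65536)]


def map_uid(container_uid: int, uid_map: List[Tuple[int, int, int]]) -> Optional[int]:
    """Return the host uid for a container uid, or None if it is unmapped."""
    for inside_start, outside_start, length in uid_map:
        if inside_start <= container_uid < inside_start + length:
            return outside_start + (container_uid - inside_start)
    return None


if __name__ == "__main__":
    # Container root (uid 0) resolves to an unprivileged uid on the host.
    print(map_uid(0, EXAMPLE_UID_MAP))
    # A regular container user is shifted by the same offset.
    print(map_uid(1000, EXAMPLE_UID_MAP))
    # Ids outside the mapped range have no host counterpart.
    print(map_uid(70000, EXAMPLE_UID_MAP))
```

Under these assumed values, a process that escapes the container as uid 0 would act on the host only as the unprivileged uid at the start of the mapped range, which is the property the paragraph above relies on.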
+
+
+Configuration
+-------------
+
+The '/etc/pve/lxc/<CTID>.conf' file stores the container configuration,
+where '<CTID>' is the numeric ID of the given container. Note that
+CTIDs < 100 are reserved for internal purposes. CTIDs need to be
+unique cluster wide. Files are stored inside '/etc/pve/', so they get
+automatically replicated to all other cluster nodes.
+
+Those configuration files are simple text files, and you can edit them
+using a normal text editor ('vi', 'nano', ...). This is sometimes
+useful for small corrections, but keep in mind that you need to
+restart the container to apply such changes.
+
+For that reason, it is usually better to use the 'pct' command to
+generate and modify those files, or do the whole thing using the GUI.
+Our toolkit is smart enough to instantaneously apply most changes to
+running containers (hot plug).
+
+
+File Format
+~~~~~~~~~~~
+
+Container configuration files use a simple colon separated key/value
+format. Each line has the following format:
+
+ # this is a comment
+ OPTION: value
+
+Blank lines in those files are ignored, and lines starting with a '#'
+character are treated as comments and are also ignored.
+
+It is possible to add low-level, LXC style configuration directly, for
+example:
+
+ lxc.init_cmd: /sbin/my_own_init
+
+or
+
+ lxc.init_cmd = /sbin/my_own_init
+
+Those settings are directly passed to the LXC low-level tools.
+
+
+Guest Operating System Configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We normally try to detect the operating system type inside the
+container, and then modify some files inside the container to make
+them work as expected.
Here is a short list of things we do at
+container startup:
+
+set /etc/hostname:: to set the container name
+
+modify /etc/hosts:: to allow lookup of the local hostname
+
+network setup:: pass the complete network setup to the container
+
+configure DNS:: pass information about DNS servers
+
+adapt the init system:: for example, fix the number of spawned getty processes
+
+set the root password:: when creating a new container
+
+rewrite ssh_host_keys:: so that each container has unique keys
+
+randomize crontab:: so that cron does not start at the same time on all containers
+
+The above tasks depend on the OS type, so the implementation is different
+for each OS type. You can also disable any modifications by manually
+setting the 'ostype' to 'unmanaged'.
+
+OS type detection is done by testing for certain files inside the
+container:
+
+Ubuntu:: inspect /etc/lsb-release ('DISTRIB_ID=Ubuntu')
+
+Debian:: test /etc/debian_version
+
+Fedora:: test /etc/fedora-release
+
+RedHat or CentOS:: test /etc/redhat-release
+
+ArchLinux:: test /etc/arch-release
+
+Alpine:: test /etc/alpine-release
+
+NOTE: Container start fails if the configured 'ostype' differs from the
+auto-detected type.
+
+Container Storage
+-----------------
+
+Traditional containers use a very simple storage model, only allowing
+a single mount point, the root file system. This was further
+restricted to specific file system types like 'ext4' and 'nfs'.
+Additional mounts are often done by user provided scripts. This turned
+out to be complex and error prone, so we try to avoid that now.
+
+Our new LXC based container model is more flexible regarding
+storage. First, you can have more than a single mount point. This
+allows you to choose a suitable storage for each application. For
+example, you can use a relatively slow (and thus cheap) storage for
+the container root file system. Then you can use a second mount point
+to mount a very fast, distributed storage for your database
+application.
+
+The second big improvement is that you can use any storage type
+supported by the {pve} storage library. That means that you can store
+your containers on local 'lvmthin' or 'zfs', shared 'iSCSI' storage,
+or even on distributed storage systems like 'ceph'. And it enables us
+to use advanced storage features like snapshots and clones. 'vzdump'
+can also use the snapshots feature to provide consistent container
+backups.
+
+Last but not least, you can also mount local devices directly, or
+mount local directories using bind mounts. That way you can access
+local storage inside containers with zero overhead. Such bind mounts
+also provide an easy way to share data between different containers.
+
+
+Managing Containers with 'pct'
+------------------------------
+
+'pct' is the tool to manage Linux Containers on {pve}. You can create
+and destroy containers, and control execution (start, stop, migrate,
+...). You can use pct to set parameters in the associated config file,
+like network configuration or memory.
 
 CLI Usage Examples
 ------------------
@@ -66,9 +263,9 @@ Reduce the memory of the container to 512MB
 Files
 ------
 
-'/etc/pve/lxc/.conf'::
+'/etc/pve/lxc/<CTID>.conf'::
 
-Configuration file for the container
+Configuration file for the container '<CTID>'.
 
 
 Container Advantages