pct.adoc: add Guest Operating System Configuration

diff --git a/pct.adoc b/pct.adoc

index 611ff484b9c7bf0272e2dcdc97373cbf788340ed..8c8db744f2c96e68e74db6d841f4123115b4af09 100644
--- a/pct.adoc
+++ b/pct.adoc
@@ -24,11 +24,208 @@ Proxmox Container Toolkit
  include::attributes.txt[]
  endif::manvolnum[]
  
-'pct' is a tool to manages Linux Containers (LXC). You can create and
-destroy containers, and control execution
-(start/stop/suspend/resume). Besides that, you can use pct to set
-parameters in the associated config file, like network configuration
-or memory.
+
+Containers are a lightweight alternative to fully virtualized
+VMs. Instead of emulating a complete Operating System (OS), containers
+simply use the OS of the host they run on. This implies that all
+containers use the same kernel, and that they can access resources
+from the host directly.
+
+This is great because containers do not waste CPU power or memory
+on kernel emulation, so their run-time overhead is usually
+negligible. But there are also some drawbacks you need to
+consider:
+
+* You can only run Linux-based operating systems inside containers,
+  i.e. it is not possible to run FreeBSD or MS Windows inside.
+
+* For security reasons, access to host resources needs to be
+  restricted. This is done with AppArmor, SecComp filters and other
+  kernel features. Expect some syscalls to be disallowed
+  inside containers.
+
+{pve} uses https://linuxcontainers.org/[LXC] as underlying container
+technology. We consider LXC a low-level library, which provides
+countless options. It would be too difficult to use those tools
+directly. Instead, we provide a small wrapper called `pct`, the
+"Proxmox Container Toolkit".
+
+The toolkit is tightly coupled with {pve}. That means that it is aware
+of the cluster setup, and it can use the same network and storage
+resources as fully virtualized VMs. You can even use the {pve}
+firewall, or manage containers using the HA framework.
+
+Our primary goal is to offer an environment as one would get from a
+VM, but without the additional overhead. We call this "System
+Containers".
+
+NOTE: If you want to run micro-containers (with docker, rkt, ...), it
+is best to run them inside a VM.
+
+
+Security Considerations
+-----------------------
+
+Containers use the same kernel as the host, so there is a big attack
+surface for malicious users. You should consider this fact if you
+provide containers to totally untrusted people. In general, fully
+virtualized VMs provide better isolation.
+
+The good news is that LXC uses many kernel security features like
+AppArmor, CGroups and PID and user namespaces, which make container
+usage quite secure. We distinguish two types of containers:
+
+Privileged containers
+~~~~~~~~~~~~~~~~~~~~~
+
+Security is enforced by dropping capabilities, using mandatory access
+control (AppArmor), SecComp filters and namespaces. The LXC team
+considers this kind of container unsafe, and they will not consider
+new container escape exploits to be security issues worthy of a CVE
+and quick fix. So you should use this kind of container only inside a
+trusted environment, or when no untrusted task is running as root in
+the container.
+
+Unprivileged containers
+~~~~~~~~~~~~~~~~~~~~~~~
+
+This kind of container uses a new kernel feature called user
+namespaces. The root uid 0 inside the container is mapped to an
+unprivileged user outside the container. This means that most security
+issues (container escape, resource abuse, ...) in those containers
+will affect a random unprivileged user, and so would be a generic
+kernel security bug rather than an LXC issue. The LXC team considers
+unprivileged containers safe by design.
+
+
+Configuration
+-------------
+
+The '/etc/pve/lxc/<CTID>.conf' file stores the container configuration,
+where '<CTID>' is the numeric ID of the given container. Note that
+CTIDs < 100 are reserved for internal purposes. CTIDs need to be
+unique cluster wide. Files are stored inside '/etc/pve/', so they get
+automatically replicated to all other cluster nodes.
+
+Those configuration files are simple text files, and you can edit them
+using a normal text editor ('vi', 'nano', ...). This is sometimes
+useful for small corrections, but keep in mind that you need to
+restart the container to apply such changes.
+
+For that reason, it is usually better to use the 'pct' command to
+generate and modify those files, or do the whole thing using the GUI.
+Our toolkit is smart enough to instantaneously apply most changes to
+running containers (hot plug).
+
+
+File Format
+~~~~~~~~~~~
+
+Container configuration files use a simple colon separated key/value
+format. Each line has the following format:
+
+ # this is a comment
+ OPTION: value
+
+Blank lines in those files are ignored, and lines starting with a '#'
+character are treated as comments and are also ignored.
+
+It is possible to add low-level, LXC style configuration directly, for
+example:
+
+ lxc.init_cmd: /sbin/my_own_init
+
+or
+
+ lxc.init_cmd = /sbin/my_own_init
+
+Those settings are directly passed to the LXC low-level tools.
+
+
+Guest Operating System Configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We normally try to detect the operating system type inside the
+container, and then modify some files inside the container to make
+them work as expected. Here is a short list of things we do at
+container startup:
+
+set /etc/hostname:: to set the container name
+
+modify /etc/hosts:: to allow lookup of the local hostname
+
+network setup:: pass the complete network setup to the container
+
+configure DNS:: pass information about DNS servers
+
+adapt the init system:: for example, fix the number of spawned getty processes
+
+set the root password:: when creating a new container
+
+rewrite ssh_host_keys:: so that each container has unique keys
+
+randomize crontab:: so that cron does not start at the same time on all containers
+
+The above tasks depend on the OS type, so the implementation is different
+for each OS type. You can also disable any modifications by manually
+setting the 'ostype' to 'unmanaged'.
+
+OS type detection is done by testing for certain files inside the
+container:
+
+Ubuntu:: inspect /etc/lsb-release ('DISTRIB_ID=Ubuntu')
+
+Debian:: test /etc/debian_version
+
+Fedora:: test /etc/fedora-release
+
+RedHat or CentOS:: test /etc/redhat-release
+
+ArchLinux:: test /etc/arch-release
+
+Alpine:: test /etc/alpine-release
+
+NOTE: Container start fails if the configured 'ostype' differs from the
+auto detected type.
+
+Container Storage
+-----------------
+
+Traditional containers use a very simple storage model, only allowing
+a single mount point, the root file system. This was further
+restricted to specific file system types like 'ext4' and 'nfs'.
+Additional mounts are often done by user provided scripts. This turned
+out to be complex and error prone, so we try to avoid that now.
+
+Our new LXC based container model is more flexible regarding
+storage. First, you can have more than a single mount point. This
+allows you to choose a suitable storage for each application. For
+example, you can use a relatively slow (and thus cheap) storage for
+the container root file system. Then you can use a second mount point
+to mount a very fast, distributed storage for your database
+application.
+
+The second big improvement is that you can use any storage type
+supported by the {pve} storage library. That means that you can store
+your containers on local 'lvmthin' or 'zfs', shared 'iSCSI' storage,
+or even on distributed storage systems like 'ceph'. And it enables us
+to use advanced storage features like snapshots and clones. 'vzdump'
+can also use the snapshots feature to provide consistent container
+backups.
+
+Last but not least, you can also mount local devices directly, or
+mount local directories using bind mounts. That way you can access
+local storage inside containers with zero overhead. Such bind mounts
+also provide an easy way to share data between different containers.
+
+
+Managing Containers with 'pct'
+------------------------------
+
+'pct' is the tool to manage Linux Containers on {pve}. You can create
+and destroy containers, and control execution (start, stop, migrate,
+...). You can use pct to set parameters in the associated config file,
+like network configuration or memory.
  
  CLI Usage Examples
  ------------------
@@ -66,9 +263,9 @@ Reduce the memory of the container to 512MB
  Files
  ------
  
-'/etc/pve/lxc/<vmid>.conf'::
+'/etc/pve/lxc/<CTID>.conf'::
  
-Configuration file for the container <vmid>
+Configuration file for the container '<CTID>'.
  
  
  Container Advantages
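
Review note: the OS type detection described in the new "Guest Operating System Configuration" section could be sketched roughly as below. This is only an illustration of the file tests listed in the patch, not the toolkit's actual code. The function names `detect_ostype` and `check_ostype`, the lowercase return labels, the check order (taken from the order of the list), and the choice to skip the check for 'unmanaged' are all assumptions made for this sketch.

```python
import re
from pathlib import Path
from typing import Optional

# Marker files from the documentation, relative to the container root.
# Order is an assumption: it follows the list in the patch, so Fedora is
# tested before the more generic redhat-release file.
MARKER_FILES = [
    ("debian", "etc/debian_version"),
    ("fedora", "etc/fedora-release"),
    ("centos", "etc/redhat-release"),   # RedHat or CentOS
    ("archlinux", "etc/arch-release"),
    ("alpine", "etc/alpine-release"),
]


def detect_ostype(rootfs: str) -> Optional[str]:
    """Guess the OS type by testing for files inside the container root."""
    root = Path(rootfs)

    # Ubuntu is checked first by inspecting the content of lsb-release,
    # because an Ubuntu root also contains /etc/debian_version.
    lsb = root / "etc/lsb-release"
    if lsb.is_file():
        text = lsb.read_text(errors="replace")
        if re.search(r"^DISTRIB_ID=[\"']?Ubuntu[\"']?\s*$", text, re.MULTILINE):
            return "ubuntu"

    for label, rel_path in MARKER_FILES:
        if (root / rel_path).is_file():
            return label
    return None


def check_ostype(configured: str, rootfs: str) -> None:
    """Raise if the configured type differs from the detected one.

    Assumption: 'unmanaged' skips the comparison entirely.
    """
    if configured == "unmanaged":
        return
    detected = detect_ostype(rootfs)
    if detected != configured:
        raise ValueError(
            f"configured ostype '{configured}' differs from detected '{detected}'"
        )
```

Calling `check_ostype("debian", "/path/to/rootfs")` before start would then mirror the NOTE in the patch: a mismatch aborts, while 'unmanaged' leaves the container untouched.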