ifdef::manvolnum[]
PVE({manvolnum})
================
include::attributes.txt[]

NAME
----

pct - Tool to manage Linux Containers (LXC) on Proxmox VE


SYNOPSIS
--------

include::pct.1-synopsis.adoc[]

DESCRIPTION
-----------
endif::manvolnum[]

ifndef::manvolnum[]
Proxmox Container Toolkit
=========================
include::attributes.txt[]
endif::manvolnum[]


Containers are a lightweight alternative to fully virtualized
VMs. Instead of emulating a complete Operating System (OS), containers
simply use the OS of the host they run on. This implies that all
containers use the same kernel, and that they can access resources
from the host directly.

This is great because containers do not waste CPU power or memory on
kernel emulation. Container run-time costs are close to zero and
usually negligible. But there are also some drawbacks you need to
consider:

* You can only run Linux-based operating systems inside containers, i.e. it is
 not possible to run FreeBSD or MS Windows inside.

* For security reasons, access to host resources needs to be
 restricted. This is done with AppArmor, SecComp filters and other
 kernel features. Be prepared that some syscalls are not allowed
 inside containers.

{pve} uses https://linuxcontainers.org/[LXC] as its underlying container
technology. We consider LXC a low-level library, which provides
countless options. It would be too difficult to use those tools
directly. Instead, we provide a small wrapper called `pct`, the
"Proxmox Container Toolkit".

The toolkit is tightly coupled with {pve}. That means that it is aware
of the cluster setup, and it can use the same network and storage
resources as fully virtualized VMs. You can even use the {pve}
firewall, or manage containers using the HA framework.

Our primary goal is to offer an environment as one would get from a
VM, but without the additional overhead. We call this "System
Containers".

NOTE: If you want to run micro-containers (with docker, rkt, ...), it
is best to run them inside a VM.


Security Considerations
-----------------------

Containers use the same kernel as the host, so there is a big attack
surface for malicious users. You should consider this fact if you
provide containers to untrusted people. In general, fully
virtualized VMs provide better isolation.

The good news is that LXC uses many kernel security features like
AppArmor, CGroups and PID and user namespaces, which makes container
usage quite secure. We distinguish two types of containers:


Privileged Containers
~~~~~~~~~~~~~~~~~~~~~

Security is done by dropping capabilities, using mandatory access
control (AppArmor), SecComp filters and namespaces. The LXC team
considers this kind of container unsafe, and they will not consider
new container escape exploits to be security issues worthy of a CVE
and quick fix. So you should use this kind of container only inside a
trusted environment, or when no untrusted task is running as root in
the container.


Unprivileged Containers
~~~~~~~~~~~~~~~~~~~~~~~

This kind of container uses a new kernel feature called user
namespaces. The root UID 0 inside the container is mapped to an
unprivileged user outside the container. This means that most security
issues (container escape, resource abuse, ...) in those containers
will affect a random unprivileged user, and so would be a generic
kernel security bug rather than an LXC issue. The LXC team thinks
unprivileged containers are safe by design.
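
Whether a container is privileged or unprivileged is decided when it
is created. As a sketch (assuming the Debian template used elsewhere
in this document is already downloaded), an unprivileged container can
be created by setting the `unprivileged` option:

 pct create 200 local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz -unprivileged 1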


Configuration
-------------

The `/etc/pve/lxc/<CTID>.conf` file stores container configuration,
where `<CTID>` is the numeric ID of the given container. Like all
other files stored inside `/etc/pve/`, they get automatically
replicated to all other cluster nodes.

NOTE: CTIDs < 100 are reserved for internal purposes, and CTIDs need to be
unique cluster wide.

.Example Container Configuration
----
ostype: debian
arch: amd64
hostname: www
memory: 512
swap: 512
net0: bridge=vmbr0,hwaddr=66:64:66:64:64:36,ip=dhcp,name=eth0,type=veth
rootfs: local:107/vm-107-disk-1.raw,size=7G
----

Those configuration files are simple text files, and you can edit them
using a normal text editor (`vi`, `nano`, ...). This is sometimes
useful for making small corrections, but keep in mind that you need to
restart the container to apply such changes.

For that reason, it is usually better to use the `pct` command to
generate and modify those files, or do the whole thing using the GUI.
Our toolkit is smart enough to instantaneously apply most changes to
running containers. This feature is called "hot plug", and there is no
need to restart the container in that case.
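
For example, the following raises the memory limit of container 100 to
1024 MB. This is one of the changes the toolkit can usually apply on
the fly, without a restart (a sketch, assuming container 100 exists):

 pct set 100 -memory 1024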


File Format
~~~~~~~~~~~

Container configuration files use a simple colon-separated key/value
format. Each line has the following format:

 # this is a comment
 OPTION: value

Blank lines in those files are ignored, and lines starting with a `#`
character are treated as comments and are also ignored.

It is possible to add low-level, LXC-style configuration directly, for
example:

 lxc.init_cmd: /sbin/my_own_init

or

 lxc.init_cmd = /sbin/my_own_init

Those settings are directly passed to the LXC low-level tools.


Snapshots
~~~~~~~~~

When you create a snapshot, `pct` stores the configuration at snapshot
time into a separate snapshot section within the same configuration
file. For example, after creating a snapshot called ``testsnapshot'',
your configuration file will look like this:

.Container configuration with snapshot
----
memory: 512
swap: 512
parent: testsnapshot
...

[testsnapshot]
memory: 512
swap: 512
snaptime: 1457170803
...
----

There are a few snapshot-related properties like `parent` and
`snaptime`. The `parent` property is used to store the parent/child
relationship between snapshots. `snaptime` is the snapshot creation
time stamp (Unix epoch).
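
Snapshots are created with `pct`; a minimal sketch, assuming container
100 exists (see the `pct` manual page for the related `listsnapshot`,
`rollback` and `delsnapshot` commands):

 pct snapshot 100 testsnapshot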


Guest Operating System Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We normally try to detect the operating system type inside the
container, and then modify some files inside the container to make
them work as expected. Here is a short list of things we do at
container startup:

set /etc/hostname:: to set the container name

modify /etc/hosts:: to allow lookup of the local hostname

network setup:: pass the complete network setup to the container

configure DNS:: pass information about DNS servers

adapt the init system:: for example, fix the number of spawned getty processes

set the root password:: when creating a new container

rewrite ssh_host_keys:: so that each container has unique keys

randomize crontab:: so that cron does not start at the same time on all containers

Changes made by {PVE} are enclosed by comment markers:

----
# --- BEGIN PVE ---
<data>
# --- END PVE ---
----

Those markers will be inserted at a reasonable location in the
file. If such a section already exists, it will be updated in place
and will not be moved.

Modification of a file can be prevented by adding a `.pve-ignore.`
file for it. For instance, if the file `/etc/.pve-ignore.hosts`
exists then the `/etc/hosts` file will not be touched. This can be a
simple empty file created via:

 # touch /etc/.pve-ignore.hosts

Most modifications are OS dependent, so they differ between different
distributions and versions. You can completely disable modifications
by manually setting the `ostype` to `unmanaged`.
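
For example, to disable these modifications for container 100 (a
sketch using the `ostype` option described above):

 pct set 100 -ostype unmanaged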

OS type detection is done by testing for certain files inside the
container:

Ubuntu:: inspect /etc/lsb-release (`DISTRIB_ID=Ubuntu`)

Debian:: test /etc/debian_version

Fedora:: test /etc/fedora-release

RedHat or CentOS:: test /etc/redhat-release

ArchLinux:: test /etc/arch-release

Alpine:: test /etc/alpine-release

Gentoo:: test /etc/gentoo-release

NOTE: Container start fails if the configured `ostype` differs from the
auto-detected type.


Options
~~~~~~~

include::pct.conf.5-opts.adoc[]


Container Images
----------------

Container images, sometimes also referred to as ``templates'' or
``appliances'', are `tar` archives which contain everything needed to run
a container. You can think of them as tidy container backups. Like most
modern container toolkits, `pct` uses those images when you create a
new container, for example:

 pct create 999 local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz

Proxmox itself ships a set of basic templates for the most common
operating systems, and you can download them using the `pveam` (short
for {pve} Appliance Manager) command line utility. You can also
download https://www.turnkeylinux.org/[TurnKey Linux] containers using
that tool (or the graphical user interface).

Our image repositories contain a list of available images, and there
is a cron job run each day to download that list. You can trigger that
update manually with:

 pveam update

After that you can view the list of available images using:

 pveam available

You can restrict this large list by specifying the `section` you are
interested in, for example basic `system` images:

.List available system images
----
# pveam available --section system
system archlinux-base_2015-24-29-1_x86_64.tar.gz
system centos-7-default_20160205_amd64.tar.xz
system debian-6.0-standard_6.0-7_amd64.tar.gz
system debian-7.0-standard_7.0-3_amd64.tar.gz
system debian-8.0-standard_8.0-1_amd64.tar.gz
system ubuntu-12.04-standard_12.04-1_amd64.tar.gz
system ubuntu-14.04-standard_14.04-1_amd64.tar.gz
system ubuntu-15.04-standard_15.04-1_amd64.tar.gz
system ubuntu-15.10-standard_15.10-1_amd64.tar.gz
----

Before you can use such a template, you need to download it into one
of your storages. You can simply use storage `local` for that
purpose. For clustered installations, it is preferred to use a shared
storage so that all nodes can access those images.

 pveam download local debian-8.0-standard_8.0-1_amd64.tar.gz

You are now ready to create containers using that image, and you can
list all downloaded images on storage `local` with:

----
# pveam list local
local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz 190.20MB
----

The above command shows you the full {pve} volume identifiers. They include
the storage name, and most other {pve} commands can use them. For
example you can delete that image later with:

 pveam remove local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz


Container Storage
-----------------

Traditional containers use a very simple storage model, only allowing
a single mount point, the root file system. This was further
restricted to specific file system types like `ext4` and `nfs`.
Additional mounts were often done by user-provided scripts. This turned
out to be complex and error prone, so we try to avoid that now.

Our new LXC-based container model is more flexible regarding
storage. First, you can have more than a single mount point. This
allows you to choose a suitable storage for each application. For
example, you can use a relatively slow (and thus cheap) storage for
the container root file system. Then you can use a second mount point
to mount a very fast, distributed storage for your database
application.

The second big improvement is that you can use any storage type
supported by the {pve} storage library. That means that you can store
your containers on local `lvmthin` or `zfs`, shared `iSCSI` storage,
or even on distributed storage systems like `ceph`. It also enables us
to use advanced storage features like snapshots and clones. `vzdump`
can also use the snapshot feature to provide consistent container
backups.

Last but not least, you can also mount local devices directly, or
mount local directories using bind mounts. That way you can access
local storage inside containers with zero overhead. Such bind mounts
also provide an easy way to share data between different containers.

Mount Points
~~~~~~~~~~~~

The root mount point is configured with the `rootfs` property, and you can
configure up to 10 additional mount points. The corresponding options
are called `mp0` to `mp9`, and they can contain the following settings:

include::pct-mountpoint-opts.adoc[]

Currently there are basically three types of mount points: storage backed
mount points, bind mounts and device mounts.

.Typical container `rootfs` configuration
----
rootfs: thin1:base-100-disk-1,size=8G
----


Storage Backed Mount Points
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Storage backed mount points are managed by the {pve} storage subsystem and come
in three different flavors:

- Image based: these are raw images containing a single ext4-formatted file
 system.
- ZFS subvolumes: these are technically bind mounts, but with managed storage,
 and thus allow resizing and snapshotting.
- Directories: passing `size=0` triggers a special case where instead of a raw
 image a directory is created.
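
As a sketch, the following adds a storage-backed mount point to
container 100, allocating a new 10 GiB volume on a storage named
`thin1` (a placeholder) and mounting it at `/opt/data`:

 pct set 100 -mp0 thin1:10,mp=/opt/data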


Bind Mount Points
^^^^^^^^^^^^^^^^^

Bind mounts allow you to access arbitrary directories from your Proxmox VE host
inside a container. Some potential use cases are:

- Accessing your home directory in the guest
- Accessing a USB device directory in the guest
- Accessing an NFS mount from the host in the guest

Bind mounts are considered to not be managed by the storage subsystem, so you
cannot make snapshots or deal with quotas from inside the container. With
unprivileged containers you might run into permission problems caused by the
user mapping, and you cannot use ACLs.

NOTE: The contents of bind mount points are not backed up when using `vzdump`.

WARNING: For security reasons, bind mounts should only be established
using source directories especially reserved for this purpose, e.g., a
directory hierarchy under `/mnt/bindmounts`. Never bind mount system
directories like `/`, `/var` or `/etc` into a container - this poses a
great security risk.

NOTE: The bind mount source path must not contain any symlinks.

For example, to make the directory `/mnt/bindmounts/shared` accessible in the
container with ID `100` under the path `/shared`, use a configuration line like
`mp0: /mnt/bindmounts/shared,mp=/shared` in `/etc/pve/lxc/100.conf`.
Alternatively, use `pct set 100 -mp0 /mnt/bindmounts/shared,mp=/shared` to
achieve the same result.


Device Mount Points
^^^^^^^^^^^^^^^^^^^

Device mount points allow you to mount block devices of the host directly into
the container. Similar to bind mounts, device mounts are not managed by
{PVE}'s storage subsystem, but the `quota` and `acl` options will be honored.

NOTE: Device mount points should only be used under special circumstances. In
most cases a storage backed mount point offers the same performance and a lot
more features.

NOTE: The contents of device mount points are not backed up when using `vzdump`.
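
A hypothetical example, making the host block device `/dev/sdb1` (a
placeholder) available at `/mnt/device` inside container 100:

 pct set 100 -mp1 /dev/sdb1,mp=/mnt/device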


FUSE Mounts
~~~~~~~~~~~

WARNING: Because of existing issues in the Linux kernel's freezer
subsystem the usage of FUSE mounts inside a container is strongly
advised against, as containers need to be frozen for suspend or
snapshot mode backups.

If FUSE mounts cannot be replaced by other mounting mechanisms or storage
technologies, it is possible to establish the FUSE mount on the Proxmox host
and use a bind mount point to make it accessible inside the container.
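
A sketch of that workaround, assuming `sshfs` is installed on the host
and using placeholder host and path names:

----
# on the Proxmox host: establish the FUSE mount ...
sshfs user@fileserver:/export /mnt/bindmounts/fuse-data
# ... then expose it to the container via a bind mount point
pct set 100 -mp0 /mnt/bindmounts/fuse-data,mp=/data
----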


Using Quotas Inside Containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Quotas allow you to set limits inside a container for the amount of disk
space that each user can use. This only works on ext4 image based
storage types and currently does not work with unprivileged
containers.

Activating the `quota` option causes the following mount options to be
used for a mount point:
`usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0`

This allows quotas to be used like you would on any other system. You
can initialize the `/aquota.user` and `/aquota.group` files by running

----
quotacheck -cmug /
quotaon /
----

and edit the quotas via the `edquota` command. Refer to the documentation
of the distribution running inside the container for details.

NOTE: You need to run the above commands for every mount point by passing
the mount point's path instead of just `/`.
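
For example, for an additional mount point mounted at `/data` (a
placeholder path), the initialization becomes:

----
quotacheck -cmug /data
quotaon /data
----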


Using ACLs Inside Containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The standard POSIX **A**ccess **C**ontrol **L**ists are also available inside
containers. ACLs allow you to set more detailed file ownership than the
traditional user/group/others model.
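
A minimal sketch using the standard ACL tools, with placeholder user
and path names:

----
setfacl -m u:www-data:rwX /srv/app   # grant an additional user access
getfacl /srv/app                     # inspect the resulting ACL
----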


Container Network
-----------------

You can configure up to 10 network interfaces for a single
container. The corresponding options are called `net0` to `net9`, and
they can contain the following settings:

include::pct-network-opts.adoc[]
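
For example, a typical `net0` entry in a container configuration file,
using a static address (the addresses are placeholders; compare the
`pct set` example in the CLI usage section below):

----
net0: name=eth0,bridge=vmbr0,ip=192.168.15.147/24,gw=192.168.15.1,type=veth
----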


Backup and Restore
------------------


Container Backup
~~~~~~~~~~~~~~~~

It is possible to use the `vzdump` tool for container backup. Please
refer to the `vzdump` manual page for details.
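
For example, a snapshot-mode backup of container 100 to storage
`local` might look like this:

 vzdump 100 -storage local -mode snapshot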


Restoring Container Backups
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Restoring container backups made with `vzdump` is possible using the
`pct restore` command. By default, `pct restore` will attempt to restore as much
of the backed up container configuration as possible. It is possible to override
the backed up configuration by manually setting container options on the command
line (see the `pct` manual page for details).

NOTE: `pvesm extractconfig` can be used to view the backed up configuration
contained in a vzdump archive.

There are two basic restore modes, only differing by their handling of mount
points:


``Simple'' Restore Mode
^^^^^^^^^^^^^^^^^^^^^^^

If neither the `rootfs` parameter nor any of the optional `mpX` parameters
are explicitly set, the mount point configuration from the backed up
configuration file is restored using the following steps:

. Extract mount points and their options from backup
. Create volumes for storage backed mount points (on storage provided with the
`storage` parameter, or default local storage if unset)
. Extract files from backup archive
. Add bind and device mount points to restored configuration (limited to root user)

NOTE: Since bind and device mount points are never backed up, no files are
restored in the last step, but only the configuration options. The assumption
is that such mount points are either backed up with another mechanism (e.g.,
NFS space that is bind mounted into many containers), or not intended to be
backed up at all.

This simple mode is also used by the container restore operations in the web
interface.
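
For example, a simple restore of container 100 (the archive name is a
placeholder for an actual `vzdump` archive on your storage):

 pct restore 100 local:backup/vzdump-lxc-100-2016_03_02-10_00_00.tar.gz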


``Advanced'' Restore Mode
^^^^^^^^^^^^^^^^^^^^^^^^^

By setting the `rootfs` parameter (and optionally, any combination of `mpX`
parameters), the `pct restore` command is automatically switched into an
advanced mode. This advanced mode completely ignores the `rootfs` and `mpX`
configuration options contained in the backup archive, and instead only
uses the options explicitly provided as parameters.

This mode allows flexible configuration of mount point settings at restore
time, for example (see the sketch after this list):

* Set target storages, volume sizes and other options for each mount point
individually
* Redistribute backed up files according to new mount point scheme
* Restore to device and/or bind mount points (limited to root user)
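
A hedged sketch of an advanced restore: the `rootfs` parameter switches
`pct restore` into advanced mode and places the root file system on a
newly allocated 12 GiB `local` volume, ignoring the mount point layout
stored in the archive (the archive name is a placeholder):

 pct restore 101 local:backup/vzdump-lxc-100-2016_03_02-10_00_00.tar.gz -rootfs local:12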


Managing Containers with `pct`
------------------------------

`pct` is the tool to manage Linux Containers on {pve}. You can create
and destroy containers, and control execution (start, stop, migrate,
...). You can use `pct` to set parameters in the associated config file,
like network configuration or memory limits.


CLI Usage Examples
~~~~~~~~~~~~~~~~~~

Create a container based on a Debian template (provided you have
already downloaded the template via the web interface)

 pct create 100 /var/lib/vz/template/cache/debian-8.0-standard_8.0-1_amd64.tar.gz

Start container 100

 pct start 100

Start a login session via getty

 pct console 100

Enter the LXC namespace and run a shell as root user

 pct enter 100

Display the configuration

 pct config 100

Add a network interface called `eth0`, bridged to the host bridge `vmbr0`,
set the address and gateway, while it's running

 pct set 100 -net0 name=eth0,bridge=vmbr0,ip=192.168.15.147/24,gw=192.168.15.1

Reduce the memory of the container to 512 MB

 pct set 100 -memory 512


Files
------

`/etc/pve/lxc/<CTID>.conf`::

Configuration file for the container '<CTID>'.


Container Advantages
--------------------

* Simple, and fully integrated into {pve}. Setup looks similar to a normal
 VM setup.

** Storage (ZFS, LVM, NFS, Ceph, ...)

** Network

** Authentication

** Cluster

* Fast: minimal overhead, as fast as bare metal

* High density (perfect for idle workloads)

* REST API

* Direct hardware access


Technology Overview
-------------------

- Integrated into {pve} graphical user interface (GUI)

- LXC (https://linuxcontainers.org/)

- cgmanager for cgroup management

- lxcfs to provide a containerized /proc file system

- AppArmor

- CRIU: for live migration (planned)

- We use the latest available kernels (4.4.X)

- Image based deployment (templates)

- Container setup from host (network, DNS, storage, ...)


ifdef::manvolnum[]
include::pve-copyright.adoc[]
endif::manvolnum[]