[[chapter_pct]]
ifdef::manvolnum[]
pct(1)
======
include::attributes.txt[]
:pve-toplevel:

NAME
----

pct - Tool to manage Linux Containers (LXC) on Proxmox VE


SYNOPSIS
--------

include::pct.1-synopsis.adoc[]

DESCRIPTION
-----------
endif::manvolnum[]

ifndef::manvolnum[]
Proxmox Container Toolkit
=========================
include::attributes.txt[]
:pve-toplevel:
endif::manvolnum[]
ifdef::wiki[]
:title: Linux Container
endif::wiki[]

Containers are a lightweight alternative to fully virtualized
VMs. Instead of emulating a complete Operating System (OS), containers
simply use the OS of the host they run on. This implies that all
containers use the same kernel, and that they can access resources
from the host directly.

This is efficient because containers waste neither CPU power nor memory
on kernel emulation. Container run-time costs are usually negligible.
But there are also some drawbacks you need to consider:

* You can only run Linux-based operating systems inside containers, i.e.
  it is not possible to run FreeBSD or MS Windows inside.

* For security reasons, access to host resources needs to be
  restricted. This is done with AppArmor, Seccomp filters and other
  kernel features. Be prepared that some syscalls are not allowed
  inside containers.

{pve} uses https://linuxcontainers.org/[LXC] as its underlying container
technology. We consider LXC a low-level library that provides
countless options. It would be too difficult to use those tools
directly. Instead, we provide a small wrapper called `pct`, the
"Proxmox Container Toolkit".

The toolkit is tightly coupled with {pve}. That means that it is aware
of the cluster setup, and it can use the same network and storage
resources as fully virtualized VMs. You can even use the {pve}
firewall, or manage containers using the HA framework.

Our primary goal is to offer an environment like the one you would get
from a VM, but without the additional overhead. We call this "System
Containers".

NOTE: If you want to run micro-containers (with docker, rkt, ...), it
is best to run them inside a VM.


Technology Overview
-------------------

* LXC (https://linuxcontainers.org/)

* Integrated into {pve} graphical user interface (GUI)

* Easy-to-use command line tool `pct`

* Access via {pve} REST API

* lxcfs to provide containerized /proc file system

* AppArmor/Seccomp to improve security

* CRIU: for live migration (planned)

* Use latest available kernels (4.4.X)

* Image based deployment (templates)

* Use {pve} storage library

* Container setup from host (network, DNS, storage, ...)


Security Considerations
-----------------------

Containers use the same kernel as the host, so there is a big attack
surface for malicious users. You should consider this fact if you
provide containers to totally untrusted people. In general, fully
virtualized VMs provide better isolation.

The good news is that LXC uses many kernel security features like
AppArmor, cgroups, and PID and user namespaces, which make container
usage quite secure. We distinguish two types of containers:


Privileged Containers
~~~~~~~~~~~~~~~~~~~~~

Security is enforced by dropping capabilities, using mandatory access
control (AppArmor), Seccomp filters and namespaces. The LXC team
considers this kind of container unsafe, and they will not consider
new container escape exploits to be security issues worthy of a CVE
and quick fix. So you should use this kind of container only inside a
trusted environment, or when no untrusted task is running as root in
the container.


Unprivileged Containers
~~~~~~~~~~~~~~~~~~~~~~~

This kind of container uses a kernel feature called user
namespaces. The root UID 0 inside the container is mapped to an
unprivileged user outside the container. This means that most security
issues (container escape, resource abuse, ...) in those containers
will affect a random unprivileged user, and so would be a generic
kernel security bug rather than an LXC issue. The LXC team thinks
unprivileged containers are safe by design.


Guest Operating System Configuration
------------------------------------

We normally try to detect the operating system type inside the
container, and then modify some files inside the container to make
them work as expected. Here is a short list of things we do at
container startup:

set /etc/hostname:: to set the container name

modify /etc/hosts:: to allow lookup of the local hostname

network setup:: pass the complete network setup to the container

configure DNS:: pass information about DNS servers

adapt the init system:: for example, fix the number of spawned getty processes

set the root password:: when creating a new container

rewrite ssh_host_keys:: so that each container has unique keys

randomize crontab:: so that cron does not start at the same time on all containers

Changes made by {pve} are enclosed by comment markers:

----
# --- BEGIN PVE ---
<data>
# --- END PVE ---
----

Those markers will be inserted at a reasonable location in the
file. If such a section already exists, it will be updated in place
and will not be moved.
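The in-place update described above can be pictured with a small shell sketch. This is not the code {pve} runs: the function name `update_pve_section` is hypothetical, and appending a missing section to the end of the file is only an assumption about what a "reasonable location" could be.

```shell
# Hypothetical helper, not part of pct: update or append a PVE-managed section.
update_pve_section() {
    file=$1
    PVE_DATA=$2
    export PVE_DATA
    if grep -q '^# --- BEGIN PVE ---$' "$file"; then
        # Keep both markers where they are and swap only the lines between them.
        awk '
            /^# --- BEGIN PVE ---$/ { print; print ENVIRON["PVE_DATA"]; skip = 1; next }
            /^# --- END PVE ---$/   { skip = 0 }
            !skip                   { print }
        ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        # Assumption: appending at the end counts as a reasonable location.
        printf '%s\n%s\n%s\n' '# --- BEGIN PVE ---' "$PVE_DATA" '# --- END PVE ---' >> "$file"
    fi
}
```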

Modification of a file can be prevented by adding a `.pve-ignore.`
file for it. For instance, if the file `/etc/.pve-ignore.hosts`
exists then the `/etc/hosts` file will not be touched. This can be a
simple empty file created via:

 # touch /etc/.pve-ignore.hosts

Most modifications are OS dependent, so they differ between
distributions and versions. You can completely disable modifications
by manually setting the `ostype` to `unmanaged`.

OS type detection is done by testing for certain files inside the
container:

Ubuntu:: inspect /etc/lsb-release (`DISTRIB_ID=Ubuntu`)

Debian:: test /etc/debian_version

Fedora:: test /etc/fedora-release

RedHat or CentOS:: test /etc/redhat-release

ArchLinux:: test /etc/arch-release

Alpine:: test /etc/alpine-release

Gentoo:: test /etc/gentoo-release

NOTE: Container start fails if the configured `ostype` differs from the
auto-detected type.


[[pct_container_images]]
Container Images
----------------

Container images, sometimes also referred to as ``templates'' or
``appliances'', are `tar` archives which contain everything to run a
container. You can think of it as a tidy container backup. Like most
modern container toolkits, `pct` uses those images when you create a
new container, for example:

 pct create 999 local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz

{pve} itself ships a set of basic templates for most common
operating systems, and you can download them using the `pveam` (short
for {pve} Appliance Manager) command line utility. You can also
download https://www.turnkeylinux.org/[TurnKey Linux] containers using
that tool (or the graphical user interface).

Our image repositories contain a list of available images, and a cron
job runs each day to download that list. You can trigger that
update manually with:

 pveam update

After that you can view the list of available images using:

 pveam available

You can restrict this large list by specifying the `section` you are
interested in, for example basic `system` images:

.List available system images
----
# pveam available --section system
system          archlinux-base_2015-24-29-1_x86_64.tar.gz
system          centos-7-default_20160205_amd64.tar.xz
system          debian-6.0-standard_6.0-7_amd64.tar.gz
system          debian-7.0-standard_7.0-3_amd64.tar.gz
system          debian-8.0-standard_8.0-1_amd64.tar.gz
system          ubuntu-12.04-standard_12.04-1_amd64.tar.gz
system          ubuntu-14.04-standard_14.04-1_amd64.tar.gz
system          ubuntu-15.04-standard_15.04-1_amd64.tar.gz
system          ubuntu-15.10-standard_15.10-1_amd64.tar.gz
----

Before you can use such a template, you need to download it into one
of your storages. You can simply use storage `local` for that
purpose. For clustered installations, it is preferred to use a shared
storage so that all nodes can access those images.

 pveam download local debian-8.0-standard_8.0-1_amd64.tar.gz

You are now ready to create containers using that image, and you can
list all downloaded images on storage `local` with:

----
# pveam list local
local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz  190.20MB
----

The above command shows you the full {pve} volume identifiers. They include
the storage name, and most other {pve} commands can use them. For
example you can delete that image later with:

 pveam remove local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz
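Because a volume identifier starts with the storage name followed by a colon, a script can split it with plain shell parameter expansion. The snippet below is a minimal sketch; the variable names are chosen for this example only.

```shell
# Split a volume identifier into its storage name and volume name.
volid='local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz'
storage=${volid%%:*}      # text before the first colon
volume=${volid#*:}        # text after the first colon
echo "storage=$storage"
echo "volume=$volume"
```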


[[pct_container_storage]]
Container Storage
-----------------

Traditional containers use a very simple storage model, only allowing
a single mount point, the root file system. This was further
restricted to specific file system types like `ext4` and `nfs`.
Additional mounts are often done by user-provided scripts. This turned
out to be complex and error-prone, so we try to avoid that now.

Our new LXC-based container model is more flexible regarding
storage. First, you can have more than a single mount point. This
allows you to choose a suitable storage for each application. For
example, you can use a relatively slow (and thus cheap) storage for
the container root file system. Then you can use a second mount point
to mount a very fast, distributed storage for your database
application. See section <<pct_mount_points,Mount Points>> for further
details.

The second big improvement is that you can use any storage type
supported by the {pve} storage library. That means that you can store
your containers on local `lvmthin` or `zfs`, shared `iSCSI` storage,
or even on distributed storage systems like `ceph`. It also enables us
to use advanced storage features like snapshots and clones. `vzdump`
can also use the snapshot feature to provide consistent container
backups.

Last but not least, you can also mount local devices directly, or
mount local directories using bind mounts. That way you can access
local storage inside containers with zero overhead. Such bind mounts
also provide an easy way to share data between different containers.


FUSE Mounts
~~~~~~~~~~~

WARNING: Because of existing issues in the Linux kernel's freezer
subsystem, the usage of FUSE mounts inside a container is strongly
advised against, as containers need to be frozen for suspend or
snapshot mode backups.

If FUSE mounts cannot be replaced by other mounting mechanisms or storage
technologies, it is possible to establish the FUSE mount on the Proxmox host
and use a bind mount point to make it accessible inside the container.


Using Quotas Inside Containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Quotas allow you to set limits inside a container for the amount of disk
space that each user can use. This only works on ext4 image based
storage types and currently does not work with unprivileged
containers.

Activating the `quota` option causes the following mount options to be
used for a mount point:
`usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0`

This allows quotas to be used like you would on any other system. You
can initialize the `/aquota.user` and `/aquota.group` files by running

----
quotacheck -cmug /
quotaon /
----

and edit the quotas via the `edquota` command. Refer to the documentation
of the distribution running inside the container for details.

NOTE: You need to run the above commands for every mount point by passing
the mount point's path instead of just `/`.


Using ACLs Inside Containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The standard POSIX **A**ccess **C**ontrol **L**ists are also available inside
containers. ACLs allow you to set more detailed file permissions than the
traditional user/group/others model.


[[pct_setting]]
Container Settings
------------------


[[pct_mount_points]]
Mount Points
~~~~~~~~~~~~

The root mount point is configured with the `rootfs` property, and you can
configure up to 10 additional mount points. The corresponding options
are called `mp0` to `mp9`, and they can contain the following settings:

include::pct-mountpoint-opts.adoc[]

Currently there are basically three types of mount points: storage backed
mount points, bind mounts and device mounts.

.Typical container `rootfs` configuration
----
rootfs: thin1:base-100-disk-1,size=8G
----


Storage Backed Mount Points
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Storage backed mount points are managed by the {pve} storage subsystem and come
in three different flavors:

- Image based: these are raw images containing a single ext4 formatted file
  system.
- ZFS subvolumes: these are technically bind mounts, but with managed storage,
  and thus allow resizing and snapshotting.
- Directories: passing `size=0` triggers a special case where instead of a raw
  image a directory is created.


Bind Mount Points
^^^^^^^^^^^^^^^^^

Bind mounts allow you to access arbitrary directories from your Proxmox VE host
inside a container. Some potential use cases are:

- Accessing your home directory in the guest
- Accessing a USB device directory in the guest
- Accessing an NFS mount from the host in the guest

Bind mounts are not managed by the storage subsystem, so you cannot make
snapshots or deal with quotas from inside the container. With unprivileged
containers you might run into permission problems caused by the user mapping,
and you cannot use ACLs.

NOTE: The contents of bind mount points are not backed up when using `vzdump`.

WARNING: For security reasons, bind mounts should only be established
using source directories especially reserved for this purpose, e.g., a
directory hierarchy under `/mnt/bindmounts`. Never bind mount system
directories like `/`, `/var` or `/etc` into a container; this poses a
great security risk.

NOTE: The bind mount source path must not contain any symlinks.

For example, to make the directory `/mnt/bindmounts/shared` accessible in the
container with ID `100` under the path `/shared`, use a configuration line like
`mp0: /mnt/bindmounts/shared,mp=/shared` in `/etc/pve/lxc/100.conf`.
Alternatively, use `pct set 100 -mp0 /mnt/bindmounts/shared,mp=/shared` to
achieve the same result.
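Since the source path must not contain symlinks, it can help to check a candidate directory on the host before adding it to the configuration. The helper below is a hypothetical pre-flight check, not something `pct` provides. It assumes an absolute path without a trailing slash and a `readlink` that supports `-f` (GNU coreutils or BusyBox).

```shell
# Hypothetical pre-flight check, not a pct feature.
bind_source_ok() {
    src=$1
    [ -d "$src" ] || return 1
    # readlink -f resolves every symlink; any difference means one was present.
    [ "$(readlink -f "$src")" = "$src" ]
}

# Example call using the path from the text above.
if bind_source_ok /mnt/bindmounts/shared; then
    echo "source path is usable"
else
    echo "source path is missing or contains a symlink"
fi
```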


Device Mount Points
^^^^^^^^^^^^^^^^^^^

Device mount points allow you to mount block devices of the host directly into
the container. Similar to bind mounts, device mounts are not managed by {pve}'s
storage subsystem, but the `quota` and `acl` options will be honored.

NOTE: Device mount points should only be used under special circumstances. In
most cases a storage backed mount point offers the same performance and a lot
more features.

NOTE: The contents of device mount points are not backed up when using `vzdump`.


[[pct_container_network]]
Container Network
~~~~~~~~~~~~~~~~~

You can configure up to 10 network interfaces for a single
container. The corresponding options are called `net0` to `net9`, and
they can contain the following settings:

include::pct-network-opts.adoc[]


Backup and Restore
------------------


Container Backup
~~~~~~~~~~~~~~~~

It is possible to use the `vzdump` tool for container backup. Please
refer to the `vzdump` manual page for details.


Restoring Container Backups
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Restoring container backups made with `vzdump` is possible using the
`pct restore` command. By default, `pct restore` will attempt to restore as much
of the backed up container configuration as possible. It is possible to override
the backed up configuration by manually setting container options on the command
line (see the `pct` manual page for details).

NOTE: `pvesm extractconfig` can be used to view the backed up configuration
contained in a vzdump archive.

There are two basic restore modes, only differing by their handling of mount
points:


``Simple'' Restore Mode
^^^^^^^^^^^^^^^^^^^^^^^

If neither the `rootfs` parameter nor any of the optional `mpX` parameters
are explicitly set, the mount point configuration from the backed up
configuration file is restored using the following steps:

. Extract mount points and their options from backup
. Create volumes for storage backed mount points (on storage provided with the
`storage` parameter, or default local storage if unset)
. Extract files from backup archive
. Add bind and device mount points to restored configuration (limited to root user)

NOTE: Since bind and device mount points are never backed up, no files are
restored in the last step, but only the configuration options. The assumption
is that such mount points are either backed up with another mechanism (e.g.,
NFS space that is bind mounted into many containers), or not intended to be
backed up at all.

This simple mode is also used by the container restore operations in the web
interface.


``Advanced'' Restore Mode
^^^^^^^^^^^^^^^^^^^^^^^^^

By setting the `rootfs` parameter (and optionally, any combination of `mpX`
parameters), the `pct restore` command is automatically switched into an
advanced mode. This advanced mode completely ignores the `rootfs` and `mpX`
configuration options contained in the backup archive, and instead only
uses the options explicitly provided as parameters.

This mode allows flexible configuration of mount point settings at restore time,
for example:

* Set target storages, volume sizes and other options for each mount point
individually
* Redistribute backed up files according to new mount point scheme
* Restore to device and/or bind mount points (limited to root user)
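The rule that selects between the two modes can be summarized in a few lines of shell. This is only a sketch of the rule as described in this section: the function name `restore_mode` is hypothetical, it recognizes only the separate `-rootfs` and `-mpX` option forms, and it reports `undefined` for `mpX` without `rootfs`, because the text does not say which mode applies in that case.

```shell
# Sketch: classify a pct restore parameter list by restore mode.
restore_mode() {
    has_rootfs=0
    has_mp=0
    for arg in "$@"; do
        case $arg in
            -rootfs|--rootfs)       has_rootfs=1 ;;
            -mp[0-9]|--mp[0-9])     has_mp=1 ;;
        esac
    done
    if [ "$has_rootfs" -eq 1 ]; then
        echo advanced
    elif [ "$has_mp" -eq 1 ]; then
        echo undefined    # mpX without rootfs is not covered by this section
    else
        echo simple
    fi
}
```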


Managing Containers with `pct`
------------------------------

`pct` is the tool to manage Linux Containers on {pve}. You can create
and destroy containers, and control execution (start, stop, migrate,
...). You can use `pct` to set parameters in the associated config file,
like network configuration or memory limits.


CLI Usage Examples
~~~~~~~~~~~~~~~~~~

Create a container based on a Debian template (provided you have
already downloaded the template via the web interface)

 pct create 100 /var/lib/vz/template/cache/debian-8.0-standard_8.0-1_amd64.tar.gz

Start container 100

 pct start 100

Start a login session via getty

 pct console 100

Enter the LXC namespace and run a shell as root user

 pct enter 100

Display the configuration

 pct config 100

Add a network interface called `eth0`, bridged to the host bridge `vmbr0`,
set the address and gateway, while it's running

 pct set 100 -net0 name=eth0,bridge=vmbr0,ip=192.168.15.147/24,gw=192.168.15.1

Reduce the memory of the container to 512MB

 pct set 100 -memory 512


Obtaining Debugging Logs
~~~~~~~~~~~~~~~~~~~~~~~~

In case `pct start` is unable to start a specific container, it might be
helpful to collect debugging output by running `lxc-start` (replace `ID` with
the container's ID):

 lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log

This command will attempt to start the container in foreground mode. To stop
the container, run `pct shutdown ID` or `pct stop ID` in a second terminal.

The collected debug log is written to `/tmp/lxc-ID.log`.

NOTE: If you have changed the container's configuration since the last start
attempt with `pct start`, you need to run `pct start` at least once to also
update the configuration used by `lxc-start`.


[[pct_configuration]]
Configuration
-------------

The `/etc/pve/lxc/<CTID>.conf` file stores container configuration,
where `<CTID>` is the numeric ID of the given container. Like all
other files stored inside `/etc/pve/`, it is automatically
replicated to all other cluster nodes.

NOTE: CTIDs < 100 are reserved for internal purposes, and CTIDs need to be
unique cluster wide.
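A script that accepts a CTID from a user can apply the first half of this rule before calling `pct`. The function below is a hypothetical helper. It only checks that the value is a number of at least 100 and does not check uniqueness within the cluster.

```shell
# Hypothetical check mirroring the note above: numeric and not below 100.
valid_ctid() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -ge 100 ]
}

valid_ctid 100 && echo "100 is usable"
valid_ctid 42  || echo "42 is reserved"
```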

.Example Container Configuration
----
ostype: debian
arch: amd64
hostname: www
memory: 512
swap: 512
net0: bridge=vmbr0,hwaddr=66:64:66:64:64:36,ip=dhcp,name=eth0,type=veth
rootfs: local:107/vm-107-disk-1.raw,size=7G
----

Those configuration files are simple text files, and you can edit them
using a normal text editor (`vi`, `nano`, ...). This is sometimes
useful to do small corrections, but keep in mind that you need to
restart the container to apply such changes.

For that reason, it is usually better to use the `pct` command to
generate and modify those files, or do the whole thing using the GUI.
Our toolkit is smart enough to instantaneously apply most changes to
running containers. This feature is called "hot plug", and there is no
need to restart the container in that case.


File Format
~~~~~~~~~~~

Container configuration files use a simple colon-separated key/value
format. Each line has the following format:

-----
# this is a comment
OPTION: value
-----

Blank lines in those files are ignored, and lines starting with a `#`
character are treated as comments and are also ignored.

It is possible to add low-level, LXC style configuration directly, for
example:

 lxc.init_cmd: /sbin/my_own_init

or

 lxc.init_cmd = /sbin/my_own_init

Those settings are directly passed to the LXC low-level tools.
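Because the format is line based, a single option can be read with standard tools. The sketch below uses a hypothetical function `ct_conf_get` that prints the value of one `OPTION: value` line. It splits on the first colon only, so values that contain colons (such as `net0`) stay intact. It stops at the first line starting with `[`, which is an assumption that only the current configuration is of interest, and it does not handle the `key = value` LXC form.

```shell
# Sketch: print the value of option $2 from container config file $1.
ct_conf_get() {
    awk -v key="$2" '
        /^\[/        { exit }    # stop at the first named section
        /^[ \t]*#/   { next }    # skip comments
        /^[ \t]*$/   { next }    # skip blank lines
        {
            pos = index($0, ":")
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            sub(/^[ \t]+/, "", v)
            if (k == key) { print v; exit }
        }
    ' "$1"
}
```

For example, `ct_conf_get /etc/pve/lxc/100.conf memory` would print the configured memory limit of container 100.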


[[pct_snapshots]]
Snapshots
~~~~~~~~~

When you create a snapshot, `pct` stores the configuration at snapshot
time into a separate snapshot section within the same configuration
file. For example, after creating a snapshot called ``testsnapshot'',
your configuration file will look like this:

.Container configuration with snapshot
----
memory: 512
swap: 512
parent: testsnapshot
...

[testsnapshot]
memory: 512
swap: 512
snaptime: 1457170803
...
----

There are a few snapshot related properties like `parent` and
`snaptime`. The `parent` property is used to store the parent/child
relationship between snapshots. `snaptime` is the snapshot creation
time stamp (Unix epoch).
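The section layout makes it easy to list snapshots straight from the configuration file. The following sketch uses a hypothetical function `list_snapshots` that prints each section name together with its `snaptime` value. It assumes that every snapshot section carries its own `snaptime` line, as in the example above.

```shell
# Sketch: list snapshot sections and their creation time from a config file.
list_snapshots() {
    awk '
        /^\[.*\]$/                 { name = substr($0, 2, length($0) - 2); next }
        name != "" && /^snaptime:/ { print name, $2 }
    ' "$1"
}
```

On a system with GNU `date`, an epoch value printed this way can be converted to a readable time stamp with `date -d @<snaptime>`.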


[[pct_options]]
Options
~~~~~~~

include::pct.conf.5-opts.adoc[]


Locks
-----

Container migrations, snapshots and backups (`vzdump`) set a lock to
prevent incompatible concurrent actions on the affected container. Sometimes
you need to remove such a lock manually (e.g., after a power failure).

 pct unlock <CTID>

CAUTION: Only do that if you are sure the action which set the lock is
no longer running.


ifdef::manvolnum[]

Files
-----

`/etc/pve/lxc/<CTID>.conf`::

Configuration file for the container '<CTID>'.


include::pve-copyright.adoc[]
endif::manvolnum[]