1 [[chapter_pct]]
2 ifdef::manvolnum[]
3 pct(1)
4 ======
5 include::attributes.txt[]
6 :pve-toplevel:
7
8 NAME
9 ----
10
11 pct - Tool to manage Linux Containers (LXC) on Proxmox VE
12
13
14 SYNOPSIS
15 --------
16
17 include::pct.1-synopsis.adoc[]
18
19 DESCRIPTION
20 -----------
21 endif::manvolnum[]
22
23 ifndef::manvolnum[]
24 Proxmox Container Toolkit
25 =========================
26 include::attributes.txt[]
27 :pve-toplevel:
28 endif::manvolnum[]
29 ifdef::wiki[]
30 :title: Linux Container
31 endif::wiki[]
32
33 Containers are a lightweight alternative to fully virtualized
34 VMs. Instead of emulating a complete Operating System (OS), containers
35 simply use the OS of the host they run on. This implies that all
36 containers use the same kernel, and that they can access resources
37 from the host directly.
38
39 This is great because containers waste neither CPU power nor memory on
40 kernel emulation. Container run-time costs are close to zero and
41 usually negligible. But there are also some drawbacks you need to
42 consider:
43
44 * You can only run Linux-based operating systems inside containers, i.e. it is
45 not possible to run FreeBSD or MS Windows inside.
46
47 * For security reasons, access to host resources needs to be
48 restricted. This is done with AppArmor, SecComp filters and other
49 kernel features. Be prepared that some syscalls are not allowed
50 inside containers.
51
52 {pve} uses https://linuxcontainers.org/[LXC] as its underlying container
53 technology. We consider LXC a low-level library that provides
54 countless options. It would be too cumbersome to use those low-level
55 tools directly. Instead, we provide a small wrapper called `pct`, the
56 "Proxmox Container Toolkit".
57
58 The toolkit is tightly coupled with {pve}. That means that it is aware
59 of the cluster setup, and it can use the same network and storage
60 resources as fully virtualized VMs. You can even use the {pve}
61 firewall, or manage containers using the HA framework.
62
63 Our primary goal is to offer an environment as one would get from a
64 VM, but without the additional overhead. We call this "System
65 Containers".
66
67 NOTE: If you want to run micro-containers (with docker, rkt, ...), it
68 is best to run them inside a VM.
69
70
71 Technology Overview
72 -------------------
73
74 * LXC (https://linuxcontainers.org/)
75
76 * Integrated into {pve} graphical user interface (GUI)
77
78 * Easy to use command line tool `pct`
79
80 * Access via {pve} REST API
81
82 * lxcfs to provide containerized /proc file system
83
84 * AppArmor/Seccomp to improve security
85
86 * CRIU: for live migration (planned)
87
88 * Use latest available kernels (4.4.X)
89
90 * Image based deployment (templates)
91
92 * Use {pve} storage library
93
94 * Container setup from host (network, DNS, storage, ...)
95
96
97 Security Considerations
98 -----------------------
99
100 Containers use the same kernel as the host, so there is a big attack
101 surface for malicious users. You should consider this fact if you
102 provide containers to totally untrusted people. In general, fully
103 virtualized VMs provide better isolation.
104
105 The good news is that LXC uses many kernel security features like
106 AppArmor, CGroups, and PID and user namespaces, which make container
107 usage quite secure. We distinguish two types of containers:
108
109
110 Privileged Containers
111 ~~~~~~~~~~~~~~~~~~~~~
112
113 Security is achieved by dropping capabilities, using mandatory access
114 control (AppArmor), SecComp filters and namespaces. The LXC team
115 considers this kind of container unsafe, and they will not consider
116 new container escape exploits to be security issues worthy of a CVE
117 and a quick fix. So you should use this kind of container only inside a
118 trusted environment, or when no untrusted task is running as root in
119 the container.
120
121
122 Unprivileged Containers
123 ~~~~~~~~~~~~~~~~~~~~~~~
124
125 This kind of container uses a new kernel feature called user
126 namespaces. The root UID 0 inside the container is mapped to an
127 unprivileged user outside the container. This means that most security
128 issues (container escape, resource abuse, ...) in those containers
129 will affect a random unprivileged user, and so would be a generic
130 kernel security bug rather than an LXC issue. The LXC team thinks
131 unprivileged containers are safe by design.
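
For example (a rough sketch only; the container ID and template name are placeholders, and the template must already be present on storage `local`), an unprivileged container can be created by setting the `unprivileged` option at creation time:

----
# create CT 200 as an unprivileged container
pct create 200 local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz -unprivileged 1
----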
132
133
134 Guest Operating System Configuration
135 ------------------------------------
136
137 We normally try to detect the operating system type inside the
138 container, and then modify some files inside the container to make
139 them work as expected. Here is a short list of things we do at
140 container startup:
141
142 set /etc/hostname:: to set the container name
143
144 modify /etc/hosts:: to allow lookup of the local hostname
145
146 network setup:: pass the complete network setup to the container
147
148 configure DNS:: pass information about DNS servers
149
150 adapt the init system:: for example, fix the number of spawned getty processes
151
152 set the root password:: when creating a new container
153
154 rewrite ssh_host_keys:: so that each container has unique keys
155
156 randomize crontab:: so that cron does not start at the same time on all containers
157
158 Changes made by {PVE} are enclosed by comment markers:
159
160 ----
161 # --- BEGIN PVE ---
162 <data>
163 # --- END PVE ---
164 ----
165
166 Those markers will be inserted at a reasonable location in the
167 file. If such a section already exists, it will be updated in place
168 and will not be moved.
169
170 Modification of a file can be prevented by adding a `.pve-ignore.`
171 file for it. For instance, if the file `/etc/.pve-ignore.hosts`
172 exists then the `/etc/hosts` file will not be touched. This can be a
173 simple empty file created via:
174
175 # touch /etc/.pve-ignore.hosts
176
177 Most modifications are OS dependent, so they differ between different
178 distributions and versions. You can completely disable modifications
179 by manually setting the `ostype` to `unmanaged`.
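
For example (a minimal sketch, assuming a container with the hypothetical ID `100`), this can be done by adding the option to the container's configuration file:

----
# /etc/pve/lxc/100.conf
ostype: unmanaged
----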
180
181 OS type detection is done by testing for certain files inside the
182 container:
183
184 Ubuntu:: inspect /etc/lsb-release (`DISTRIB_ID=Ubuntu`)
185
186 Debian:: test /etc/debian_version
187
188 Fedora:: test /etc/fedora-release
189
190 RedHat or CentOS:: test /etc/redhat-release
191
192 ArchLinux:: test /etc/arch-release
193
194 Alpine:: test /etc/alpine-release
195
196 Gentoo:: test /etc/gentoo-release
197
198 NOTE: Container start fails if the configured `ostype` differs from the auto
199 detected type.
200
201
202 [[pct_container_images]]
203 Container Images
204 ----------------
205
206 Container images, sometimes also referred to as ``templates'' or
207 ``appliances'', are `tar` archives which contain everything to run a
208 container. You can think of them as tidy container backups. Like most
209 modern container toolkits, `pct` uses those images when you create a
210 new container, for example:
211
212 pct create 999 local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz
213
214 {pve} itself ships a set of basic templates for most common
215 operating systems, and you can download them using the `pveam` (short
216 for {pve} Appliance Manager) command line utility. You can also
217 download https://www.turnkeylinux.org/[TurnKey Linux] containers using
218 that tool (or the graphical user interface).
219
220 Our image repositories contain a list of available images, and there
221 is a cron job run each day to download that list. You can trigger that
222 update manually with:
223
224 pveam update
225
226 After that you can view the list of available images using:
227
228 pveam available
229
230 You can restrict this large list by specifying the `section` you are
231 interested in, for example basic `system` images:
232
233 .List available system images
234 ----
235 # pveam available --section system
236 system archlinux-base_2015-24-29-1_x86_64.tar.gz
237 system centos-7-default_20160205_amd64.tar.xz
238 system debian-6.0-standard_6.0-7_amd64.tar.gz
239 system debian-7.0-standard_7.0-3_amd64.tar.gz
240 system debian-8.0-standard_8.0-1_amd64.tar.gz
241 system ubuntu-12.04-standard_12.04-1_amd64.tar.gz
242 system ubuntu-14.04-standard_14.04-1_amd64.tar.gz
243 system ubuntu-15.04-standard_15.04-1_amd64.tar.gz
244 system ubuntu-15.10-standard_15.10-1_amd64.tar.gz
245 ----
246
247 Before you can use such a template, you need to download it into one
248 of your storages. You can simply use storage `local` for that
249 purpose. For clustered installations, it is preferred to use a shared
250 storage so that all nodes can access those images.
251
252 pveam download local debian-8.0-standard_8.0-1_amd64.tar.gz
253
254 You are now ready to create containers using that image, and you can
255 list all downloaded images on storage `local` with:
256
257 ----
258 # pveam list local
259 local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz 190.20MB
260 ----
261
262 The above command shows you the full {pve} volume identifiers. They include
263 the storage name, and most other {pve} commands can use them. For
264 example you can delete that image later with:
265
266 pveam remove local:vztmpl/debian-8.0-standard_8.0-1_amd64.tar.gz
267
268
269 [[pct_container_storage]]
270 Container Storage
271 -----------------
272
273 Traditional containers use a very simple storage model, only allowing
274 a single mount point, the root file system. This was further
275 restricted to specific file system types like `ext4` and `nfs`.
276 Additional mounts were often done by user-provided scripts. This turned
277 out to be complex and error prone, so we try to avoid that now.
278
279 Our new LXC based container model is more flexible regarding
280 storage. First, you can have more than a single mount point. This
281 allows you to choose a suitable storage for each application. For
282 example, you can use a relatively slow (and thus cheap) storage for
283 the container root file system. Then you can use a second mount point
284 to mount a very fast, distributed storage for your database
285 application.
286
287 The second big improvement is that you can use any storage type
288 supported by the {pve} storage library. That means that you can store
289 your containers on local `lvmthin` or `zfs`, shared `iSCSI` storage,
290 or even on distributed storage systems like `ceph`. It also enables us
291 to use advanced storage features like snapshots and clones. `vzdump`
292 can also use the snapshot feature to provide consistent container
293 backups.
294
295 Last but not least, you can also mount local devices directly, or
296 mount local directories using bind mounts. That way you can access
297 local storage inside containers with zero overhead. Such bind mounts
298 also provide an easy way to share data between different containers.
299
300
301 Mount Points
302 ~~~~~~~~~~~~
303
304 The root mount point is configured with the `rootfs` property, and you can
305 configure up to 10 additional mount points. The corresponding options
306 are called `mp0` to `mp9`, and they can contain the following settings:
307
308 include::pct-mountpoint-opts.adoc[]
309
310 Currently there are basically three types of mount points: storage backed
311 mount points, bind mounts and device mounts.
312
313 .Typical container `rootfs` configuration
314 ----
315 rootfs: thin1:base-100-disk-1,size=8G
316 ----
317
318
319 Storage Backed Mount Points
320 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
321
322 Storage backed mount points are managed by the {pve} storage subsystem and come
323 in three different flavors:
324
325 - Image based: these are raw images containing a single ext4 formatted file
326 system.
327 - ZFS subvolumes: these are technically bind mounts, but with managed storage,
328 and thus allow resizing and snapshotting.
329 - Directories: passing `size=0` triggers a special case where instead of a raw
330 image a directory is created.
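
As an illustrative sketch (storage name, size and mount path are examples only), a new storage-backed mount point can be added with `pct set`, specifying the volume as `<storage>:<size in GB>`:

----
# create an 8 GB volume on storage 'local-lvm' and mount it at /mnt/data in CT 100
pct set 100 -mp0 local-lvm:8,mp=/mnt/data
----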
331
332
333 Bind Mount Points
334 ^^^^^^^^^^^^^^^^^
335
336 Bind mounts allow you to access arbitrary directories from your Proxmox VE host
337 inside a container. Some potential use cases are:
338
339 - Accessing your home directory in the guest
340 - Accessing a USB device directory in the guest
341 - Accessing an NFS mount from the host in the guest
342
343 Bind mounts are not managed by the storage subsystem, so you
344 cannot make snapshots or deal with quotas from inside the container. With
345 unprivileged containers you might run into permission problems caused by the
346 user mapping, and you cannot use ACLs.
347
348 NOTE: The contents of bind mount points are not backed up when using `vzdump`.
349
350 WARNING: For security reasons, bind mounts should only be established
351 using source directories especially reserved for this purpose, e.g., a
352 directory hierarchy under `/mnt/bindmounts`. Never bind mount system
353 directories like `/`, `/var` or `/etc` into a container - this poses a
354 great security risk.
355
356 NOTE: The bind mount source path must not contain any symlinks.
357
358 For example, to make the directory `/mnt/bindmounts/shared` accessible in the
359 container with ID `100` under the path `/shared`, use a configuration line like
360 `mp0: /mnt/bindmounts/shared,mp=/shared` in `/etc/pve/lxc/100.conf`.
361 Alternatively, use `pct set 100 -mp0 /mnt/bindmounts/shared,mp=/shared` to
362 achieve the same result.
363
364
365 Device Mount Points
366 ^^^^^^^^^^^^^^^^^^^
367
368 Device mount points allow mounting block devices of the host directly into the
369 container. Similar to bind mounts, device mounts are not managed by {PVE}'s
370 storage subsystem, but the `quota` and `acl` options will be honored.
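
For example (a sketch only; the device path and mount path are placeholders), a host block device is passed through by using its device path as the volume:

----
# mount the host block device /dev/sdb1 at /mnt/device inside CT 100
pct set 100 -mp0 /dev/sdb1,mp=/mnt/device
----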
371
372 NOTE: Device mount points should only be used under special circumstances. In
373 most cases a storage backed mount point offers the same performance and a lot
374 more features.
375
376 NOTE: The contents of device mount points are not backed up when using `vzdump`.
377
378
379 FUSE Mounts
380 ~~~~~~~~~~~
381
382 WARNING: Because of existing issues in the Linux kernel's freezer
383 subsystem, the usage of FUSE mounts inside a container is strongly
384 advised against, as containers need to be frozen for suspend or
385 snapshot mode backups.
386
387 If FUSE mounts cannot be replaced by other mounting mechanisms or storage
388 technologies, it is possible to establish the FUSE mount on the Proxmox host
389 and use a bind mount point to make it accessible inside the container.
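
A sketch of this workaround (the `sshfs` remote and all paths are purely illustrative, and `sshfs` must be installed on the host):

----
# on the Proxmox host: establish the FUSE mount
sshfs user@fileserver:/export /mnt/bindmounts/fuse
# expose it to CT 100 via a bind mount
pct set 100 -mp0 /mnt/bindmounts/fuse,mp=/mnt/fuse
----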
390
391
392 Using Quotas Inside Containers
393 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
394
395 Quotas allow you to set limits inside a container on the amount of disk
396 space that each user can use. This only works on ext4 image based
397 storage types and currently does not work with unprivileged
398 containers.
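
For example (a sketch; this creates a new storage-backed mount point on storage `local`), the `quota` flag is enabled per mount point:

----
# create a new 8 GB mount point with quota support for CT 100
pct set 100 -mp0 local:8,mp=/srv/data,quota=1
----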
399
400 Activating the `quota` option causes the following mount options to be
401 used for a mount point:
402 `usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0`
403
404 This allows quotas to be used like you would on any other system. You
405 can initialize the `/aquota.user` and `/aquota.group` files by running
406
407 ----
408 quotacheck -cmug /
409 quotaon /
410 ----
411
412 and edit the quotas via the `edquota` command. Refer to the documentation
413 of the distribution running inside the container for details.
414
415 NOTE: You need to run the above commands for every mount point by passing
416 the mount point's path instead of just `/`.
417
418
419 Using ACLs Inside Containers
420 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
421
422 The standard POSIX **A**ccess **C**ontrol **L**ists are also available inside containers.
423 ACLs allow you to set more fine-grained file permissions than the traditional
424 user/group/others model.
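
To illustrate (a sketch; the mount point, user name and paths are placeholders), ACL support is enabled per mount point with the `acl` flag and then managed inside the container with the usual tools:

----
# on the host: create a new mount point for CT 100 with ACL support enabled
pct set 100 -mp0 local:8,mp=/srv/data,acl=1
# inside the container: grant the user 'backup' read access
setfacl -m u:backup:rx /srv/data
getfacl /srv/data
----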
425
426
427 [[pct_container_network]]
428 Container Network
429 -----------------
430
431 You can configure up to 10 network interfaces for a single
432 container. The corresponding options are called `net0` to `net9`, and
433 they can contain the following settings:
434
435 include::pct-network-opts.adoc[]
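
For example (a sketch; bridge name and addressing are examples only), a network interface can be added or changed with `pct set`:

----
# attach eth0 to bridge vmbr0 and configure the address via DHCP
pct set 100 -net0 name=eth0,bridge=vmbr0,ip=dhcp
----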
436
437
438 Backup and Restore
439 ------------------
440
441
442 Container Backup
443 ~~~~~~~~~~~~~~~~
444
445 It is possible to use the `vzdump` tool for container backup. Please
446 refer to the `vzdump` manual page for details.
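
For example (a sketch; backup mode and target storage depend on your setup), a single container can be backed up with:

----
vzdump 100 -mode snapshot -storage local
----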
447
448
449 Restoring Container Backups
450 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
451
452 Restoring container backups made with `vzdump` is possible using the
453 `pct restore` command. By default, `pct restore` will attempt to restore as much
454 of the backed up container configuration as possible. It is possible to override
455 the backed up configuration by manually setting container options on the command
456 line (see the `pct` manual page for details).
457
458 NOTE: `pvesm extractconfig` can be used to view the backed up configuration
459 contained in a vzdump archive.
460
461 There are two basic restore modes, only differing by their handling of mount
462 points:
463
464
465 ``Simple'' Restore Mode
466 ^^^^^^^^^^^^^^^^^^^^^^^
467
468 If neither the `rootfs` parameter nor any of the optional `mpX` parameters
469 are explicitly set, the mount point configuration from the backed up
470 configuration file is restored using the following steps:
471
472 . Extract mount points and their options from backup
473 . Create volumes for storage backed mount points (on storage provided with the
474 `storage` parameter, or default local storage if unset)
475 . Extract files from backup archive
476 . Add bind and device mount points to restored configuration (limited to root user)
477
478 NOTE: Since bind and device mount points are never backed up, no files are
479 restored in the last step, but only the configuration options. The assumption
480 is that such mount points are either backed up with another mechanism (e.g.,
481 NFS space that is bind mounted into many containers), or not intended to be
482 backed up at all.
483
484 This simple mode is also used by the container restore operations in the web
485 interface.
486
487
488 ``Advanced'' Restore Mode
489 ^^^^^^^^^^^^^^^^^^^^^^^^^
490
491 By setting the `rootfs` parameter (and optionally, any combination of `mpX`
492 parameters), the `pct restore` command is automatically switched into an
493 advanced mode. This advanced mode completely ignores the `rootfs` and `mpX`
494 configuration options contained in the backup archive, and instead only
495 uses the options explicitly provided as parameters.
496
497 This mode allows flexible configuration of mount point settings at restore time,
498 for example:
499
500 * Set target storages, volume sizes and other options for each mount point
501 individually
502 * Redistribute backed up files according to new mount point scheme
503 * Restore to device and/or bind mount points (limited to root user)
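
A sketch of such an invocation (the archive name, storages and sizes are illustrative only):

----
# restore into a new CT 123, ignoring the mount point layout stored in the backup
pct restore 123 local:backup/vzdump-lxc-100-2016_03_02-10_00_00.tar.gz \
    -rootfs local-lvm:8 \
    -mp0 local-lvm:16,mp=/srv/data
----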
504
505
506 Managing Containers with `pct`
507 ------------------------------
508
509 `pct` is the tool to manage Linux Containers on {pve}. You can create
510 and destroy containers, and control execution (start, stop, migrate,
511 ...). You can use `pct` to set parameters in the associated config file,
512 like network configuration or memory limits.
513
514
515 CLI Usage Examples
516 ~~~~~~~~~~~~~~~~~~
517
518 Create a container based on a Debian template (provided you have
519 already downloaded the template via the web interface)
520
521 pct create 100 /var/lib/vz/template/cache/debian-8.0-standard_8.0-1_amd64.tar.gz
522
523 Start container 100
524
525 pct start 100
526
527 Start a login session via getty
528
529 pct console 100
530
531 Enter the LXC namespace and run a shell as root user
532
533 pct enter 100
534
535 Display the configuration
536
537 pct config 100
538
539 Add a network interface called `eth0`, bridged to the host bridge `vmbr0`,
540 setting the address and gateway, while the container is running
541
542 pct set 100 -net0 name=eth0,bridge=vmbr0,ip=192.168.15.147/24,gw=192.168.15.1
543
544 Reduce the memory of the container to 512MB
545
546 pct set 100 -memory 512
547
548
549 Obtaining Debugging Logs
550 ~~~~~~~~~~~~~~~~~~~~~~~~
551
552 In case `pct start` is unable to start a specific container, it might be
553 helpful to collect debugging output by running `lxc-start` (replace `ID` with
554 the container's ID):
555
556 lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log
557
558 This command will attempt to start the container in foreground mode. To stop the container, run `pct shutdown ID` or `pct stop ID` in a second terminal.
559
560 The collected debug log is written to `/tmp/lxc-ID.log`.
561
562 NOTE: If you have changed the container's configuration since the last start
563 attempt with `pct start`, you need to run `pct start` at least once to also
564 update the configuration used by `lxc-start`.
565
566
567 [[pct_configuration]]
568 Configuration
569 -------------
570
571 The `/etc/pve/lxc/<CTID>.conf` file stores container configuration,
572 where `<CTID>` is the numeric ID of the given container. Like all
573 other files stored inside `/etc/pve/`, they get automatically
574 replicated to all other cluster nodes.
575
576 NOTE: CTIDs < 100 are reserved for internal purposes, and CTIDs need to be
577 unique cluster wide.
578
579 .Example Container Configuration
580 ----
581 ostype: debian
582 arch: amd64
583 hostname: www
584 memory: 512
585 swap: 512
586 net0: bridge=vmbr0,hwaddr=66:64:66:64:64:36,ip=dhcp,name=eth0,type=veth
587 rootfs: local:107/vm-107-disk-1.raw,size=7G
588 ----
589
590 Those configuration files are simple text files, and you can edit them
591 using a normal text editor (`vi`, `nano`, ...). This is sometimes
592 useful to do small corrections, but keep in mind that you need to
593 restart the container to apply such changes.
594
595 For that reason, it is usually better to use the `pct` command to
596 generate and modify those files, or do the whole thing using the GUI.
597 Our toolkit is smart enough to instantaneously apply most changes to
598 running containers. This feature is called "hot plug", and there is no
599 need to restart the container in that case.
600
601
602 File Format
603 ~~~~~~~~~~~
604
605 Container configuration files use a simple colon separated key/value
606 format. Each line has the following format:
607
608 -----
609 # this is a comment
610 OPTION: value
611 -----
612
613 Blank lines in those files are ignored, and lines starting with a `#`
614 character are treated as comments and are also ignored.
615
616 It is possible to add low-level, LXC style configuration directly, for
617 example:
618
619 lxc.init_cmd: /sbin/my_own_init
620
621 or
622
623 lxc.init_cmd = /sbin/my_own_init
624
625 Those settings are directly passed to the LXC low-level tools.
626
627
628 [[pct_snapshots]]
629 Snapshots
630 ~~~~~~~~~
631
632 When you create a snapshot, `pct` stores the configuration at snapshot
633 time into a separate snapshot section within the same configuration
634 file. For example, after creating a snapshot called ``testsnapshot'',
635 your configuration file will look like this:
636
637 .Container configuration with snapshot
638 ----
639 memory: 512
640 swap: 512
641 parent: testsnapshot
642 ...
643
644 [testsnapshot]
645 memory: 512
646 swap: 512
647 snaptime: 1457170803
648 ...
649 ----
650
651 There are a few snapshot related properties like `parent` and
652 `snaptime`. The `parent` property is used to store the parent/child
653 relationship between snapshots. `snaptime` is the snapshot creation
654 time stamp (Unix epoch).
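
Snapshots are normally managed with `pct` rather than by editing the file directly; a brief sketch (the snapshot name is just an example):

----
pct snapshot 100 testsnapshot      # create a snapshot
pct listsnapshot 100               # list existing snapshots
pct rollback 100 testsnapshot      # roll back to the snapshot
pct delsnapshot 100 testsnapshot   # delete the snapshot again
----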
655
656
657 [[pct_options]]
658 Options
659 ~~~~~~~
660
661 include::pct.conf.5-opts.adoc[]
662
663
664 Locks
665 -----
666
667 Container migrations, snapshots and backups (`vzdump`) set a lock to
668 prevent incompatible concurrent actions on the affected container. Sometimes
669 you need to remove such a lock manually (e.g., after a power failure).
670
671 pct unlock <CTID>
672
673 CAUTION: Only do that if you are sure the action which set the lock is
674 no longer running.
675
676
677 ifdef::manvolnum[]
678
679 Files
680 ------
681
682 `/etc/pve/lxc/<CTID>.conf`::
683
684 Configuration file for the container '<CTID>'.
685
686
687 include::pve-copyright.adoc[]
688 endif::manvolnum[]