<listitem>
<para>
The only allowed values are 0 and 1. Set this to 1 to destroy a
- container on shutdown.
+ container on shutdown.
</para>
</listitem>
</varlistentry>
<para>
<option>veth:</option> a virtual ethernet pair
device is created with one side assigned to the container
- and the other side attached to a bridge specified by
+ and the other side on the host.
+ <option>lxc.net.[i].veth.mode</option> specifies the
+ mode the veth parent will use on the host.
+ The accepted modes are <option>bridge</option> and <option>router</option>.
+ The mode defaults to bridge if not specified.
+ In <option>bridge</option> mode the host side is attached to a bridge specified by
the <option>lxc.net.[i].link</option> option.
- If the bridge is not specified, then the veth pair device
+ If the bridge link is not specified, then the veth pair device
will be created but not attached to any bridge.
Otherwise, the bridge has to be created on the system
before starting the container.
<command>lxc</command> won't handle any
configuration outside of the container.
+ In <option>router</option> mode static routes are created on the host for the
+ container's IP addresses pointing to the host side veth interface.
+ Additionally, Proxy ARP and Proxy NDP entries are added on the host side veth interface
+ for the gateway IPs defined in the container to allow the container to reach the host.
By default, <command>lxc</command> chooses a name for the
network device belonging to the outside of the
container, but if you wish to handle
the <option>lxc.net.[i].veth.pair</option> option (except for
unprivileged containers where this option is ignored for security
reasons).
+
+ Static routes can be added on the host pointing to the container using the
+ <option>lxc.net.[i].veth.ipv4.route</option> and
+ <option>lxc.net.[i].veth.ipv6.route</option> options.
+ Specify the option on several lines to add several routes.
+ Each route is in the format x.y.z.t/m, e.g. 192.168.1.0/24.
+
+ In <option>bridge</option> mode untagged VLAN membership can be set with the
+ <option>lxc.net.[i].veth.vlan.id</option> option. It accepts a special value of 'none' indicating
+ that the container port should be removed from the bridge's default untagged VLAN.
+ The <option>lxc.net.[i].veth.vlan.tagged.id</option> option can be specified multiple times to set
+ the container's bridge port membership to one or more tagged VLANs.
</para>
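+ <para>
+ As an illustrative sketch of the <option>veth</option> options described
+ above, a container port could be attached to a bridge with one untagged
+ and two tagged VLANs as shown here. The interface index, the bridge name
+ and the VLAN IDs are assumptions chosen for the example, not defaults.
+ </para>
+ <programlisting>
+ # br0 is an assumed bridge that already exists on the host
+ lxc.net.0.type = veth
+ lxc.net.0.veth.mode = bridge
+ lxc.net.0.link = br0
+ lxc.net.0.veth.vlan.id = 10
+ lxc.net.0.veth.vlan.tagged.id = 20
+ lxc.net.0.veth.vlan.tagged.id = 30
+ </programlisting>
+ <para>
+ A <option>router</option> mode interface with one additional host side
+ static route might instead look like this (the route is again only an
+ example value):
+ </para>
+ <programlisting>
+ lxc.net.0.type = veth
+ lxc.net.0.veth.mode = router
+ lxc.net.0.veth.ipv4.route = 192.168.1.0/24
+ </programlisting>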
<para>
different macvlan on the same upper device. The accepted
modes are <option>private</option>, <option>vepa</option>,
<option>bridge</option> and <option>passthru</option>.
- In <option>private</option> mode, the device never
+ In <option>private</option> mode, the device never
communicates with any other device on the same upper_dev (default).
In <option>vepa</option> mode, the new Virtual Ethernet Port
Aggregator (VEPA) mode, it assumes that the adjacent
mode is possible for one physical interface.
</para>
+ <para>
+ <option>ipvlan:</option> an ipvlan interface is linked
+ with the interface specified by
+ the <option>lxc.net.[i].link</option> and assigned to
+ the container.
+ <option>lxc.net.[i].ipvlan.mode</option> specifies the
+ mode the ipvlan will use to communicate between
+ different ipvlan on the same upper device. The accepted
+ modes are <option>l3</option>, <option>l3s</option> and
+ <option>l2</option>. It defaults to <option>l3</option> mode.
+ In <option>l3</option> mode TX processing up to L3 happens on the stack instance
+ attached to the dependent device. Packets are then switched to the stack instance of the
+ parent device for L2 processing, and routing from that instance is
+ used before packets are queued on the outbound device. In this mode the dependent devices
+ can neither receive nor send multicast / broadcast traffic.
+ In <option>l3s</option> mode TX processing is very similar to <option>l3</option> mode, except that
+ iptables (conn-tracking) works in this mode and hence it is L3-symmetric (L3s).
+ This mode performs slightly worse, which is the expected cost of
+ choosing it over plain <option>l3</option> mode to make conn-tracking work.
+ In <option>l2</option> mode TX processing happens on the stack instance attached to
+ the dependent device and packets are switched and queued to the parent device to be sent
+ out. In this mode the dependent devices will RX/TX multicast and broadcast (if applicable) as well.
+ <option>lxc.net.[i].ipvlan.isolation</option> specifies the isolation mode.
+ The accepted isolation values are <option>bridge</option>,
+ <option>private</option> and <option>vepa</option>.
+ It defaults to <option>bridge</option>.
+ In <option>bridge</option> isolation mode dependent devices can cross-talk among themselves
+ apart from talking through the parent device.
+ In <option>private</option> isolation mode the port is set in private mode,
+ i.e. the port won't allow cross communication between dependent devices.
+ In <option>vepa</option> isolation mode the port is set in VEPA mode,
+ i.e. the port will offload switching functionality to the external entity as
+ described in 802.1Qbg.
+ </para>
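+ <para>
+ For example, an ipvlan device in <option>l3s</option> mode with
+ <option>private</option> isolation could be configured roughly as
+ follows. This is only a sketch: the interface index and the parent
+ interface name are placeholders.
+ </para>
+ <programlisting>
+ # eth0 is an assumed parent interface on the host
+ lxc.net.0.type = ipvlan
+ lxc.net.0.link = eth0
+ lxc.net.0.ipvlan.mode = l3s
+ lxc.net.0.ipvlan.isolation = private
+ </programlisting>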
+
<para>
<option>phys:</option> an already existing interface
specified by the <option>lxc.net.[i].link</option> is
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.net.[i].l2proxy</option>
+ </term>
+ <listitem>
+ <para>
+ Controls whether layer 2 IP neighbour proxy entries will be added to the
+ <option>lxc.net.[i].link</option> interface for the IP addresses of the container.
+ Can be set to 0 or 1. Defaults to 0.
+ When used with IPv4 addresses, the following sysctl value needs to be set:
+ <option>net.ipv4.conf.[link].forwarding=1</option>.
+ When used with IPv6 addresses, the following sysctl values need to be set:
+ <option>net.ipv6.conf.[link].proxy_ndp=1</option> and
+ <option>net.ipv6.conf.[link].forwarding=1</option>.
+ </para>
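+ <para>
+ A minimal sketch, assuming the link interface is called eth0 (a
+ placeholder name): the proxy entries are enabled in the container
+ configuration with
+ </para>
+ <programlisting>
+ # eth0 is an assumed host interface
+ lxc.net.0.link = eth0
+ lxc.net.0.l2proxy = 1
+ </programlisting>
+ <para>
+ and the matching sysctl values listed above could be set on the host with:
+ </para>
+ <programlisting>
+ sysctl -w net.ipv4.conf.eth0.forwarding=1
+ sysctl -w net.ipv6.conf.eth0.proxy_ndp=1
+ sysctl -w net.ipv6.conf.eth0.forwarding=1
+ </programlisting>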
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term>
<option>lxc.net.[i].mtu</option>
interface (as specified by the
<option>lxc.net.[i].link</option> option) and use that as
the gateway. <option>auto</option> is only available when
- using the <option>veth</option> and
- <option>macvlan</option> network types.
+ using the <option>veth</option>,
+ <option>macvlan</option> and <option>ipvlan</option> network types.
+ Can also have the special value of <option>dev</option>,
+ which means to set the default gateway as a device route.
+ This is primarily for use with layer 3 network modes, such as IPVLAN.
</para>
</listitem>
</varlistentry>
interface (as specified by the
<option>lxc.net.[i].link</option> option) and use that as
the gateway. <option>auto</option> is only available when
- using the <option>veth</option> and
- <option>macvlan</option> network types.
+ using the <option>veth</option>,
+ <option>macvlan</option> and <option>ipvlan</option> network types.
+ Can also have the special value of <option>dev</option>,
+ which means to set the default gateway as a device route.
+ This is primarily for use with layer 3 network modes, such as IPVLAN.
</para>
</listitem>
</varlistentry>
<listitem>
<para>
LXC_NET_TYPE: the network type. This is one of the valid
- network types listed here (e.g. 'macvlan', 'veth').
+ network types listed here (e.g. 'vlan', 'macvlan', 'ipvlan', 'veth').
</para>
</listitem>
<listitem>
<para>
LXC_NET_TYPE: the network type. This is one of the valid
- network types listed here (e.g. 'macvlan', 'veth').
+ network types listed here (e.g. 'vlan', 'macvlan', 'ipvlan', 'veth').
</para>
</listitem>
<para>
If set, the container will have a new pseudo tty
instance, making this private to it. The value specifies
- the maximum number of pseudo ttys allowed for a pts
+ the maximum number of pseudo ttys allowed for a pty
instance (this limitation is not implemented yet).
</para>
</listitem>
When manually specifying a size for the log file the value should
be a power of 2 when converted to bytes. Valid size prefixes are
'KB', 'MB', 'GB'. (Note that all conversions are based on multiples
- of 1024. That means 'KB' == 'KiB', 'MB' == 'MiB', 'GB' == 'GiB'.
+ of 1024. That means 'KB' == 'KiB', 'MB' == 'MiB', 'GB' == 'GiB'.
Additionally, the case of the suffix is ignored, i.e. 'kB', 'KB' and
'Kb' are treated equally.)
<filename>/dev</filename> to be set up as needed in the container
rootfs. If lxc.autodev is set to 1, then after mounting the container's
rootfs LXC will mount a fresh tmpfs under <filename>/dev</filename>
- (limited to 500k) and fill in a minimal set of initial devices.
+ (limited to 500K by default, unless defined in lxc.autodev.tmpfs.size)
+ and fill in a minimal set of initial devices.
This is generally required when starting a container containing
a "systemd" based "init" but may be optional at other times. Additional
devices in the containers /dev directory may be created through the
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>lxc.autodev.tmpfs.size</option>
+ </term>
+ <listitem>
+ <para>
+ Set this to define the size of the /dev tmpfs.
+ The default value is 500000 (500K). If the parameter is specified
+ without a value, the default value is used.
+ </para>
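+ <para>
+ For instance, to double the default size (the value below is only
+ illustrative and assumes the same plain number format as the
+ default of 500000):
+ </para>
+ <programlisting>
+ lxc.autodev = 1
+ # illustrative value, twice the 500000 default
+ lxc.autodev.tmpfs.size = 1000000
+ </programlisting>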
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect2>
Specify a mount point corresponding to a line in the
fstab format.
- Moreover lxc supports mount propagation, such as rslave or
+ Moreover lxc supports mount propagation, such as rshared or
rprivate, and adds three additional mount options.
<option>optional</option> don't fail if mount does not work.
<option>create=dir</option> or <option>create=file</option>
<option>relative</option> source path is taken to be relative to
the mounted container root. For instance,
</para>
-<screen>
-dev/null proc/kcore none bind,relative 0 0
-</screen>
+ <programlisting>
+ dev/null proc/kcore none bind,relative 0 0
+ </programlisting>
<para>
Will expand dev/null to ${<option>LXC_ROOTFS_MOUNT</option>}/dev/null,
and mount it to proc/kcore inside the container.
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.rootfs.managed</option>
+ </term>
+ <listitem>
+ <para>
+ Set this to 0 to indicate that LXC is not managing the
+ container storage. LXC will then not modify the
+ container storage. The default is 1.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
</para>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.cgroup.dir.container</option>
+ </term>
+ <listitem>
+ <para>
+ This is similar to <option>lxc.cgroup.dir</option>, but must be
+ used together with <option>lxc.cgroup.dir.monitor</option> and
+ affects only the container's cgroup path. This option is mutually
+ exclusive with <option>lxc.cgroup.dir</option>.
+ Note that the final path the container attaches to may be
+ extended further by the
+ <option>lxc.cgroup.dir.container.inner</option> option.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.cgroup.dir.monitor</option>
+ </term>
+ <listitem>
+ <para>
+ This is the monitor process counterpart to
+ <option>lxc.cgroup.dir.container</option>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.cgroup.dir.container.inner</option>
+ </term>
+ <listitem>
+ <para>
+ Specify an additional subdirectory where the cgroup namespace
+ will be created. With this option, the cgroup limits will be
+ applied to the outer path specified in
+ <option>lxc.cgroup.dir.container</option>, which is not accessible
+ from within the container, making it possible to better enforce
+ limits for privileged containers in a way they cannot override
+ them.
+ This only works in conjunction with the
+ <option>lxc.cgroup.dir.container</option> and
+ <option>lxc.cgroup.dir.monitor</option> options and has otherwise
+ no effect.
+ </para>
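+ <para>
+ As a sketch of how the three options fit together, the following
+ places the monitor and the container in separate cgroups and
+ creates the cgroup namespace in an inner subdirectory. All path
+ names are hypothetical and chosen only for illustration.
+ </para>
+ <programlisting>
+ # hypothetical cgroup paths for a container named c1
+ lxc.cgroup.dir.monitor = lxc.monitor/c1
+ lxc.cgroup.dir.container = lxc.payload/c1
+ lxc.cgroup.dir.container.inner = ns
+ </programlisting>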
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.cgroup.relative</option>
+ </term>
+ <listitem>
+ <para>
+ Set this to 1 to instruct LXC to never escape to the
+ root cgroup. This makes it easy for users to adhere to
+ restrictions enforced by cgroup2 and
+ systemd. Specifically, this makes it possible to run LXC
+ containers as systemd services.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect2>
standard namespace identifiers as seen in the
<filename>/proc/PID/ns</filename> directory.
The <option>lxc.namespace.keep</option> is a
- blacklist option, i.e. it is useful when enforcing that containers
+ denylist option, i.e. it is useful when enforcing that containers
must keep a specific set of namespaces.
</para>
</para>
<para>
- To inherit the namespace from another container set the
+ To inherit the namespace from another container set the
<option>lxc.namespace.share.[namespace identifier]</option> to the name of
the container, e.g. <option>lxc.namespace.share.pid=c3</option>.
</para>
process wants to inherit the other's network namespace it usually
needs to inherit the user namespace as well.
</para>
+
+ <para>
+ Note that without careful additional configuration of an LSM,
+ sharing user+pid namespaces with a task may allow that task to
+ escalate privileges to those of the task calling liblxc.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>lxc.time.offset.boot</option>
+ </term>
+ <listitem>
+ <para>
+ Specify a positive or negative offset for the boottime clock. The
+ format accepts hours (h), minutes (m), seconds (s),
+ milliseconds (ms), microseconds (us), and nanoseconds (ns).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>lxc.time.offset.monotonic</option>
+ </term>
+ <listitem>
+ <para>
+ Specify a positive or negative offset for the monotonic clock. The
+ format accepts hours (h), minutes (m), seconds (s),
+ milliseconds (ms), microseconds (us), and nanoseconds (ns).
+ </para>
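+ <para>
+ For example, the boottime and monotonic clocks could be shifted
+ as follows. The values are illustrative, and the sketch assumes
+ that a negative offset is written with a leading minus sign.
+ </para>
+ <programlisting>
+ # illustrative offsets using the h and m unit suffixes
+ lxc.time.offset.boot = 24h
+ lxc.time.offset.monotonic = -30m
+ </programlisting>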
</listitem>
</varlistentry>
+
</variablelist>
</refsect2>
</term>
<listitem>
<para>
- Specify the kernel parameters to be set. The parameters available
+ Specify the kernel parameters to be set. The parameters available
are those listed under /proc/sys/.
Note that not all sysctls are namespaced. Changing Non-namespaced
sysctls will cause the system-wide setting to be modified.
<refentrytitle><command>sysctl</command></refentrytitle>
<manvolnum>8</manvolnum>
</citerefentry>.
- If used with no value, lxc will clear the parameters specified up
+ If used with no value, lxc will clear the parameters specified up
to this point.
</para>
</listitem>
container should be run can be specified in the container
configuration. The default is <command>lxc-container-default-cgns</command>
if the host kernel is cgroup namespace aware, or
- <command>lxc-container-default</command> othewise.
+ <command>lxc-container-default</command> otherwise.
</para>
<variablelist>
<varlistentry>
are nesting containers and are already confined), then use
</para>
<programlisting>lxc.apparmor.profile = unchanged</programlisting>
+ <para>
+ To instruct LXC to generate the AppArmor profile,
+ use
+ </para>
+ <programlisting>lxc.apparmor.profile = generated</programlisting>
</listitem>
</varlistentry>
<varlistentry>
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>lxc.apparmor.allow_nesting</option>
+ </term>
+ <listitem>
+ <para>
+ If this is set to 1, the following changes apply. When
+ generated apparmor profiles are used, they will contain
+ the necessary changes to allow creating a nested
+ container. In addition to the usual mount points,
+ <filename>/dev/.lxc/proc</filename>
+ and <filename>/dev/.lxc/sys</filename> will contain
+ procfs and sysfs mount points without the lxcfs
+ overlays, which, if generated apparmor profiles are
+ being used, will not be directly readable or writable.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>lxc.apparmor.raw</option>
+ </term>
+ <listitem>
+ <para>
+ A list of raw AppArmor profile lines to append to the
+ profile. Only valid when using generated profiles.
+ </para>
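+ <para>
+ A sketch of a generated profile that allows nesting and appends
+ one raw line. The raw rule is only an illustration of AppArmor
+ rule syntax and is an assumption, not a recommended policy.
+ </para>
+ <programlisting>
+ lxc.apparmor.profile = generated
+ lxc.apparmor.allow_nesting = 1
+ # illustrative raw rule appended to the generated profile
+ lxc.apparmor.raw = deny /proc/sysrq-trigger rwklx,
+ </programlisting>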
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect2>
<programlisting>lxc.selinux.context = system_u:system_r:lxc_t:s0:c22</programlisting>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.selinux.context.keyring</option>
+ </term>
+ <listitem>
+ <para>
+ Specify the SELinux context under which the container's keyring
+ should be created. By default this is the same as lxc.selinux.context, or
+ the context lxc is executed under if lxc.selinux.context has not been set.
+ </para>
+ <programlisting>lxc.selinux.context.keyring = system_u:system_r:lxc_t:s0:c22</programlisting>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>Kernel Keyring</title>
+ <para>
+ The Linux Keyring facility is primarily a way for various
+ kernel components to retain or cache security data, authentication
+ keys, encryption keys, and other data in the kernel. By default lxc
+ will create a new session keyring for the started application.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term>
+ <option>lxc.keyring.session</option>
+ </term>
+ <listitem>
+ <para>
+ Set this to 0 to disable the creation of a new session keyring by lxc. The started
+ application will then inherit the current session keyring.
+ By default, or when passing the value 1, a new keyring will be created.
+ </para>
+ <programlisting>lxc.keyring.session = 0</programlisting>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect2>
</para>
<para>
Versions 1 and 2 are currently supported. In version 1, the
- policy is a simple whitelist. The second line therefore must
- read "whitelist", with the rest of the file containing one (numeric)
- sycall number per line. Each syscall number is whitelisted,
- while every unlisted number is blacklisted for use in the container
+ policy is a simple allowlist. The second line therefore must
+ read "allowlist", with the rest of the file containing one (numeric)
+ syscall number per line. Each syscall number is allowlisted,
+ while every unlisted number is denylisted for use in the container
</para>
<para>
- In version 2, the policy may be blacklist or whitelist,
+ In version 2, the policy may be denylist or allowlist,
supports per-rule and per-policy default actions, and supports
per-architecture system call resolution from textual names.
</para>
<para>
- An example blacklist policy, in which all system calls are
+ An example denylist policy, in which all system calls are
allowed except for mknod, which will simply do nothing and
return 0 (success), looks like:
</para>
<programlisting>
2
- blacklist
+ denylist
mknod errno 0
+ ioctl notify
</programlisting>
+ <para>
+ Specifying "errno" as action will cause LXC to register a seccomp filter
+ that will cause a specific errno to be returned to the caller. The errno
+ value can be specified after the "errno" action word.
+ </para>
+
+ <para>
+ Specifying "notify" as action will cause LXC to register a seccomp
+ listener and retrieve a listener file descriptor from the kernel. When a
+ syscall is made that is registered as "notify" the kernel will generate a
+ poll event and send a message over the file descriptor. The caller can
+ read this message and inspect the syscall, including its arguments. Based on
+ this information the caller is expected to send back a message informing
+ the kernel which action to take. Until that message is sent the kernel
+ will block the calling process. The format of the messages to read and
+ send is documented in seccomp itself.
+ </para>
+
<variablelist>
<varlistentry>
<term>
</para>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.seccomp.allow_nesting</option>
+ </term>
+ <listitem>
+ <para>
+ If this flag is set to 1, then seccomp filters will be stacked
+ regardless of whether a seccomp profile is already loaded.
+ This allows nested containers to load their own seccomp profile.
+ The default setting is 0.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.seccomp.notify.proxy</option>
+ </term>
+ <listitem>
+ <para>
+ Specify a unix socket to which LXC will connect and forward
+ seccomp events. The path must be in the form
+ unix:/path/to/socket or unix:@socket. The former specifies a
+ path-bound unix domain socket while the latter specifies an
+ abstract unix domain socket.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>
+ <option>lxc.seccomp.notify.cookie</option>
+ </term>
+ <listitem>
+ <para>
+ An additional string sent along with proxied seccomp notification
+ requests.
+ </para>
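+ <para>
+ As a sketch, a container whose profile uses the "notify" action
+ could forward its seccomp events to a path-bound socket and tag
+ them with a cookie. The profile path, socket path and cookie
+ value are hypothetical.
+ </para>
+ <programlisting>
+ # hypothetical paths and cookie for a container named c1
+ lxc.seccomp.profile = /var/lib/lxc/c1/seccomp.profile
+ lxc.seccomp.notify.proxy = unix:/run/seccomp-proxy.sock
+ lxc.seccomp.notify.cookie = c1
+ </programlisting>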
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect2>
<listitem>
<para>
An integer used to sort the containers when auto-starting
- a series of containers at once.
+ a series of containers at once. A lower value means an
+ earlier start.
</para>
</listitem>
</varlistentry>