:pve-toplevel:
endif::wiki[]
-Network configuration can be done either via the GUI, or by manually
-editing the file `/etc/network/interfaces`, which contains the
-whole network configuration. The `interfaces(5)` manual page contains the
-complete format description. All {pve} tools try hard to keep direct
-user modifications, but using the GUI is still preferable, because it
+{pve} uses the Linux network stack. This provides a lot of flexibility for
+setting up the network on the {pve} nodes. The configuration can be done
+either via the GUI, or by manually editing the file `/etc/network/interfaces`,
+which contains the whole network configuration. The `interfaces(5)` manual
+page contains the complete format description. All {pve} tools try hard to
+preserve direct user modifications, but using the GUI is still preferable,
+because it
protects you from errors.
-Once the network is configured, you can use the Debian traditional tools `ifup`
-and `ifdown` commands to bring interfaces up and down.
+A 'vmbr' interface is needed to connect guests to the underlying physical
+network. These are Linux bridges, which can be thought of as virtual switches
+to which the guests and physical interfaces are connected. This section
+provides some examples on how the network can be set up to accommodate
+different use cases, like redundancy with a xref:sysadmin_network_bond['bond'],
+xref:sysadmin_network_vlan['vlans'] or
+xref:sysadmin_network_routed['routed'] and
+xref:sysadmin_network_masquerading['NAT'] setups.
+
+The xref:chapter_pvesdn[Software Defined Network] is an option for more complex
+virtual networks in {pve} clusters.
+
+WARNING: It's discouraged to use the traditional Debian tools `ifup` and
+`ifdown` if unsure, as they have some pitfalls, like interrupting all guest
+traffic on `ifdown vmbrX`, but not reconnecting those guests again when doing
+`ifup` on the same bridge later.
Apply Network Changes
~~~~~~~~~~~~~~~~~~~~~
are correct before applying, as a wrong network configuration may render a node
inaccessible.
-Reboot Node to apply
-^^^^^^^^^^^^^^^^^^^^
-
-With the default installed `ifupdown` network managing package you need to
-reboot to commit any pending network changes. Most of the time, the basic {pve}
-network setup is stable and does not change often, so rebooting should not be
-required often.
-
-Reload Network with ifupdown2
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-With the optional `ifupdown2` network managing package you also can reload the
-network configuration live, without requiring a reboot.
-
-Since {pve} 6.1 you can apply pending network changes over the web-interface,
-using the 'Apply Configuration' button in the 'Network' panel of a node.
+Live-Reload Network with ifupdown2
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-To install 'ifupdown2' ensure you have the latest {pve} updates installed, then
+With the recommended 'ifupdown2' package (default for new installations since
+{pve} 7.0), it is possible to apply network configuration changes without a
+reboot. If you change the network configuration via the GUI, you can click the
+'Apply Configuration' button. This will move changes from the staging
+`interfaces.new` file to `/etc/network/interfaces` and apply them live.
-WARNING: installing 'ifupdown2' will remove 'ifupdown', but as the removal
-scripts of 'ifupdown' before version '0.8.35+pve1' have a issue where network
-is fully stopped on removal footnote:[Introduced with Debian Buster:
-https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=945877] you *must* ensure
-that you have a up to date 'ifupdown' package version.
+If you made manual changes directly to the `/etc/network/interfaces` file, you
+can apply them by running `ifreload -a`.
-For the installation itself you can then simply do:
+NOTE: If you installed {pve} on top of Debian, or upgraded to {pve} 7.0 from an
+older {pve} installation, make sure 'ifupdown2' is installed: `apt install
+ifupdown2`
- apt install ifupdown2
+Reboot Node to Apply
+^^^^^^^^^^^^^^^^^^^^
-With that you're all set. You can also switch back to the 'ifupdown' variant at
-any time, if you run into issues.
+Another way to apply a new network configuration is to reboot the node.
+In that case, the systemd service `pvenetcommit` will activate the staging
+`interfaces.new` file before the `networking` service applies the
+configuration.
Naming Conventions
~~~~~~~~~~~~~~~~~~
We currently use the following naming conventions for device names:
-* Ethernet devices: en*, systemd network interface names. This naming scheme is
+* Ethernet devices: `en*`, systemd network interface names. This naming scheme is
used for new {pve} installations since version 5.0.
-* Ethernet devices: eth[N], where 0 ≤ N (`eth0`, `eth1`, ...) This naming
+* Ethernet devices: `eth[N]`, where 0 ≤ N (`eth0`, `eth1`, ...) This naming
scheme is used for {pve} hosts which were installed before the 5.0
release. When upgrading to 5.0, the names are kept as-is.
-* Bridge names: vmbr[N], where 0 ≤ N ≤ 4094 (`vmbr0` - `vmbr4094`)
+* Bridge names: `vmbr[N]`, where 0 ≤ N ≤ 4094 (`vmbr0` - `vmbr4094`)
-* Bonds: bond[N], where 0 ≤ N (`bond0`, `bond1`, ...)
+* Bonds: `bond[N]`, where 0 ≤ N (`bond0`, `bond1`, ...)
* VLANs: Simply add the VLAN number to the device name,
separated by a period (`eno1.50`, `bond1.30`)
This makes it easier to debug network problems, because the device
name implies the device type.
+[[systemd_network_interface_names]]
Systemd Network Interface Names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Systemd uses the two character prefix 'en' for Ethernet network
-devices. The next characters depends on the device driver and the fact
-which schema matches first.
+Systemd defines a versioned naming scheme for network device names. The
+scheme uses the two-character prefix `en` for Ethernet network devices. The
+next characters depend on the device driver, the device location and other
+attributes. Some possible patterns are:
+
+* `o<index>[n<phys_port_name>|d<dev_port>]` — devices on board
+
+* `s<slot>[f<function>][n<phys_port_name>|d<dev_port>]` — devices by hotplug id
+
+* `[P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>]` —
+devices by bus id
+
+* `x<MAC>` — devices by MAC address
+
+Some examples for the most common patterns are:
+
+* `eno1` — the first on-board NIC
+
+* `enp3s0f1` — function 1 of the NIC on PCI bus 3, slot 0
+
+For a full list of possible device name patterns, see the
+https://manpages.debian.org/stable/systemd/systemd.net-naming-scheme.7.en.html[
+systemd.net-naming-scheme(7) manpage].
+
+A new version of systemd may define a new version of the network device naming
+scheme, which it then uses by default. Consequently, updating to a newer
+systemd version, for example during a major {pve} upgrade, can change the names
+of network devices and require adjusting the network configuration. To avoid
+name changes due to a new version of the naming scheme, you can manually pin a
+particular naming scheme version (see
+xref:network_pin_naming_scheme_version[below]).
+
+However, even with a pinned naming scheme version, network device names can
+still change due to kernel or driver updates. In order to avoid name changes
+for a particular network device altogether, you can manually override its name
+using a link file (see xref:network_override_device_names[below]).
+
+For more information on network interface names, see
+https://systemd.io/PREDICTABLE_INTERFACE_NAMES/[Predictable Network Interface
+Names].
+
+[[network_pin_naming_scheme_version]]
+Pinning a specific naming scheme version
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You can pin a specific version of the naming scheme for network devices by
+adding the `net.naming-scheme=<version>` parameter to the
+xref:sysboot_edit_kernel_cmdline[kernel command line]. For a list of naming
+scheme versions, see the
+https://manpages.debian.org/stable/systemd/systemd.net-naming-scheme.7.en.html[
+systemd.net-naming-scheme(7) manpage].
+
+For example, to pin the version `v252`, which is the latest naming scheme
+version for a fresh {pve} 8.0 installation, add the following kernel
+command-line parameter:
+
+----
+net.naming-scheme=v252
+----
+
+See also xref:sysboot_edit_kernel_cmdline[this section] on editing the kernel
+command line. You need to reboot for the changes to take effect.
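+
+As a sketch, on a node that boots via GRUB, this could be done by appending
+the parameter to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` and then
+running `update-grub` (setups booting via `systemd-boot` edit
+`/etc/kernel/cmdline` instead, as described in the referenced section):
+
+----
+# /etc/default/grub (excerpt)
+GRUB_CMDLINE_LINUX_DEFAULT="quiet net.naming-scheme=v252"
+----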
+
+[[network_override_device_names]]
+Overriding network device names
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You can manually assign a name to a particular network device using a custom
+https://manpages.debian.org/stable/udev/systemd.link.5.en.html[systemd.link
+file]. This overrides the name that would be assigned according to the latest
+network device naming scheme. This way, you can avoid naming changes due to
+kernel updates, driver updates or newer versions of the naming scheme.
-* o<index>[n<phys_port_name>|d<dev_port>] — devices on board
+Custom link files should be placed in `/etc/systemd/network/` and named
+`<n>-<id>.link`, where `n` is a priority smaller than `99` and `id` is some
+identifier. A link file has two sections: `[Match]` determines which interfaces
+the file will apply to; `[Link]` determines how these interfaces should be
+configured, including their naming.
-* s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — device by hotplug id
+To assign a name to a particular network device, you need a way to uniquely and
+permanently identify that device in the `[Match]` section. One possibility is
+to match the device's MAC address using the `MACAddress` option, as it is
+unlikely to change. Then, you can assign a name using the `Name` option in the
+`[Link]` section.
-* [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — devices by bus id
+For example, to assign the name `enwan0` to the device with MAC address
+`aa:bb:cc:dd:ee:ff`, create a file `/etc/systemd/network/10-enwan0.link` with
+the following contents:
-* x<MAC> — device by MAC address
+----
+[Match]
+MACAddress=aa:bb:cc:dd:ee:ff
-The most common patterns are:
+[Link]
+Name=enwan0
+----
-* eno1 — is the first on board NIC
+Do not forget to adjust `/etc/network/interfaces` to use the new name.
+You need to reboot the node for the change to take effect.
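+
+Continuing the example above, the bridge definition in
+`/etc/network/interfaces` would then reference the new name. This is only a
+sketch; the bridge name and addresses are illustrative:
+
+----
+auto vmbr0
+iface vmbr0 inet static
+        address 192.0.2.10/24
+        gateway 192.0.2.1
+        bridge-ports enwan0
+        bridge-stp off
+        bridge-fd 0
+----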
-* enp3s0f1 — is the NIC on pcibus 3 slot 0 and use the NIC function 1.
+NOTE: It is recommended to assign a name starting with `en` or `eth` so that
+{pve} recognizes the interface as a physical network device which can then be
+configured via the GUI. Also, you should ensure that the name will not clash
+with other interface names in the future. One possibility is to assign a name
+that does not match any name pattern that systemd uses for network interfaces
+(xref:systemd_network_interface_names[see above]), such as `enwan0` in the
+example above.
-For more information see https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/[Predictable Network Interface Names].
+For more information on link files, see the
+https://manpages.debian.org/stable/udev/systemd.link.5.en.html[systemd.link(5)
+manpage].
Choosing a network configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
auto vmbr0
iface vmbr0 inet static
- address 192.168.10.2
- netmask 255.255.255.0
+ address 192.168.10.2/24
gateway 192.168.10.1
bridge-ports eno1
bridge-stp off
having its own MAC, even though there is only one network cable
connecting all of these VMs to the network.
+[[sysadmin_network_routed]]
Routed Configuration
~~~~~~~~~~~~~~~~~~~~
[thumbnail="default-network-setup-routed.svg"]
A common scenario is that you have a public IP (assume `198.51.100.5`
for this example), and an additional IP block for your VMs
-(`203.0.113.16/29`). We recommend the following setup for such
+(`203.0.113.16/28`). We recommend the following setup for such
situations:
----
auto lo
iface lo inet loopback
-auto eno1
-iface eno1 inet static
- address 198.51.100.5
- netmask 255.255.255.0
+auto eno0
+iface eno0 inet static
+ address 198.51.100.5/29
gateway 198.51.100.1
post-up echo 1 > /proc/sys/net/ipv4/ip_forward
- post-up echo 1 > /proc/sys/net/ipv4/conf/eno1/proxy_arp
+ post-up echo 1 > /proc/sys/net/ipv4/conf/eno0/proxy_arp
auto vmbr0
iface vmbr0 inet static
- address 203.0.113.17
- netmask 255.255.255.248
+ address 203.0.113.17/28
bridge-ports none
bridge-stp off
bridge-fd 0
----
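+
+With this setup, guests use addresses from the routed block directly, with the
+bridge address as their gateway. Inside a guest, a static configuration could
+look like the following sketch, where the interface name and the chosen
+address are examples:
+
+----
+auto eth0
+iface eth0 inet static
+        address 203.0.113.18/28
+        gateway 203.0.113.17
+----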
+[[sysadmin_network_masquerading]]
Masquerading (NAT) with `iptables`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
auto eno1
#real IP address
iface eno1 inet static
- address 198.51.100.5
- netmask 255.255.255.0
+ address 198.51.100.5/24
gateway 198.51.100.1
auto vmbr0
#private sub network
iface vmbr0 inet static
- address 10.10.10.1
- netmask 255.255.255.0
+ address 10.10.10.1/24
bridge-ports none
bridge-stp off
bridge-fd 0
https://lwn.net/Articles/370152/[Patch on netdev-list introducing conntrack zones]
-https://blog.lobraun.de/2019/05/19/prox/[Blog post with a good explanation by using TRACE in the raw table]
-
+https://web.archive.org/web/20220610151210/https://blog.lobraun.de/2019/05/19/prox/[Blog post with a good explanation of using TRACE in the raw table]
+[[sysadmin_network_bond]]
Linux Bond
~~~~~~~~~~
If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend using
the corresponding bonding mode (802.3ad). Otherwise you should generally use the
-active-backup mode. +
-// http://lists.linux-ha.org/pipermail/linux-ha/2013-January/046295.html
-If you intend to run your cluster network on the bonding interfaces, then you
-have to use active-passive mode on the bonding interfaces, other modes are
-unsupported.
+active-backup mode.
+
+For the cluster network (Corosync) we recommend configuring it with multiple
+networks. Corosync does not need a bond for network redundancy, as it can
+switch between networks by itself if one becomes unusable.
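+
+For illustration, a node entry in `/etc/pve/corosync.conf` with two redundant
+links could look like the following sketch, where the node name, id and
+addresses are examples:
+
+----
+node {
+    name: node1
+    nodeid: 1
+    quorum_votes: 1
+    ring0_addr: 10.10.1.1
+    ring1_addr: 10.10.2.1
+}
+----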
The following bond configuration can be used as a distributed/shared
storage network. The benefit would be that you get more speed and the
auto bond0
iface bond0 inet static
bond-slaves eno1 eno2
- address 192.168.1.2
- netmask 255.255.255.0
+ address 192.168.1.2/24
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer2+3
auto vmbr0
iface vmbr0 inet static
- address 10.10.10.2
- netmask 255.255.255.0
+ address 10.10.10.2/24
gateway 10.10.10.1
bridge-ports eno3
bridge-stp off
auto vmbr0
iface vmbr0 inet static
- address 10.10.10.2
- netmask 255.255.255.0
+ address 10.10.10.2/24
gateway 10.10.10.1
bridge-ports bond0
bridge-stp off
----
+[[sysadmin_network_vlan]]
VLAN 802.1Q
~~~~~~~~~~~
auto vmbr0v5
iface vmbr0v5 inet static
- address 10.10.10.2
- netmask 255.255.255.0
+ address 10.10.10.2/24
gateway 10.10.10.1
bridge-ports eno1.5
bridge-stp off
auto vmbr0.5
iface vmbr0.5 inet static
- address 10.10.10.2
- netmask 255.255.255.0
+ address 10.10.10.2/24
gateway 10.10.10.1
auto vmbr0
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
+ bridge-vids 2-4094
----
The next example is the same setup but a bond is used to
auto vmbr0v5
iface vmbr0v5 inet static
- address 10.10.10.2
- netmask 255.255.255.0
+ address 10.10.10.2/24
gateway 10.10.10.1
bridge-ports bond0.5
bridge-stp off
This method is preferred to disabling the loading of the IPv6 module on the
https://www.kernel.org/doc/Documentation/networking/ipv6.rst[kernel command line].
+
+Disabling MAC Learning on a Bridge
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By default, MAC learning is enabled on a bridge to ensure a smooth experience
+with virtual guests and their networks.
+
+But in some environments this can be undesirable. Since {pve} 7.3 you can
+disable MAC learning on a bridge by setting `bridge-disable-mac-learning 1` in
+its configuration in `/etc/network/interfaces`, for example:
+
+----
+# ...
+
+auto vmbr0
+iface vmbr0 inet static
+ address 10.10.10.2/24
+ gateway 10.10.10.1
+ bridge-ports ens18
+ bridge-stp off
+ bridge-fd 0
+ bridge-disable-mac-learning 1
+----
+
+Once enabled, {pve} will manually add the configured MAC addresses of VMs and
+containers to the bridge's forwarding database, to ensure that guests can still
+use the network - but only when they are using their actual MAC address.
+
////
TODO: explain IPv6 support?
TODO: explain OVS