[[sysadmin_network_configuration]]
Network Configuration
---------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

{pve} uses the Linux network stack. This provides a lot of flexibility in how
to set up the network on the {pve} nodes. The configuration can be done either
via the GUI, or by manually editing the file `/etc/network/interfaces`, which
contains the whole network configuration. The `interfaces(5)` manual page
contains the complete format description. All {pve} tools try hard to preserve
direct user modifications, but using the GUI is still preferable, because it
protects you from errors.

A 'vmbr' interface is needed to connect guests to the underlying physical
network. Such an interface is a Linux bridge, which can be thought of as a
virtual switch to which the guests and physical interfaces are connected.
This section provides some examples of how the network can be set up to
accommodate different use cases, like redundancy with a
xref:sysadmin_network_bond['bond'], xref:sysadmin_network_vlan['VLANs'], or
xref:sysadmin_network_routed['routed'] and
xref:sysadmin_network_masquerading['NAT'] setups.

The xref:chapter_pvesdn[Software Defined Network] is an option for more complex
virtual networks in {pve} clusters.

WARNING: It's discouraged to use the traditional Debian tools `ifup` and
`ifdown` if unsure, as they have some pitfalls, like interrupting all guest
traffic on `ifdown vmbrX`, but not reconnecting those guests again when doing
`ifup` on the same bridge later.

Apply Network Changes
~~~~~~~~~~~~~~~~~~~~~

{pve} does not write changes directly to `/etc/network/interfaces`. Instead, we
write into a temporary file called `/etc/network/interfaces.new`; this way you
can do many related changes at once. This also allows you to ensure your
changes are correct before applying them, as a wrong network configuration may
render a node inaccessible.

Live-Reload Network with ifupdown2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

With the recommended 'ifupdown2' package (default for new installations since
{pve} 7.0), it is possible to apply network configuration changes without a
reboot. If you change the network configuration via the GUI, you can click the
'Apply Configuration' button. This will move changes from the staging
`interfaces.new` file to `/etc/network/interfaces` and apply them live.

If you made manual changes directly to the `/etc/network/interfaces` file, you
can apply them by running `ifreload -a`.

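As a minimal sketch of that workflow, assuming 'ifupdown2' is installed:

----
# edit the live configuration directly ...
editor /etc/network/interfaces
# ... then reload all interfaces whose configuration changed
ifreload -a
----
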
NOTE: If you installed {pve} on top of Debian, or upgraded to {pve} 7.0 from an
older {pve} installation, make sure 'ifupdown2' is installed: `apt install
ifupdown2`

Reboot Node to Apply
^^^^^^^^^^^^^^^^^^^^

Another way to apply a new network configuration is to reboot the node.
In that case, the systemd service `pvenetcommit` will activate the staging
`interfaces.new` file before the `networking` service applies that
configuration.

Naming Conventions
~~~~~~~~~~~~~~~~~~

We currently use the following naming conventions for device names:

* Ethernet devices: en*, systemd network interface names. This naming scheme is
used for new {pve} installations since version 5.0.

* Ethernet devices: eth[N], where 0 ≤ N (`eth0`, `eth1`, ...) This naming
scheme is used for {pve} hosts which were installed before the 5.0
release. When upgrading to 5.0, the names are kept as-is.

* Bridge names: vmbr[N], where 0 ≤ N ≤ 4094 (`vmbr0` - `vmbr4094`)

* Bonds: bond[N], where 0 ≤ N (`bond0`, `bond1`, ...)

* VLANs: Simply add the VLAN number to the device name,
separated by a period (`eno1.50`, `bond1.30`)

This makes it easier to debug network problems, because the device
name implies the device type.

Systemd Network Interface Names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Systemd uses the two-character prefix 'en' for Ethernet network
devices. The next characters depend on the device driver and on
which schema matches first:

* o<index>[n<phys_port_name>|d<dev_port>] — devices on board

* s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — device by hotplug id

* [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — devices by bus id

* x<MAC> — device by MAC address

The most common patterns are:

* eno1 — is the first on-board NIC

* enp3s0f1 — is the NIC on PCI bus 3, slot 0, using NIC function 1.

For more information see https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/[Predictable Network Interface Names].

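To check which names are in use on a given host, you can list the network
devices with standard iproute2 tooling, for example:

----
# show all network devices and their link state in brief format
ip -br link show
----
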
Choosing a network configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Depending on your current network organization and your resources you can
choose either a bridged, routed, or masquerading networking setup.

{pve} server in a private LAN, using an external gateway to reach the internet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The *Bridged* model makes the most sense in this case, and this is also
the default mode on new {pve} installations.
Each of your guest systems will have a virtual interface attached to the
{pve} bridge. This is similar in effect to having the guest network card
directly connected to a new switch on your LAN, with the {pve} host playing
the role of the switch.

{pve} server at hosting provider, with public IP ranges for Guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For this setup, you can use either a *Bridged* or *Routed* model, depending on
what your provider allows.

{pve} server at hosting provider, with a single public IP address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In that case, the only way to get outgoing network access for your guest
systems is to use *Masquerading*. For incoming network access to your guests,
you will need to configure *Port Forwarding*.

For further flexibility, you can configure
VLANs (IEEE 802.1q) and network bonding, also known as "link
aggregation". That way it is possible to build complex and flexible
virtual networks.

Default Configuration using a Bridge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[thumbnail="default-network-setup-bridge.svg"]
Bridges are like physical network switches implemented in software.
All virtual guests can share a single bridge, or you can create multiple
bridges to separate network domains. Each host can have up to 4094 bridges.

The installation program creates a single bridge named `vmbr0`, which
is connected to the first Ethernet card. The corresponding
configuration in `/etc/network/interfaces` might look like this:

----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.2/24
        gateway 192.168.10.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

Virtual machines behave as if they were directly connected to the
physical network. The network, in turn, sees each virtual machine as
having its own MAC, even though there is only one network cable
connecting all of these VMs to the network.

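To verify which interfaces are attached to a bridge, you can query the kernel
with the iproute2 `bridge` tool, for example:

----
# list all bridge ports together with the bridge they belong to
bridge link show
----
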
[[sysadmin_network_routed]]
Routed Configuration
~~~~~~~~~~~~~~~~~~~~

Most hosting providers do not support the above setup. For security
reasons, they disable networking as soon as they detect multiple MAC
addresses on a single interface.

TIP: Some providers allow you to register additional MACs through their
management interface. This avoids the problem, but can be clumsy to
configure because you need to register a MAC for each of your VMs.

You can avoid the problem by ``routing'' all traffic via a single
interface. This makes sure that all network packets use the same MAC
address.

[thumbnail="default-network-setup-routed.svg"]
A common scenario is that you have a public IP (assume `198.51.100.5`
for this example), and an additional IP block for your VMs
(`203.0.113.16/28`). We recommend the following setup for such
situations:

----
auto lo
iface lo inet loopback

auto eno0
iface eno0 inet static
        address 198.51.100.5/29
        gateway 198.51.100.1
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up echo 1 > /proc/sys/net/ipv4/conf/eno0/proxy_arp


auto vmbr0
iface vmbr0 inet static
        address 203.0.113.17/28
        bridge-ports none
        bridge-stp off
        bridge-fd 0
----
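
Inside a guest, a matching configuration would then use an address from the
additional block, with the bridge address as gateway. As a purely hypothetical
guest-side sketch for the example above:

----
auto eth0
iface eth0 inet static
        address 203.0.113.18/28
        gateway 203.0.113.17
----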


[[sysadmin_network_masquerading]]
Masquerading (NAT) with `iptables`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Masquerading allows guests having only a private IP address to access the
network by using the host IP address for outgoing traffic. Each outgoing
packet is rewritten by `iptables` to appear as originating from the host,
and responses are rewritten accordingly to be routed to the original sender.

----
auto lo
iface lo inet loopback

auto eno1
#real IP address
iface eno1 inet static
        address 198.51.100.5/24
        gateway 198.51.100.1

auto vmbr0
#private sub network
iface vmbr0 inet static
        address 10.10.10.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0

        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
----
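
As mentioned earlier, incoming access to the guests requires *Port
Forwarding*. As a sketch, assuming a hypothetical guest at `10.10.10.2`
serving HTTP on port 80, 'DNAT' rules could be added to `vmbr0` in the same
way:

----
        post-up   iptables -t nat -A PREROUTING -i eno1 -p tcp --dport 8080 -j DNAT --to-destination 10.10.10.2:80
        post-down iptables -t nat -D PREROUTING -i eno1 -p tcp --dport 8080 -j DNAT --to-destination 10.10.10.2:80
----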

NOTE: In some masquerade setups with firewall enabled, conntrack zones might be
needed for outgoing connections. Otherwise the firewall could block outgoing
connections, since they will prefer the `POSTROUTING` of the VM bridge (and not
`MASQUERADE`).

Adding these lines to `/etc/network/interfaces` can fix this problem:

----
post-up   iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
----

For more information about this, refer to the following links:

https://commons.wikimedia.org/wiki/File:Netfilter-packet-flow.svg[Netfilter Packet Flow]

https://lwn.net/Articles/370152/[Patch on netdev-list introducing conntrack zones]

https://web.archive.org/web/20220610151210/https://blog.lobraun.de/2019/05/19/prox/[Blog post with a good explanation by using TRACE in the raw table]


[[sysadmin_network_bond]]
Linux Bond
~~~~~~~~~~

Bonding (also called NIC teaming or Link Aggregation) is a technique
for binding multiple NICs to a single network device. It makes it possible
to achieve different goals, like making the network fault-tolerant,
increasing performance, or both together.

High-speed hardware like Fibre Channel and the associated switching
hardware can be quite expensive. By doing link aggregation, two NICs
can appear as one logical interface, resulting in double speed. This
is a native Linux kernel feature that is supported by most
switches. If your nodes have multiple Ethernet ports, you can
distribute your points of failure by running network cables to
different switches and the bonded connection will failover to one
cable or the other in case of network trouble.

Aggregated links can reduce live-migration delays and improve the
speed of replication of data between Proxmox VE Cluster nodes.

There are 7 modes for bonding:

* *Round-robin (balance-rr):* Transmit network packets in sequential
order from the first available network interface (NIC) slave through
the last. This mode provides load balancing and fault tolerance.

* *Active-backup (active-backup):* Only one NIC slave in the bond is
active. A different slave becomes active if, and only if, the active
slave fails. The single logical bonded interface's MAC address is
externally visible on only one NIC (port) to avoid distortion in the
network switch. This mode provides fault tolerance.

* *XOR (balance-xor):* Transmit network packets based on [(source MAC
address XOR'd with destination MAC address) modulo NIC slave
count]. This selects the same NIC slave for each destination MAC
address. This mode provides load balancing and fault tolerance.

* *Broadcast (broadcast):* Transmit network packets on all slave
network interfaces. This mode provides fault tolerance.

* *IEEE 802.3ad Dynamic link aggregation (802.3ad)(LACP):* Creates
aggregation groups that share the same speed and duplex
settings. Utilizes all slave network interfaces in the active
aggregator group according to the 802.3ad specification.

* *Adaptive transmit load balancing (balance-tlb):* Linux bonding
driver mode that does not require any special network-switch
support. The outgoing network packet traffic is distributed according
to the current load (computed relative to the speed) on each network
interface slave. Incoming traffic is received by one currently
designated slave network interface. If this receiving slave fails,
another slave takes over the MAC address of the failed receiving
slave.

* *Adaptive load balancing (balance-alb):* Includes balance-tlb plus receive
load balancing (rlb) for IPV4 traffic, and does not require any
special network switch support. The receive load balancing is achieved
by ARP negotiation. The bonding driver intercepts the ARP Replies sent
by the local system on their way out and overwrites the source
hardware address with the unique hardware address of one of the NIC
slaves in the single logical bonded interface such that different
network-peers use different MAC addresses for their network packet
traffic.

If your switch supports the LACP (IEEE 802.3ad) protocol, then we recommend
using the corresponding bonding mode (802.3ad). Otherwise you should generally
use the active-backup mode. +
// http://lists.linux-ha.org/pipermail/linux-ha/2013-January/046295.html
If you intend to run your cluster network on the bonding interfaces, then you
have to use active-backup (active-passive) mode on the bonding interfaces;
other modes are unsupported.

The following bond configuration can be used as a distributed/shared
storage network. The benefit is that you get more speed and the
network will be fault-tolerant.

.Example: Use bond with fixed IP address
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

auto bond0
iface bond0 inet static
        bond-slaves eno1 eno2
        address 192.168.1.2/24
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno3
        bridge-stp off
        bridge-fd 0

----


[thumbnail="default-network-setup-bond.svg"]
Another possibility is to use the bond directly as the bridge port.
This can be used to make the guest network fault-tolerant.

.Example: Use a bond as bridge port
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

----
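
After applying such a configuration, the kernel's view of the bond can be
checked to verify the bonding mode and the state of each slave, for example:

----
# show bonding mode, link states and LACP details for bond0
cat /proc/net/bonding/bond0
----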


[[sysadmin_network_vlan]]
VLAN 802.1Q
~~~~~~~~~~~

A virtual LAN (VLAN) is a broadcast domain that is partitioned and
isolated in the network at layer two. So it is possible to have
multiple networks (up to 4096) in a physical network, each independent
of the other ones.

Each VLAN network is identified by a number, often called a 'tag'.
Network packets are then 'tagged' to identify which virtual network
they belong to.


VLAN for Guest Networks
^^^^^^^^^^^^^^^^^^^^^^^

{pve} supports this setup out of the box. You can specify the VLAN tag
when you create a VM (a CLI sketch follows the list below). The VLAN tag
is part of the guest network configuration. The networking layer supports
different modes to implement VLANs, depending on the bridge configuration:

* *VLAN awareness on the Linux bridge:*
In this case, each guest's virtual network card is assigned to a VLAN tag,
which is transparently supported by the Linux bridge.
Trunk mode is also possible, but that makes configuration
in the guest necessary.

* *"traditional" VLAN on the Linux bridge:*
In contrast to the VLAN awareness method, this method is not transparent
and creates a VLAN device with associated bridge for each VLAN.
That is, creating a guest on VLAN 5 for example, would create two
interfaces eno1.5 and vmbr0v5, which would remain until a reboot occurs.

* *Open vSwitch VLAN:*
This mode uses the OVS VLAN feature.

* *Guest configured VLAN:*
VLANs are assigned inside the guest. In this case, the setup is
completely done inside the guest and can not be influenced from the
outside. The benefit is that you can use more than one VLAN on a
single virtual NIC.

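As referenced above, a sketch of setting the VLAN tag for an existing VM on
the command line, assuming a hypothetical VM with ID 100 that should use
VLAN 5 on `vmbr0`:

----
# attach the VM's first network device to vmbr0 with VLAN tag 5
qm set 100 -net0 virtio,bridge=vmbr0,tag=5
----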

VLAN on the Host
^^^^^^^^^^^^^^^^

To allow host communication with an isolated network, it is possible
to apply VLAN tags to any network device (NIC, bond, bridge). In
general, you should configure the VLAN on the interface with the least
abstraction layers between itself and the physical NIC.

For example, in a default configuration, you may want to place
the host management address on a separate VLAN.


.Example: Use VLAN 5 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno1.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno1.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

----

.Example: Use VLAN 5 for the {pve} management IP with VLAN aware Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual


auto vmbr0.5
iface vmbr0.5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
----

The next example is the same setup, but a bond is used to
make this network fail-safe.

.Example: Use VLAN 5 with bond0 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

iface bond0.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

----

Disabling IPv6 on the Node
~~~~~~~~~~~~~~~~~~~~~~~~~~

{pve} works correctly in all environments, irrespective of whether IPv6 is
deployed or not. We recommend leaving all settings at the provided defaults.

Should you still need to disable support for IPv6 on your node, do so by
creating an appropriate `sysctl.conf (5)` snippet file and setting the proper
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt[sysctls],
for example adding `/etc/sysctl.d/disable-ipv6.conf` with content:

----
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
----
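
Settings in `/etc/sysctl.d/` are applied at boot. To load the new snippet
immediately, without a reboot, you can run, for example:

----
# re-apply all sysctl configuration files, including the new snippet
sysctl --system
----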

This method is preferred to disabling the loading of the IPv6 module on the
https://www.kernel.org/doc/Documentation/networking/ipv6.rst[kernel commandline].


Disabling MAC Learning on a Bridge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, MAC learning is enabled on a bridge to ensure a smooth experience
with virtual guests and their networks.

But in some environments this can be undesirable. Since {pve} 7.3 you can
disable MAC learning on the bridge by setting the
`bridge-disable-mac-learning 1` configuration on a bridge in
`/etc/network/interfaces`, for example:

----
# ...

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports ens18
        bridge-stp off
        bridge-fd 0
        bridge-disable-mac-learning 1
----

Once enabled, {pve} will manually add the configured MAC address from VMs and
Containers to the bridge's forwarding database, to ensure that guests can still
use the network - but only when they are using their actual MAC address.
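
To check which entries are currently present in the forwarding database of a
bridge, you can again use the iproute2 `bridge` tool, for example:

----
# list the forwarding database entries of vmbr0
bridge fdb show br vmbr0
----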

////
TODO: explain IPv6 support?
TODO: explain OVS
////