[[sysadmin_network_configuration]]
Network Configuration
---------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

Network configuration can be done either via the GUI, or by manually
editing the file `/etc/network/interfaces`, which contains the
whole network configuration. The `interfaces(5)` manual page contains the
complete format description. All {pve} tools try hard to preserve direct
user modifications, but using the GUI is still preferable, because it
protects you from errors.

Apply Network Changes
~~~~~~~~~~~~~~~~~~~~~

{pve} does not write changes directly to `/etc/network/interfaces`. Instead, it
writes them to a temporary file called `/etc/network/interfaces.new`, so that you
can make many related changes at once. This also lets you verify that your
changes are correct before applying them, as a wrong network configuration may
render a node inaccessible.

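
Before applying, it can help to review what is pending. The following is a
minimal sketch, not part of the {pve} tooling: the helper name
`show_pending_changes` is hypothetical, and it assumes that the standard `diff`
utility is installed and that the staged file only exists while changes are
pending.

```shell
#!/bin/sh
# Hypothetical helper: show the difference between the active network
# configuration and the staged one. Both paths can be overridden.
show_pending_changes() {
    active="${1:-/etc/network/interfaces}"
    staged="${2:-/etc/network/interfaces.new}"

    # Without a staged file there is nothing waiting to be applied.
    if [ ! -e "$staged" ]; then
        echo "no pending changes"
        return 0
    fi

    # diff exits with 1 when the files differ; that is expected here,
    # so only a real error (exit status above 1) is passed on.
    diff -u "$active" "$staged"
    status=$?
    [ "$status" -le 1 ]
}
```

Calling `show_pending_changes` without arguments compares the two default
paths named above.
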
Reboot Node to apply
^^^^^^^^^^^^^^^^^^^^

With the `ifupdown` network management package, which is installed by default,
you need to reboot to commit any pending network changes. Most of the time, the
basic {pve} network setup is stable and does not change often, so rebooting
should not be required often.

Reload Network with ifupdown2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

With the optional `ifupdown2` network management package you can also reload the
network configuration live, without requiring a reboot.

Since {pve} 6.1 you can apply pending network changes over the web-interface,
using the 'Apply Configuration' button in the 'Network' panel of a node.

To install 'ifupdown2', first ensure that you have the latest {pve} updates
installed.

WARNING: Installing 'ifupdown2' will remove 'ifupdown'. Because the removal
scripts of 'ifupdown' before version '0.8.35+pve1' have an issue where the
network is fully stopped on removal footnote:[Introduced with Debian Buster:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=945877], you *must* ensure
that you have an up-to-date 'ifupdown' package version.

For the installation itself you can then simply do:

 apt install ifupdown2

With that you're all set. You can also switch back to the 'ifupdown' variant at
any time, if you run into issues.

Naming Conventions
~~~~~~~~~~~~~~~~~~

We currently use the following naming conventions for device names:

* Ethernet devices: en*, systemd network interface names. This naming scheme is
used for new {pve} installations since version 5.0.

* Ethernet devices: eth[N], where 0 ≤ N (`eth0`, `eth1`, ...). This naming
scheme is used for {pve} hosts which were installed before the 5.0
release. When upgrading to 5.0, the names are kept as-is.

* Bridge names: vmbr[N], where 0 ≤ N ≤ 4094 (`vmbr0` - `vmbr4094`)

* Bonds: bond[N], where 0 ≤ N (`bond0`, `bond1`, ...)

* VLANs: Simply add the VLAN number to the device name,
  separated by a period (`eno1.50`, `bond1.30`)

This makes it easier to debug network problems, because the device
name implies the device type.

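
Because the type follows from the name, it can be derived with simple pattern
matching. The sketch below illustrates the conventions listed above; the
function name `device_type` and its output strings are hypothetical, and it
only looks at the name, not at the actual device.

```shell
#!/bin/sh
# Hypothetical helper: guess the device type from its name, following
# the naming conventions listed above. The first matching pattern wins.
device_type() {
    case "$1" in
        *.*)        echo "vlan" ;;                    # eno1.50, bond1.30
        vmbr[0-9]*) echo "bridge" ;;                  # vmbr followed by a digit
        bond[0-9]*) echo "bond" ;;                    # bond followed by a digit
        eth[0-9]*)  echo "ethernet (legacy name)" ;;  # eth0, eth1, ...
        en*)        echo "ethernet (systemd name)" ;; # eno1, enp3s0f1, ...
        *)          echo "unknown" ;;
    esac
}
```

For example, `device_type bond1.30` reports a VLAN, because the period is
checked before the `bond` prefix.
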
Systemd Network Interface Names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Systemd uses the two-character prefix 'en' for Ethernet network
devices. The following characters depend on the device driver and on
which scheme matches first.

* o<index>[n<phys_port_name>|d<dev_port>]: devices on board

* s<slot>[f<function>][n<phys_port_name>|d<dev_port>]: devices by hotplug ID

* [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>]: devices by bus ID

* x<MAC>: devices by MAC address

The most common patterns are:

* eno1: the first on-board NIC

* enp3s0f1: the NIC on PCI bus 3, slot 0, using NIC function 1

For more information see https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/[Predictable Network Interface Names].

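
To make the two common patterns concrete, the following sketch splits such a
name into its parts. It is an illustration only: the function name
`describe_systemd_name` is hypothetical, it relies on `sed`, and it handles
just the `eno<index>` and `enp<bus>s<slot>[f<function>]` forms; every other
scheme is reported as not decoded.

```shell
#!/bin/sh
# Hypothetical helper: describe the two most common systemd name patterns.
describe_systemd_name() {
    # Each expression rewrites and prints the name only if it matches fully.
    result=$(printf '%s\n' "$1" | sed -n \
        -e 's/^eno\([0-9][0-9]*\)$/on-board NIC, index \1/p' \
        -e 's/^enp\([0-9][0-9]*\)s\([0-9][0-9]*\)f\([0-9][0-9]*\)$/PCI bus \1, slot \2, function \3/p' \
        -e 's/^enp\([0-9][0-9]*\)s\([0-9][0-9]*\)$/PCI bus \1, slot \2/p')

    if [ -n "$result" ]; then
        echo "$result"
    else
        echo "not decoded: $1"
    fi
}

describe_systemd_name enp3s0f1   # prints "PCI bus 3, slot 0, function 1"
```
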
Choosing a network configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Depending on your current network organization and your resources you can
choose either a bridged, routed, or masquerading networking setup.

{pve} server in a private LAN, using an external gateway to reach the internet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The *Bridged* model makes the most sense in this case, and this is also
the default mode on new {pve} installations.
Each of your guest systems will have a virtual interface attached to the
{pve} bridge. This is similar in effect to having the guest network card
directly connected to a new switch on your LAN, with the {pve} host playing the
role of the switch.

{pve} server at hosting provider, with public IP ranges for Guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For this setup, you can use either a *Bridged* or *Routed* model, depending on
what your provider allows.

{pve} server at hosting provider, with a single public IP address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In that case the only way to get outgoing network access for your guest
systems is to use *Masquerading*. For incoming network access to your guests,
you will need to configure *Port Forwarding*.

For further flexibility, you can configure
VLANs (IEEE 802.1q) and network bonding, also known as "link
aggregation". That way it is possible to build complex and flexible
virtual networks.

Default Configuration using a Bridge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[thumbnail="default-network-setup-bridge.svg"]
Bridges are like physical network switches implemented in software.
All virtual guests can share a single bridge, or you can create multiple
bridges to separate network domains. Each host can have up to 4094 bridges.

The installation program creates a single bridge named `vmbr0`, which
is connected to the first Ethernet card. The corresponding
configuration in `/etc/network/interfaces` might look like this:

----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.2/24
        gateway 192.168.10.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

Virtual machines behave as if they were directly connected to the
physical network. The network, in turn, sees each virtual machine as
having its own MAC, even though there is only one network cable
connecting all of these VMs to the network.

Routed Configuration
~~~~~~~~~~~~~~~~~~~~

Most hosting providers do not support the above setup. For security
reasons, they disable networking as soon as they detect multiple MAC
addresses on a single interface.

TIP: Some providers allow you to register additional MACs through their
management interface. This avoids the problem, but can be clumsy to
configure because you need to register a MAC for each of your VMs.

You can avoid the problem by ``routing'' all traffic via a single
interface. This makes sure that all network packets use the same MAC
address.

[thumbnail="default-network-setup-routed.svg"]
A common scenario is that you have a public IP (assume `198.51.100.5`
for this example), and an additional IP block for your VMs
(`203.0.113.16/28`). We recommend the following setup for such
situations:

----
auto lo
iface lo inet loopback

auto eno0
iface eno0 inet static
        address  198.51.100.5/29
        gateway  198.51.100.1
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up echo 1 > /proc/sys/net/ipv4/conf/eno0/proxy_arp

auto vmbr0
iface vmbr0 inet static
        address  203.0.113.17/28
        bridge-ports none
        bridge-stp off
        bridge-fd 0
----

Masquerading (NAT) with `iptables`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Masquerading allows guests that have only a private IP address to access the
network by using the host IP address for outgoing traffic. Each outgoing
packet is rewritten by `iptables` to appear as originating from the host,
and responses are rewritten accordingly to be routed to the original sender.

----
auto lo
iface lo inet loopback

auto eno1
#real IP address
iface eno1 inet static
        address  198.51.100.5/24
        gateway  198.51.100.1

auto vmbr0
#private sub network
iface vmbr0 inet static
        address  10.10.10.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0

        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
----

NOTE: In some masquerade setups with the firewall enabled, conntrack zones might
be needed for outgoing connections. Otherwise the firewall could block outgoing
connections since they will prefer the `POSTROUTING` of the VM bridge (and not
`MASQUERADE`).

Adding these lines to `/etc/network/interfaces` can fix this problem:

----
post-up   iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
----

For more information about this, refer to the following links:

https://commons.wikimedia.org/wiki/File:Netfilter-packet-flow.svg[Netfilter Packet Flow]

https://lwn.net/Articles/370152/[Patch on netdev-list introducing conntrack zones]

https://blog.lobraun.de/2019/05/19/prox/[Blog post with a good explanation by using TRACE in the raw table]

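
The masquerading setup above only covers outgoing traffic. For incoming
connections, a destination NAT (DNAT) rule is one common way to forward a port
to a guest. The sketch below is not taken from the {pve} tooling: the helper
name `port_forward_lines`, the guest address `10.10.10.10` and the ports are
assumptions chosen for illustration. It prints a matching pair of `post-up` and
`post-down` lines that could be added to the `vmbr0` stanza.

```shell
#!/bin/sh
# Hypothetical helper: print the post-up/post-down lines that forward
# one TCP port from the public interface to a guest behind vmbr0.
# Arguments: <public interface> <host port> <guest IP> <guest port>
port_forward_lines() {
    rule="PREROUTING -i $1 -p tcp --dport $2 -j DNAT --to-destination $3:$4"
    echo "post-up   iptables -t nat -A $rule"
    echo "post-down iptables -t nat -D $rule"
}

# Example values (assumptions): forward host port 2222 to SSH on 10.10.10.10.
port_forward_lines eno1 2222 10.10.10.10 22
```

Generating both lines from the same rule text keeps the `post-down` removal in
sync with the `post-up` rule it is meant to undo.
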
Linux Bond
~~~~~~~~~~

Bonding (also called NIC teaming or Link Aggregation) is a technique
for binding multiple NICs to a single network device. It can serve
different goals, such as making the network fault-tolerant,
increasing performance, or both.

High-speed hardware like Fibre Channel and the associated switching
hardware can be quite expensive. By doing link aggregation, two NICs
can appear as one logical interface, providing up to twice the
aggregate bandwidth. This is a native Linux kernel feature that is
supported by most switches. If your nodes have multiple Ethernet ports,
you can distribute your points of failure by running network cables to
different switches, and the bonded connection will fail over to one
cable or the other in case of network trouble.

Aggregated links can reduce live-migration delays and improve the
speed of data replication between {pve} cluster nodes.

There are 7 modes for bonding:

* *Round-robin (balance-rr):* Transmit network packets in sequential
order from the first available network interface (NIC) slave through
the last. This mode provides load balancing and fault tolerance.

* *Active-backup (active-backup):* Only one NIC slave in the bond is
active. A different slave becomes active if, and only if, the active
slave fails. The single logical bonded interface's MAC address is
externally visible on only one NIC (port) to avoid distortion in the
network switch. This mode provides fault tolerance.

* *XOR (balance-xor):* Transmit network packets based on [(source MAC
address XOR'd with destination MAC address) modulo NIC slave
count]. This selects the same NIC slave for each destination MAC
address. This mode provides load balancing and fault tolerance.

* *Broadcast (broadcast):* Transmit network packets on all slave
network interfaces. This mode provides fault tolerance.

* *IEEE 802.3ad Dynamic link aggregation (802.3ad)(LACP):* Creates
aggregation groups that share the same speed and duplex
settings. Utilizes all slave network interfaces in the active
aggregator group according to the 802.3ad specification.

* *Adaptive transmit load balancing (balance-tlb):* Linux bonding
driver mode that does not require any special network-switch
support. The outgoing network packet traffic is distributed according
to the current load (computed relative to the speed) on each network
interface slave. Incoming traffic is received by one currently
designated slave network interface. If this receiving slave fails,
another slave takes over the MAC address of the failed receiving
slave.

* *Adaptive load balancing (balance-alb):* Includes balance-tlb plus receive
load balancing (rlb) for IPv4 traffic, and does not require any
special network switch support. The receive load balancing is achieved
by ARP negotiation. The bonding driver intercepts the ARP replies sent
by the local system on their way out and overwrites the source
hardware address with the unique hardware address of one of the NIC
slaves in the single logical bonded interface, such that different
network peers use different MAC addresses for their network packet
traffic.

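
The balance-xor formula from the list above can be worked through with shell
arithmetic. This is a simplified sketch, not the kernel's implementation: the
function name `xor_slave_index` is hypothetical, only the last octet of each
MAC address is used, and the hash the bonding driver actually computes depends
on the configured transmit hash policy.

```shell
#!/bin/sh
# Hypothetical helper: pick a slave index for a source/destination MAC
# pair. Simplification: only the last octet of each address is used.
xor_slave_index() {
    src_octet="${1##*:}"    # last octet of the source MAC
    dst_octet="${2##*:}"    # last octet of the destination MAC
    slave_count="$3"
    echo $(( (0x$src_octet ^ 0x$dst_octet) % $slave_count ))
}

# 0x1a XOR 0x2b = 0x31 = 49, and 49 modulo 2 = 1
xor_slave_index 52:54:00:12:34:1a 52:54:00:ab:cd:2b 2   # prints 1
```

Because the result depends only on the address pair, traffic between the same
two MAC addresses always leaves through the same slave.
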
If your switch supports the LACP (IEEE 802.3ad) protocol, then we recommend using
the corresponding bonding mode (802.3ad). Otherwise you should generally use the
active-backup mode. +
// http://lists.linux-ha.org/pipermail/linux-ha/2013-January/046295.html
If you intend to run your cluster network on the bonding interfaces, then you
have to use active-passive mode (active-backup) on the bonding interfaces;
other modes are unsupported.

The following bond configuration can be used as a distributed/shared
storage network. The benefit is that you get more speed and the
network will be fault-tolerant.

.Example: Use bond with fixed IP address
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

auto bond0
iface bond0 inet static
      bond-slaves eno1 eno2
      address  192.168.1.2/24
      bond-miimon 100
      bond-mode 802.3ad
      bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address  10.10.10.2/24
        gateway  10.10.10.1
        bridge-ports eno3
        bridge-stp off
        bridge-fd 0
----


[thumbnail="default-network-setup-bond.svg"]
Another possibility is to use the bond directly as a bridge port.
This can be used to make the guest network fault-tolerant.

.Example: Use a bond as bridge port
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
      bond-slaves eno1 eno2
      bond-miimon 100
      bond-mode 802.3ad
      bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address  10.10.10.2/24
        gateway  10.10.10.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
----

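
Once a bond like the ones above is active, the Linux bonding driver exposes its
state in a text file under `/proc/net/bonding/`. The following sketch
summarizes that file; the helper name `bond_summary` is hypothetical, it
assumes `awk` is available, and it relies on the `Bonding Mode:`,
`Slave Interface:` and `MII Status:` lines of the driver's status output.

```shell
#!/bin/sh
# Hypothetical helper: summarize a bond from its status file.
# The path defaults to the file the bonding driver provides for bond0.
bond_summary() {
    status_file="${1:-/proc/net/bonding/bond0}"
    awk -F': ' '
        /^Bonding Mode:/    { print "mode: " $2 }
        /^Slave Interface:/ { slave = $2 }
        # The bond-wide MII status comes before any slave and is skipped.
        /^MII Status:/      { if (slave != "") { print slave ": " $2; slave = "" } }
    ' "$status_file"
}
```

A slave reported as `down` while the bond itself keeps working is the situation
the fault-tolerant modes are designed for.
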
VLAN 802.1Q
~~~~~~~~~~~

A virtual LAN (VLAN) is a broadcast domain that is partitioned and
isolated in the network at layer two. So it is possible to have
multiple networks (up to 4094) in a physical network, each independent
of the other ones.

Each VLAN network is identified by a number, often called a 'tag'.
Network packets are then 'tagged' to identify which virtual network
they belong to.

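
The tag is a 12-bit number, and the values 0 and 4095 are reserved, which
leaves 1 to 4094 for regular use. The sketch below checks a tag against that
range and builds the device name in the `<device>.<tag>` form used in this
chapter. The function name `vlan_device_name` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build a VLAN device name such as eno1.50 after
# checking that the tag is a usable 802.1Q VLAN ID (1-4094).
vlan_device_name() {
    parent="$1"
    tag="$2"

    # Reject empty values and anything that is not a plain number.
    case "$tag" in
        ''|*[!0-9]*) echo "invalid VLAN tag: $tag" >&2; return 1 ;;
    esac

    if [ "$tag" -lt 1 ] || [ "$tag" -gt 4094 ]; then
        echo "VLAN tag out of range (1-4094): $tag" >&2
        return 1
    fi

    echo "$parent.$tag"
}

vlan_device_name eno1 50   # prints eno1.50
```
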
VLAN for Guest Networks
^^^^^^^^^^^^^^^^^^^^^^^

{pve} supports this setup out of the box. You can specify the VLAN tag
when you create a VM. The VLAN tag is part of the guest network
configuration. The networking layer supports different modes to
implement VLANs, depending on the bridge configuration:

* *VLAN awareness on the Linux bridge:*
In this case, each guest's virtual network card is assigned to a VLAN tag,
which is transparently supported by the Linux bridge.
Trunk mode is also possible, but that makes configuration
in the guest necessary.

* *"traditional" VLAN on the Linux bridge:*
In contrast to the VLAN awareness method, this method is not transparent
and creates a VLAN device with an associated bridge for each VLAN.
That is, creating a guest on VLAN 5, for example, would create two
interfaces, eno1.5 and vmbr0v5, which would remain until a reboot occurs.

* *Open vSwitch VLAN:*
This mode uses the OVS VLAN feature.

* *Guest configured VLAN:*
VLANs are assigned inside the guest. In this case, the setup is
completely done inside the guest and cannot be influenced from the
outside. The benefit is that you can use more than one VLAN on a
single virtual NIC.

VLAN on the Host
^^^^^^^^^^^^^^^^

To allow host communication with an isolated network, it is possible
to apply VLAN tags to any network device (NIC, bond, bridge). In
general, you should configure the VLAN on the interface with the fewest
abstraction layers between itself and the physical NIC.

For example, in a default configuration you may want to place
the host management address on a separate VLAN.

.Example: Use VLAN 5 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno1.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address  10.10.10.2/24
        gateway  10.10.10.1
        bridge-ports eno1.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

.Example: Use VLAN 5 for the {pve} management IP with VLAN aware Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0.5
iface vmbr0.5 inet static
        address  10.10.10.2/24
        gateway  10.10.10.1

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
----

The next example is the same setup, but a bond is used to
make this network fail-safe.

.Example: Use VLAN 5 with bond0 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
      bond-slaves eno1 eno2
      bond-miimon 100
      bond-mode 802.3ad
      bond-xmit-hash-policy layer2+3

iface bond0.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address  10.10.10.2/24
        gateway  10.10.10.1
        bridge-ports bond0.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
----

Disabling IPv6 on the Node
~~~~~~~~~~~~~~~~~~~~~~~~~~

{pve} works correctly in all environments, irrespective of whether IPv6 is
deployed or not. We recommend leaving all settings at the provided defaults.

Should you still need to disable support for IPv6 on your node, do so by
creating an appropriate `sysctl.conf(5)` snippet file and setting the proper
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt[sysctls],
for example by adding `/etc/sysctl.d/disable-ipv6.conf` with the following
content:

----
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
----

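
Snippet files in `/etc/sysctl.d/` are read at boot; running `sysctl --system`
loads them right away. To confirm the result, the kernel exposes the current
value under `/proc/sys/`. The following sketch reads it; the helper name
`ipv6_state` is hypothetical and only the `all` setting is checked.

```shell
#!/bin/sh
# Hypothetical helper: report whether IPv6 is disabled, based on the
# value the kernel exposes for the "all" setting.
ipv6_state() {
    flag_file="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
    if [ ! -r "$flag_file" ]; then
        echo "unknown"
    elif [ "$(cat "$flag_file")" = "1" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}
```
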
This method is preferred to disabling the loading of the IPv6 module on the
https://www.kernel.org/doc/Documentation/networking/ipv6.rst[kernel commandline].

////
TODO: explain IPv6 support?
TODO: explain OVS
////