[[sysadmin_network_configuration]]
Network Configuration
---------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

Network configuration can be done either via the GUI, or by manually
editing the file `/etc/network/interfaces`, which contains the
whole network configuration. The `interfaces(5)` manual page contains the
complete format description. All {pve} tools try hard to preserve direct
user modifications, but using the GUI is still preferable, because it
protects you from errors.

Once the network is configured, you can use the traditional Debian tools
`ifup` and `ifdown` to bring interfaces up and down.

Apply Network Changes
~~~~~~~~~~~~~~~~~~~~~

{pve} does not write changes directly to `/etc/network/interfaces`. Instead,
changes are written to a temporary file called `/etc/network/interfaces.new`.
This way you can make many related changes at once, and it also lets you
verify that your changes are correct before applying them, as a wrong network
configuration may render a node inaccessible.
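
If you want to review the pending changes from the shell before applying them,
you can, for example, diff the staged file against the live configuration (a
convenience sketch, assuming a pending `/etc/network/interfaces.new` exists):

----
diff -u /etc/network/interfaces /etc/network/interfaces.new
----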

Reboot Node to apply
^^^^^^^^^^^^^^^^^^^^

With the default installed `ifupdown` network managing package you need to
reboot to commit any pending network changes. Most of the time, the basic
{pve} network setup is stable and does not change often, so rebooting is
rarely required.

Reload Network with ifupdown2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

With the optional `ifupdown2` network managing package you can also reload the
network configuration live, without requiring a reboot.

Since {pve} 6.1 you can apply pending network changes over the web-interface,
using the 'Apply Configuration' button in the 'Network' panel of a node.
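
With 'ifupdown2' installed, the reload can also be triggered from the shell,
for example:

----
ifreload -a
----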

To install 'ifupdown2' ensure you have the latest {pve} updates installed.

WARNING: installing 'ifupdown2' will remove 'ifupdown', but as the removal
scripts of 'ifupdown' before version '0.8.35+pve1' have an issue where the
network is fully stopped on removal footnote:[Introduced with Debian Buster:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=945877] you *must* ensure
that you have an up-to-date 'ifupdown' package version.

For the installation itself you can then simply do:

 apt install ifupdown2

With that you're all set. You can also switch back to the 'ifupdown' variant
at any time, if you run into issues.
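
Switching back works the same way; installing 'ifupdown' will in turn replace
'ifupdown2' again:

 apt install ifupdown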

Naming Conventions
~~~~~~~~~~~~~~~~~~

We currently use the following naming conventions for device names:

* Ethernet devices: en*, systemd network interface names. This naming scheme
is used for new {pve} installations since version 5.0.

* Ethernet devices: eth[N], where 0 ≤ N (`eth0`, `eth1`, ...) This naming
scheme is used for {pve} hosts which were installed before the 5.0
release. When upgrading to 5.0, the names are kept as-is.

* Bridge names: vmbr[N], where 0 ≤ N ≤ 4094 (`vmbr0` - `vmbr4094`)

* Bonds: bond[N], where 0 ≤ N (`bond0`, `bond1`, ...)

* VLANs: Simply add the VLAN number to the device name,
separated by a period (`eno1.50`, `bond1.30`)

This makes it easier to debug network problems, because the device
name implies the device type.

Systemd Network Interface Names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Systemd uses the two-character prefix 'en' for Ethernet network
devices. The next characters depend on the device driver and on which
schema matches first:

* o<index>[n<phys_port_name>|d<dev_port>] — devices on board

* s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — device by hotplug id

* [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — devices by bus id

* x<MAC> — device by MAC address

The most common patterns are:

* eno1 — the first on-board NIC

* enp3s0f1 — the NIC on PCI bus 3, slot 0, using NIC function 1

For more information see https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/[Predictable Network Interface Names].
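
To list the interface names the kernel assigned on a given host, the standard
`iproute2` tooling can be used, for example:

----
ip -br link
----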

Choosing a network configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Depending on your current network organization and your resources you can
choose either a bridged, routed, or masquerading networking setup.

{pve} server in a private LAN, using an external gateway to reach the internet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The *Bridged* model makes the most sense in this case, and this is also
the default mode on new {pve} installations.
Each of your guest systems will have a virtual interface attached to the
{pve} bridge. This is similar in effect to having the guest network card
directly connected to a new switch on your LAN, with the {pve} host playing
the role of the switch.

{pve} server at hosting provider, with public IP ranges for Guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For this setup, you can use either a *Bridged* or *Routed* model, depending on
what your provider allows.

{pve} server at hosting provider, with a single public IP address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In that case the only way to get outgoing network access for your guest
systems is to use *Masquerading*. For incoming network access to your guests,
you will need to configure *Port Forwarding*.
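
As a minimal, hypothetical sketch of port forwarding with `iptables` (the
interface name `eno1` and the guest address `10.10.10.2` are assumptions),
lines like the following could be added to the bridge stanza in
`/etc/network/interfaces`:

----
# forward TCP port 8080 on the host to port 80 of a guest
post-up   iptables -t nat -A PREROUTING -i eno1 -p tcp --dport 8080 -j DNAT --to-destination 10.10.10.2:80
post-down iptables -t nat -D PREROUTING -i eno1 -p tcp --dport 8080 -j DNAT --to-destination 10.10.10.2:80
----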

For further flexibility, you can configure
VLANs (IEEE 802.1q) and network bonding, also known as "link
aggregation". That way it is possible to build complex and flexible
virtual networks.

Default Configuration using a Bridge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[thumbnail="default-network-setup-bridge.svg"]
Bridges are like physical network switches implemented in software.
All virtual guests can share a single bridge, or you can create multiple
bridges to separate network domains. Each host can have up to 4094 bridges.

The installation program creates a single bridge named `vmbr0`, which
is connected to the first Ethernet card. The corresponding
configuration in `/etc/network/interfaces` might look like this:

----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.2/24
        gateway 192.168.10.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

Virtual machines behave as if they were directly connected to the
physical network. The network, in turn, sees each virtual machine as
having its own MAC, even though there is only one network cable
connecting all of these VMs to the network.
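
To verify which ports are currently attached to a bridge, the `bridge` utility
from `iproute2` can be used, for example:

----
bridge link
----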

Routed Configuration
~~~~~~~~~~~~~~~~~~~~

Most hosting providers do not support the above setup. For security
reasons, they disable networking as soon as they detect multiple MAC
addresses on a single interface.

TIP: Some providers allow you to register additional MACs through their
management interface. This avoids the problem, but can be clumsy to
configure because you need to register a MAC for each of your VMs.

You can avoid the problem by ``routing'' all traffic via a single
interface. This makes sure that all network packets use the same MAC
address.

[thumbnail="default-network-setup-routed.svg"]
A common scenario is that you have a public IP (assume `198.51.100.5`
for this example), and an additional IP block for your VMs
(`203.0.113.16/28`). We recommend the following setup for such
situations:

----
auto lo
iface lo inet loopback

auto eno0
iface eno0 inet static
        address 198.51.100.5/29
        gateway 198.51.100.1
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up echo 1 > /proc/sys/net/ipv4/conf/eno0/proxy_arp

auto vmbr0
iface vmbr0 inet static
        address 203.0.113.17/28
        bridge-ports none
        bridge-stp off
        bridge-fd 0
----


Masquerading (NAT) with `iptables`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Masquerading allows guests that have only a private IP address to access the
network by using the host IP address for outgoing traffic. Each outgoing
packet is rewritten by `iptables` to appear as originating from the host,
and responses are rewritten accordingly to be routed back to the original
sender.

----
auto lo
iface lo inet loopback

auto eno1
#real IP address
iface eno1 inet static
        address 198.51.100.5/24
        gateway 198.51.100.1

auto vmbr0
#private sub network
iface vmbr0 inet static
        address 10.10.10.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0

        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
----

NOTE: In some masquerade setups with the firewall enabled, conntrack zones
might be needed for outgoing connections. Otherwise the firewall could block
outgoing connections, since they will prefer the `POSTROUTING` of the VM
bridge (and not `MASQUERADE`).

Adding these lines to `/etc/network/interfaces` can fix this problem:

----
post-up   iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
----

For more information about this, refer to the following links:

https://commons.wikimedia.org/wiki/File:Netfilter-packet-flow.svg[Netfilter Packet Flow]

https://lwn.net/Articles/370152/[Patch on netdev-list introducing conntrack zones]

https://blog.lobraun.de/2019/05/19/prox/[Blog post with a good explanation of using TRACE in the raw table]


Linux Bond
~~~~~~~~~~

Bonding (also called NIC teaming or Link Aggregation) is a technique
for binding multiple NICs to a single network device. It is possible
to achieve different goals, like making the network fault-tolerant or
increasing performance, or both together.

High-speed hardware like Fibre Channel and the associated switching
hardware can be quite expensive. By doing link aggregation, two NICs
can appear as one logical interface, resulting in up to double the
bandwidth. This is a native Linux kernel feature that is supported by
most switches. If your nodes have multiple Ethernet ports, you can
distribute your points of failure by running network cables to
different switches and the bonded connection will fail over to one
cable or the other in case of network trouble.

Aggregated links can reduce live-migration delays and improve the
speed of data replication between {pve} cluster nodes.

There are 7 modes for bonding:

* *Round-robin (balance-rr):* Transmit network packets in sequential
order from the first available network interface (NIC) slave through
the last. This mode provides load balancing and fault tolerance.

* *Active-backup (active-backup):* Only one NIC slave in the bond is
active. A different slave becomes active if, and only if, the active
slave fails. The single logical bonded interface's MAC address is
externally visible on only one NIC (port) to avoid distortion in the
network switch. This mode provides fault tolerance.

* *XOR (balance-xor):* Transmit network packets based on [(source MAC
address XOR'd with destination MAC address) modulo NIC slave
count]. This selects the same NIC slave for each destination MAC
address. This mode provides load balancing and fault tolerance.

* *Broadcast (broadcast):* Transmit network packets on all slave
network interfaces. This mode provides fault tolerance.

* *IEEE 802.3ad Dynamic link aggregation (802.3ad)(LACP):* Creates
aggregation groups that share the same speed and duplex
settings. Utilizes all slave network interfaces in the active
aggregator group according to the 802.3ad specification.

* *Adaptive transmit load balancing (balance-tlb):* Linux bonding
driver mode that does not require any special network-switch
support. The outgoing network packet traffic is distributed according
to the current load (computed relative to the speed) on each network
interface slave. Incoming traffic is received by one currently
designated slave network interface. If this receiving slave fails,
another slave takes over the MAC address of the failed receiving
slave.

* *Adaptive load balancing (balance-alb):* Includes balance-tlb plus receive
load balancing (rlb) for IPV4 traffic, and does not require any
special network switch support. The receive load balancing is achieved
by ARP negotiation. The bonding driver intercepts the ARP Replies sent
by the local system on their way out and overwrites the source
hardware address with the unique hardware address of one of the NIC
slaves in the single logical bonded interface such that different
network-peers use different MAC addresses for their network packet
traffic.

If your switch supports the LACP (IEEE 802.3ad) protocol then we recommend
using the corresponding bonding mode (802.3ad). Otherwise you should generally
use the active-backup mode. +
// http://lists.linux-ha.org/pipermail/linux-ha/2013-January/046295.html
If you intend to run your cluster network on the bonding interfaces, then you
have to use active-backup (active-passive) mode on the bonding interfaces;
other modes are unsupported.
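
As an illustration, a minimal active-backup bond (without an address of its
own) could look like the following sketch; the slave names `eno1`/`eno2` and
the `bond-primary` choice are assumptions to adapt to your hardware:

----
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode active-backup
        bond-primary eno1
----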

The following bond configuration can be used as the distributed/shared
storage network. The benefit would be that you get more speed and the
network will be fault-tolerant.

.Example: Use bond with fixed IP address
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

auto bond0
iface bond0 inet static
        bond-slaves eno1 eno2
        address 192.168.1.2/24
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno3
        bridge-stp off
        bridge-fd 0

----


[thumbnail="default-network-setup-bond.svg"]
Another possibility is to use the bond directly as the bridge port.
This can be used to make the guest network fault-tolerant.

.Example: Use a bond as bridge port
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

----


VLAN 802.1Q
~~~~~~~~~~~

A virtual LAN (VLAN) is a broadcast domain that is partitioned and
isolated in the network at layer two. So it is possible to have
multiple networks (up to 4096) in a physical network, each independent
of the other ones.

Each VLAN network is identified by a number often called 'tag'.
Network packets are then 'tagged' to identify which virtual network
they belong to.


VLAN for Guest Networks
^^^^^^^^^^^^^^^^^^^^^^^

{pve} supports this setup out of the box. You can specify the VLAN tag
when you create a VM. The VLAN tag is part of the guest network
configuration. The networking layer supports different modes to
implement VLANs, depending on the bridge configuration:

* *VLAN awareness on the Linux bridge:*
In this case, each guest's virtual network card is assigned to a VLAN tag,
which is transparently supported by the Linux bridge.
Trunk mode is also possible, but that makes configuration
in the guest necessary.

* *"traditional" VLAN on the Linux bridge:*
In contrast to the VLAN awareness method, this method is not transparent
and creates a VLAN device with an associated bridge for each VLAN.
That is, creating a guest on VLAN 5, for example, would create two
interfaces eno1.5 and vmbr0v5, which would remain until a reboot occurs.

* *Open vSwitch VLAN:*
This mode uses the OVS VLAN feature.

* *Guest configured VLAN:*
VLANs are assigned inside the guest. In this case, the setup is
completely done inside the guest and can not be influenced from the
outside. The benefit is that you can use more than one VLAN on a
single virtual NIC.
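
For example, on a VLAN-aware bridge, putting the first network device of a VM
on VLAN 10 could look like this (the VMID '100' and the tag are assumptions
for illustration):

----
qm set 100 --net0 virtio,bridge=vmbr0,tag=10
----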


VLAN on the Host
^^^^^^^^^^^^^^^^

To allow host communication with an isolated network, it is possible
to apply VLAN tags to any network device (NIC, bond, bridge). In
general, you should configure the VLAN on the interface with the least
abstraction layers between itself and the physical NIC.

The following examples show a default configuration where the host
management address is placed on a separate VLAN.


.Example: Use VLAN 5 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno1.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno1.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

----

.Example: Use VLAN 5 for the {pve} management IP with VLAN aware Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual


auto vmbr0.5
iface vmbr0.5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
----

The next example is the same setup, but a bond is used to
make this network fail-safe.

.Example: Use VLAN 5 with bond0 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

iface bond0.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

----

Disabling IPv6 on the Node
~~~~~~~~~~~~~~~~~~~~~~~~~~

{pve} works correctly in all environments, irrespective of whether IPv6 is
deployed or not. We recommend leaving all settings at the provided defaults.

Should you still need to disable support for IPv6 on your node, do so by
creating an appropriate `sysctl.conf(5)` snippet file and setting the proper
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt[sysctls],
for example adding `/etc/sysctl.d/disable-ipv6.conf` with content:

----
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
----
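
The snippet is applied automatically at boot. To load it immediately, without
rebooting, you can, for example, run:

----
sysctl --system
----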

This method is preferred to disabling the loading of the IPv6 module on the
https://www.kernel.org/doc/Documentation/networking/ipv6.rst[kernel command line].

////
TODO: explain IPv6 support?
TODO: explain OVS
////