[[sysadmin_network_configuration]]
Network Configuration
---------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

Network configuration can be done either via the GUI, or by manually
editing the file `/etc/network/interfaces`, which contains the
whole network configuration. The `interfaces(5)` manual page contains the
complete format description. All {pve} tools try hard to preserve direct
user modifications, but using the GUI is still preferable, because it
protects you from errors.

WARNING: It's discouraged to use the traditional Debian tools `ifup` and `ifdown`
if unsure, as they have some pitfalls, such as interrupting all guest traffic on
`ifdown vmbrX`, but not reconnecting those guests again when doing `ifup` on the
same bridge later.

Apply Network Changes
~~~~~~~~~~~~~~~~~~~~~

{pve} does not write changes directly to `/etc/network/interfaces`. Instead, we
write into a temporary file called `/etc/network/interfaces.new`, so that you
can make many related changes at once. This also allows you to ensure your
changes are correct before applying them, as a wrong network configuration may
render a node inaccessible.

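For example, you can review what is pending before applying it, or discard the
pending changes entirely by removing the temporary file. A minimal sketch using
standard tools (nothing {pve}-specific is assumed):

----
# show the difference between the active and the pending configuration
diff -u /etc/network/interfaces /etc/network/interfaces.new

# discard all pending changes
rm /etc/network/interfaces.new
----
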
Reboot Node to Apply
^^^^^^^^^^^^^^^^^^^^

With the default installed `ifupdown` network managing package, you need to
reboot to commit any pending network changes. Most of the time, the basic {pve}
network setup is stable and does not change often, so rebooting should rarely
be required.

Reload Network with ifupdown2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

With the optional `ifupdown2` network managing package, you can also reload the
network configuration live, without requiring a reboot.

Since {pve} 6.1 you can apply pending network changes over the web-interface,
using the 'Apply Configuration' button in the 'Network' panel of a node.

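On the command line, the same can be done with the `ifreload` tool that ships
with 'ifupdown2':

----
# re-read /etc/network/interfaces and apply all pending changes live
ifreload -a
----
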
To install 'ifupdown2', first ensure that you have the latest {pve} updates
installed.

WARNING: installing 'ifupdown2' will remove 'ifupdown', but as the removal
scripts of 'ifupdown' before version '0.8.35+pve1' have an issue where the
network is fully stopped on removal footnote:[Introduced with Debian Buster:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=945877] you *must* ensure
that you have an up-to-date 'ifupdown' package version.

For the installation itself, you can then simply do:

 apt install ifupdown2

With that you're all set. You can also switch back to the 'ifupdown' variant at
any time, if you run into issues.

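Switching back works the same way; reinstalling 'ifupdown' removes 'ifupdown2'
again, due to the package conflict:

 apt install ifupdown
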
Naming Conventions
~~~~~~~~~~~~~~~~~~

We currently use the following naming conventions for device names:

* Ethernet devices: en*, systemd network interface names. This naming scheme is
used for new {pve} installations since version 5.0.

* Ethernet devices: eth[N], where 0 ≤ N (`eth0`, `eth1`, ...) This naming
scheme is used for {pve} hosts which were installed before the 5.0
release. When upgrading to 5.0, the names are kept as-is.

* Bridge names: vmbr[N], where 0 ≤ N ≤ 4094 (`vmbr0` - `vmbr4094`)

* Bonds: bond[N], where 0 ≤ N (`bond0`, `bond1`, ...)

* VLANs: Simply add the VLAN number to the device name,
separated by a period (`eno1.50`, `bond1.30`)

This makes it easier to debug network problems, because the device
name implies the device type.

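To get a quick overview of the device names actually present on a node, you can
use the `ip` tool from 'iproute2':

----
# brief one-line-per-device listing of all network interfaces
ip -br link
----
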
Systemd Network Interface Names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Systemd uses the two-character prefix 'en' for Ethernet network
devices. The next characters depend on the device driver and on
which schema matches first.

* o<index>[n<phys_port_name>|d<dev_port>] — devices on board

* s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — devices by hotplug id

* [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — devices by bus id

* x<MAC> — devices by MAC address

The most common patterns are:

* eno1 — is the first on-board NIC

* enp3s0f1 — is the NIC on PCI bus 3, slot 0, using NIC function 1.

For more information see https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/[Predictable Network Interface Names].

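If you want to see which names the different schemas would derive for a given
device, you can query systemd's 'net_id' builtin directly; a sketch, assuming a
device called `eno1` exists:

----
# print the predictable names derived for this device
udevadm test-builtin net_id /sys/class/net/eno1
----
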
Choosing a network configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Depending on your current network organization and your resources, you can
choose either a bridged, routed, or masquerading networking setup.

{pve} server in a private LAN, using an external gateway to reach the internet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The *Bridged* model makes the most sense in this case, and this is also
the default mode on new {pve} installations.
Each of your guest systems will have a virtual interface attached to the
{pve} bridge. This is similar in effect to having the guest network card
directly connected to a new switch on your LAN, with the {pve} host playing
the role of the switch.

{pve} server at hosting provider, with public IP ranges for Guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For this setup, you can use either a *Bridged* or *Routed* model, depending on
what your provider allows.

{pve} server at hosting provider, with a single public IP address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In that case the only way to get outgoing network access for your guest
systems is to use *Masquerading*. For incoming network access to your guests,
you will need to configure *Port Forwarding*.

For further flexibility, you can configure
VLANs (IEEE 802.1q) and network bonding, also known as "link
aggregation". That way it is possible to build complex and flexible
virtual networks.

Default Configuration using a Bridge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[thumbnail="default-network-setup-bridge.svg"]
Bridges are like physical network switches implemented in software.
All virtual guests can share a single bridge, or you can create multiple
bridges to separate network domains. Each host can have up to 4094 bridges.

The installation program creates a single bridge named `vmbr0`, which
is connected to the first Ethernet card. The corresponding
configuration in `/etc/network/interfaces` might look like this:

----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.2/24
        gateway 192.168.10.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

Virtual machines behave as if they were directly connected to the
physical network. The network, in turn, sees each virtual machine as
having its own MAC, even though there is only one network cable
connecting all of these VMs to the network.

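Attaching a guest to the bridge only requires referencing the bridge name in
the guest's network device configuration. A minimal sketch, assuming a VM with
the (illustrative) ID `100`:

----
# attach the VM's first network device to vmbr0, using the virtio NIC model
qm set 100 --net0 virtio,bridge=vmbr0
----
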
Routed Configuration
~~~~~~~~~~~~~~~~~~~~

Most hosting providers do not support the above setup. For security
reasons, they disable networking as soon as they detect multiple MAC
addresses on a single interface.

TIP: Some providers allow you to register additional MACs through their
management interface. This avoids the problem, but can be clumsy to
configure because you need to register a MAC for each of your VMs.

You can avoid the problem by ``routing'' all traffic via a single
interface. This makes sure that all network packets use the same MAC
address.

[thumbnail="default-network-setup-routed.svg"]
A common scenario is that you have a public IP (assume `198.51.100.5`
for this example), and an additional IP block for your VMs
(`203.0.113.16/28`). We recommend the following setup for such
situations:

----
auto lo
iface lo inet loopback

auto eno0
iface eno0 inet static
        address 198.51.100.5/29
        gateway 198.51.100.1
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up echo 1 > /proc/sys/net/ipv4/conf/eno0/proxy_arp


auto vmbr0
iface vmbr0 inet static
        address 203.0.113.17/28
        bridge-ports none
        bridge-stp off
        bridge-fd 0
----


Masquerading (NAT) with `iptables`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Masquerading allows guests having only a private IP address to access the
network by using the host IP address for outgoing traffic. Each outgoing
packet is rewritten by `iptables` to appear as originating from the host,
and responses are rewritten accordingly to be routed to the original sender.

----
auto lo
iface lo inet loopback

auto eno1
#real IP address
iface eno1 inet static
        address 198.51.100.5/24
        gateway 198.51.100.1

auto vmbr0
#private sub network
iface vmbr0 inet static
        address 10.10.10.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0

        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
----

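For incoming network access to a guest (the *Port Forwarding* mentioned in the
overview above), destination NAT can be added in the same style. A sketch
matching the addresses of the example above; the guest IP `10.10.10.2` and the
port numbers are assumptions:

----
post-up   iptables -t nat -A PREROUTING -i eno1 -p tcp --dport 2222 -j DNAT --to-destination 10.10.10.2:22
post-down iptables -t nat -D PREROUTING -i eno1 -p tcp --dport 2222 -j DNAT --to-destination 10.10.10.2:22
----

This would forward TCP port 2222 on the host to SSH (port 22) on the guest.
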
NOTE: In some masquerade setups with the firewall enabled, conntrack zones might
be needed for outgoing connections. Otherwise the firewall could block outgoing
connections, since they will prefer the `POSTROUTING` of the VM bridge (and not
`MASQUERADE`).

Adding these lines to `/etc/network/interfaces` can fix this problem:

----
post-up   iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
----

For more information about this, refer to the following links:

https://commons.wikimedia.org/wiki/File:Netfilter-packet-flow.svg[Netfilter Packet Flow]

https://lwn.net/Articles/370152/[Patch on netdev-list introducing conntrack zones]

https://blog.lobraun.de/2019/05/19/prox/[Blog post with a good explanation using TRACE in the raw table]


Linux Bond
~~~~~~~~~~

Bonding (also called NIC teaming or Link Aggregation) is a technique
for binding multiple NICs to a single network device. It is possible
to achieve different goals, like making the network fault-tolerant,
increasing performance, or both together.

High-speed hardware like Fibre Channel and the associated switching
hardware can be quite expensive. By doing link aggregation, two NICs
can appear as one logical interface, resulting in double speed. This
is a native Linux kernel feature that is supported by most
switches. If your nodes have multiple Ethernet ports, you can
distribute your points of failure by running network cables to
different switches, and the bonded connection will fail over to one
cable or the other in case of network trouble.

Aggregated links can reduce live-migration delays and increase the
speed of data replication between Proxmox VE cluster nodes.

There are 7 modes for bonding:

* *Round-robin (balance-rr):* Transmit network packets in sequential
order from the first available network interface (NIC) slave through
the last. This mode provides load balancing and fault tolerance.

* *Active-backup (active-backup):* Only one NIC slave in the bond is
active. A different slave becomes active if, and only if, the active
slave fails. The single logical bonded interface's MAC address is
externally visible on only one NIC (port), to avoid confusing the
network switch. This mode provides fault tolerance.

* *XOR (balance-xor):* Transmit network packets based on [(source MAC
address XOR'd with destination MAC address) modulo NIC slave
count]. This selects the same NIC slave for each destination MAC
address. This mode provides load balancing and fault tolerance.

* *Broadcast (broadcast):* Transmit network packets on all slave
network interfaces. This mode provides fault tolerance.

* *IEEE 802.3ad Dynamic link aggregation (802.3ad)(LACP):* Creates
aggregation groups that share the same speed and duplex
settings. Utilizes all slave network interfaces in the active
aggregator group according to the 802.3ad specification.

* *Adaptive transmit load balancing (balance-tlb):* Linux bonding
driver mode that does not require any special network-switch
support. The outgoing network packet traffic is distributed according
to the current load (computed relative to the speed) on each network
interface slave. Incoming traffic is received by one currently
designated slave network interface. If this receiving slave fails,
another slave takes over the MAC address of the failed receiving
slave.

* *Adaptive load balancing (balance-alb):* Includes balance-tlb plus receive
load balancing (rlb) for IPV4 traffic, and does not require any
special network switch support. The receive load balancing is achieved
by ARP negotiation. The bonding driver intercepts the ARP Replies sent
by the local system on their way out and overwrites the source
hardware address with the unique hardware address of one of the NIC
slaves in the single logical bonded interface, such that different
network peers use different MAC addresses for their network packet
traffic.

If your switch supports the LACP (IEEE 802.3ad) protocol, then we recommend
using the corresponding bonding mode (802.3ad). Otherwise you should generally
use the active-backup mode. +
// http://lists.linux-ha.org/pipermail/linux-ha/2013-January/046295.html
If you intend to run your cluster network on the bonding interfaces, then you
have to use active-backup mode on the bonding interfaces; other modes are
unsupported. A sketch of such a bond follows below.

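For a cluster network, an active-backup bond could look like the following
minimal sketch; the interface names and the address are assumptions, and the
optional `bond-primary` selects the preferred slave:

----
auto bond0
iface bond0 inet static
        bond-slaves eno1 eno2
        address 192.168.2.2/24
        bond-miimon 100
        bond-mode active-backup
        bond-primary eno1
----
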
The following bond configuration can be used as a distributed/shared
storage network. The benefit would be that you get more speed and the
network will be fault-tolerant.

.Example: Use bond with fixed IP address
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

auto bond0
iface bond0 inet static
        bond-slaves eno1 eno2
        address 192.168.1.2/24
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno3
        bridge-stp off
        bridge-fd 0

----


[thumbnail="default-network-setup-bond.svg"]
Another possibility is to use the bond directly as the bridge port.
This can be used to make the guest network fault-tolerant.

.Example: Use a bond as bridge port
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

----


VLAN 802.1Q
~~~~~~~~~~~

A virtual LAN (VLAN) is a broadcast domain that is partitioned and
isolated in the network at layer two. So it is possible to have
multiple networks (4096) in a physical network, each independent of
the other ones.

Each VLAN network is identified by a number often called 'tag'.
Network packets are then 'tagged' to identify which virtual network
they belong to.

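To illustrate tagging on plain Linux, a VLAN device can also be created
manually with 'iproute2'; a sketch, assuming tag 50 on `eno1` (unlike the
`/etc/network/interfaces` examples below, this does not persist across
reboots):

----
# create a VLAN device with tag 50 on top of eno1 and bring it up
ip link add link eno1 name eno1.50 type vlan id 50
ip link set eno1.50 up
----
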
VLAN for Guest Networks
^^^^^^^^^^^^^^^^^^^^^^^

{pve} supports this setup out of the box. You can specify the VLAN tag
when you create a VM. The VLAN tag is part of the guest network
configuration. The networking layer supports different modes to
implement VLANs, depending on the bridge configuration:

* *VLAN awareness on the Linux bridge:*
In this case, each guest's virtual network card is assigned to a VLAN tag,
which is transparently supported by the Linux bridge (see the sketch after
this list).
Trunk mode is also possible, but that makes configuration
in the guest necessary.

* *"traditional" VLAN on the Linux bridge:*
In contrast to the VLAN awareness method, this method is not transparent
and creates a VLAN device with an associated bridge for each VLAN.
That is, creating a guest on VLAN 5, for example, would create two
interfaces, eno1.5 and vmbr0v5, which would remain until a reboot occurs.

* *Open vSwitch VLAN:*
This mode uses the OVS VLAN feature.

* *Guest configured VLAN:*
VLANs are assigned inside the guest. In this case, the setup is
completely done inside the guest and can not be influenced from the
outside. The benefit is that you can use more than one VLAN on a
single virtual NIC.

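In the VLAN-aware case, no extra host interfaces are needed; the tag is set
directly on the guest's network device. A minimal sketch, assuming a VLAN-aware
`vmbr0` and a VM with the (illustrative) ID `100`:

----
# attach the VM's first network device to vmbr0 with VLAN tag 5
qm set 100 --net0 virtio,bridge=vmbr0,tag=5
----
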
VLAN on the Host
^^^^^^^^^^^^^^^^

To allow host communication with an isolated network, it is possible
to apply VLAN tags to any network device (NIC, Bond, Bridge). In
general, you should configure the VLAN on the interface with the least
abstraction layers between itself and the physical NIC.

For example, in a default configuration, you may want to place
the host management address on a separate VLAN.


.Example: Use VLAN 5 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno1.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno1.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

----

.Example: Use VLAN 5 for the {pve} management IP with VLAN aware Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual


auto vmbr0.5
iface vmbr0.5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
----

The next example is the same setup, but a bond is used to
make this network fail-safe.

.Example: Use VLAN 5 with bond0 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

iface bond0.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

----

Disabling IPv6 on the Node
~~~~~~~~~~~~~~~~~~~~~~~~~~

{pve} works correctly in all environments, irrespective of whether IPv6 is
deployed or not. We recommend leaving all settings at the provided defaults.

Should you still need to disable support for IPv6 on your node, do so by
creating an appropriate `sysctl.conf(5)` snippet file and setting the proper
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt[sysctls],
for example adding `/etc/sysctl.d/disable-ipv6.conf` with content:

----
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
----

This method is preferred over disabling the loading of the IPv6 module on the
https://www.kernel.org/doc/Documentation/networking/ipv6.rst[kernel commandline].

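The settings take effect at the next boot; to apply them immediately, the
sysctl configuration can be reloaded by hand:

----
# load all sysctl configuration files, including the new snippet
sysctl --system
----
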
////
TODO: explain IPv6 support?
TODO: explain OVS
////