[[sysadmin_network_configuration]]
Network Configuration
---------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

{pve} uses the Linux network stack. This provides a lot of flexibility on
how to set up the network on the {pve} nodes. The configuration can be done
either via the GUI, or by manually editing the file `/etc/network/interfaces`,
which contains the whole network configuration. The `interfaces(5)` manual
page contains the complete format description. All {pve} tools try hard to
preserve direct user modifications, but using the GUI is still preferable,
because it protects you from errors.

A 'vmbr' interface is needed to connect guests to the underlying physical
network. It is a Linux bridge, which can be thought of as a virtual switch
to which the guests and physical interfaces are connected. This section
provides some examples of how the network can be set up to accommodate
different use cases like redundancy with a xref:sysadmin_network_bond['bond'],
xref:sysadmin_network_vlan['vlans'] or
xref:sysadmin_network_routed['routed'] and
xref:sysadmin_network_masquerading['NAT'] setups.

The xref:chapter_pvesdn[Software Defined Network] is an option for more complex
virtual networks in {pve} clusters.

WARNING: Using the traditional Debian tools `ifup` and `ifdown` is discouraged
if you are unsure, as they have some pitfalls, like interrupting all guest
traffic on `ifdown vmbrX` without reconnecting those guests when `ifup` is
later run on the same bridge.

Apply Network Changes
~~~~~~~~~~~~~~~~~~~~~

{pve} does not write changes directly to `/etc/network/interfaces`. Instead, we
write into a temporary file called `/etc/network/interfaces.new`. This way you
can make many related changes at once. It also allows you to ensure that your
changes are correct before applying them, as a wrong network configuration may
render a node inaccessible.
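
Before applying staged changes, it can help to review how the staging file
differs from the active one. The following is a minimal sketch using the
standard `diff` tool. The function name `show_pending_network_changes` is
hypothetical and not part of {pve}.

[source,bash]
----
# Hypothetical helper, not part of Proxmox VE: print a unified diff between
# the active network configuration and the staged one.
# Both arguments are optional and default to the paths named above.
show_pending_network_changes() {
    active="${1:-/etc/network/interfaces}"
    staged="${2:-/etc/network/interfaces.new}"

    # Without a staging file there is nothing pending.
    if [ ! -e "$staged" ]; then
        echo "no staged changes"
        return 0
    fi

    # diff exits with status 1 when the files differ; that is expected here.
    diff -u "$active" "$staged" || true
}
----

Calling `show_pending_network_changes` without arguments compares the two
default files.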

Live-Reload Network with ifupdown2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

With the recommended 'ifupdown2' package (default for new installations since
{pve} 7.0), it is possible to apply network configuration changes without a
reboot. If you change the network configuration via the GUI, you can click the
'Apply Configuration' button. This will move changes from the staging
`interfaces.new` file to `/etc/network/interfaces` and apply them live.

If you made manual changes directly to the `/etc/network/interfaces` file, you
can apply them by running `ifreload -a`.

NOTE: If you installed {pve} on top of Debian, or upgraded to {pve} 7.0 from an
older {pve} installation, make sure 'ifupdown2' is installed:
`apt install ifupdown2`

Reboot Node to Apply
^^^^^^^^^^^^^^^^^^^^

Another way to apply a new network configuration is to reboot the node.
In that case, the systemd service `pvenetcommit` activates the staging
`interfaces.new` file before the `networking` service applies that
configuration.

Naming Conventions
~~~~~~~~~~~~~~~~~~

We currently use the following naming conventions for device names:

* Ethernet devices: en*, systemd network interface names. This naming scheme is
 used for new {pve} installations since version 5.0.

* Ethernet devices: eth[N], where 0 ≤ N (`eth0`, `eth1`, ...). This naming
scheme is used for {pve} hosts which were installed before the 5.0
release. When upgrading to 5.0, the names are kept as-is.

* Bridge names: vmbr[N], where 0 ≤ N ≤ 4094 (`vmbr0` - `vmbr4094`)

* Bonds: bond[N], where 0 ≤ N (`bond0`, `bond1`, ...)

* VLANs: Simply add the VLAN number to the device name,
 separated by a period (`eno1.50`, `bond1.30`)

This makes it easier to debug network problems, because the device
name implies the device type.

Systemd Network Interface Names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Systemd uses the two-character prefix 'en' for Ethernet network
devices. The following characters depend on the device driver and on
which naming schema matches first.

* o<index>[n<phys_port_name>|d<dev_port>]: devices on board

* s<slot>[f<function>][n<phys_port_name>|d<dev_port>]: device by hotplug id

* [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>]: devices by bus id

* x<MAC>: device by MAC address

The most common patterns are:

* eno1: the first on-board NIC

* enp3s0f1: function 1 of the NIC on PCI bus 3, slot 0

For more information see https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/[Predictable Network Interface Names].
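
The prefix rules above can be sketched as a small shell function that guesses
the naming schema from an interface name. The function name `classify_ifname`
is hypothetical. It only inspects the prefix and does not query systemd or
the kernel.

[source,bash]
----
# Hypothetical helper: map an Ethernet interface name to the naming schema
# suggested by its prefix. Only the leading characters are inspected.
classify_ifname() {
    case "$1" in
        eno*)      echo "on-board device" ;;
        ens*)      echo "hotplug slot" ;;
        enP*|enp*) echo "bus id" ;;
        enx*)      echo "MAC address" ;;
        eth*)      echo "legacy eth[N] name" ;;
        *)         echo "unknown" ;;
    esac
}
----

For example, `classify_ifname enp3s0f1` reports the bus id schema, matching
the second common pattern listed above.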

Choosing a network configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Depending on your current network organization and your resources you can
choose either a bridged, routed, or masquerading networking setup.

{pve} server in a private LAN, using an external gateway to reach the internet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The *Bridged* model makes the most sense in this case, and this is also
the default mode on new {pve} installations.
Each of your guest systems will have a virtual interface attached to the
{pve} bridge. This is similar in effect to having the guest network card
directly connected to a new switch on your LAN, with the {pve} host playing
the role of the switch.

{pve} server at hosting provider, with public IP ranges for Guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For this setup, you can use either a *Bridged* or *Routed* model, depending on
what your provider allows.

{pve} server at hosting provider, with a single public IP address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In that case the only way to get outgoing network access for your guest
systems is to use *Masquerading*. For incoming network access to your guests,
you will need to configure *Port Forwarding*.

For further flexibility, you can configure
VLANs (IEEE 802.1q) and network bonding, also known as "link
aggregation". That way it is possible to build complex and flexible
virtual networks.

Default Configuration using a Bridge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[thumbnail="default-network-setup-bridge.svg"]
Bridges are like physical network switches implemented in software.
All virtual guests can share a single bridge, or you can create multiple
bridges to separate network domains. Each host can have up to 4094 bridges.

The installation program creates a single bridge named `vmbr0`, which
is connected to the first Ethernet card. The corresponding
configuration in `/etc/network/interfaces` might look like this:

----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.2/24
        gateway 192.168.10.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

Virtual machines behave as if they were directly connected to the
physical network. The network, in turn, sees each virtual machine as
having its own MAC, even though there is only one network cable
connecting all of these VMs to the network.

[[sysadmin_network_routed]]
Routed Configuration
~~~~~~~~~~~~~~~~~~~~

Most hosting providers do not support the above setup. For security
reasons, they disable networking as soon as they detect multiple MAC
addresses on a single interface.

TIP: Some providers allow you to register additional MACs through their
management interface. This avoids the problem, but can be clumsy to
configure because you need to register a MAC for each of your VMs.

You can avoid the problem by ``routing'' all traffic via a single
interface. This makes sure that all network packets use the same MAC
address.

[thumbnail="default-network-setup-routed.svg"]
A common scenario is that you have a public IP (assume `198.51.100.5`
for this example), and an additional IP block for your VMs
(`203.0.113.16/28`). We recommend the following setup for such
situations:

----
auto lo
iface lo inet loopback

auto eno0
iface eno0 inet static
        address 198.51.100.5/29
        gateway 198.51.100.1
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up echo 1 > /proc/sys/net/ipv4/conf/eno0/proxy_arp

auto vmbr0
iface vmbr0 inet static
        address 203.0.113.17/28
        bridge-ports none
        bridge-stp off
        bridge-fd 0
----
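
Inside a guest attached to `vmbr0`, the matching configuration uses a free
address from the additional IP block and the bridge address of the host as
gateway. The following is a minimal sketch for a Debian-style guest. The
interface name `ens18` and the address `203.0.113.18` are assumptions chosen
for illustration.

----
# Guest-side /etc/network/interfaces (illustrative values, see above)
auto ens18
iface ens18 inet static
        address 203.0.113.18/28
        gateway 203.0.113.17
----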

[[sysadmin_network_masquerading]]
Masquerading (NAT) with `iptables`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Masquerading allows guests that have only a private IP address to access the
network by using the host IP address for outgoing traffic. Each outgoing
packet is rewritten by `iptables` to appear as originating from the host,
and responses are rewritten accordingly to be routed to the original sender.

----
auto lo
iface lo inet loopback

auto eno1
#real IP address
iface eno1 inet static
        address 198.51.100.5/24
        gateway 198.51.100.1

auto vmbr0
#private sub network
iface vmbr0 inet static
        address 10.10.10.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0

        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
----

NOTE: In some masquerade setups with firewall enabled, conntrack zones might be
needed for outgoing connections. Otherwise the firewall could block outgoing
connections since they will prefer the `POSTROUTING` of the VM bridge (and not
`MASQUERADE`).

Adding these lines to `/etc/network/interfaces` can fix this problem:

----
post-up   iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
----

For more information about this, refer to the following links:

https://commons.wikimedia.org/wiki/File:Netfilter-packet-flow.svg[Netfilter Packet Flow]

https://lwn.net/Articles/370152/[Patch on netdev-list introducing conntrack zones]

https://blog.lobraun.de/2019/05/19/prox/[Blog post with a good explanation by using TRACE in the raw table]
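
Incoming connections to guests behind the masqueraded bridge need a
destination NAT rule in addition. As a sketch, assume a guest at
`10.10.10.10` whose SSH port should be reachable through port `2222` of the
host. Both values are hypothetical. Lines like the following could then be
added to the `vmbr0` stanza of the masquerading example:

----
# Illustrative values: forward TCP port 2222 arriving on eno1 to port 22
# of the guest at 10.10.10.10.
        post-up   iptables -t nat -A PREROUTING -i eno1 -p tcp --dport 2222 -j DNAT --to-destination 10.10.10.10:22
        post-down iptables -t nat -D PREROUTING -i eno1 -p tcp --dport 2222 -j DNAT --to-destination 10.10.10.10:22
----

The `post-down` line removes the rule again when the bridge is taken down,
mirroring the way the `MASQUERADE` rule is handled.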

[[sysadmin_network_bond]]
Linux Bond
~~~~~~~~~~

Bonding (also called NIC teaming or Link Aggregation) is a technique
for binding multiple NICs to a single network device. It can be used
to achieve different goals, like making the network fault-tolerant,
increasing the performance, or both together.

High-speed hardware like Fibre Channel and the associated switching
hardware can be quite expensive. By doing link aggregation, two NICs
can appear as one logical interface, resulting in up to double the
bandwidth. This is a native Linux kernel feature that is supported by most
switches. If your nodes have multiple Ethernet ports, you can
distribute your points of failure by running network cables to
different switches, and the bonded connection will fail over to one
cable or the other in case of network trouble.

Aggregated links can reduce live-migration delays and improve the
speed of data replication between {pve} cluster nodes.

There are 7 modes for bonding:

* *Round-robin (balance-rr):* Transmit network packets in sequential
order from the first available network interface (NIC) slave through
the last. This mode provides load balancing and fault tolerance.

* *Active-backup (active-backup):* Only one NIC slave in the bond is
active. A different slave becomes active if, and only if, the active
slave fails. The single logical bonded interface's MAC address is
externally visible on only one NIC (port) to avoid distortion in the
network switch. This mode provides fault tolerance.

* *XOR (balance-xor):* Transmit network packets based on [(source MAC
address XOR'd with destination MAC address) modulo NIC slave
count]. This selects the same NIC slave for each destination MAC
address. This mode provides load balancing and fault tolerance.

* *Broadcast (broadcast):* Transmit network packets on all slave
network interfaces. This mode provides fault tolerance.

* *IEEE 802.3ad Dynamic link aggregation (802.3ad)(LACP):* Creates
aggregation groups that share the same speed and duplex
settings. Utilizes all slave network interfaces in the active
aggregator group according to the 802.3ad specification.

* *Adaptive transmit load balancing (balance-tlb):* Linux bonding
driver mode that does not require any special network-switch
support. The outgoing network packet traffic is distributed according
to the current load (computed relative to the speed) on each network
interface slave. Incoming traffic is received by one currently
designated slave network interface. If this receiving slave fails,
another slave takes over the MAC address of the failed receiving
slave.

* *Adaptive load balancing (balance-alb):* Includes balance-tlb plus receive
load balancing (rlb) for IPv4 traffic, and does not require any
special network switch support. The receive load balancing is achieved
by ARP negotiation. The bonding driver intercepts the ARP replies sent
by the local system on their way out and overwrites the source
hardware address with the unique hardware address of one of the NIC
slaves in the single logical bonded interface, such that different
network peers use different MAC addresses for their network packet
traffic.

If your switch supports the LACP (IEEE 802.3ad) protocol, then we recommend
using the corresponding bonding mode (802.3ad). Otherwise you should generally
use the active-backup mode. +
// http://lists.linux-ha.org/pipermail/linux-ha/2013-January/046295.html
If you intend to run your cluster network on the bonding interfaces, then you
have to use active-passive mode on the bonding interfaces. Other modes are
unsupported.
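
For switches without LACP support, an active-backup bond might look like the
following sketch. It follows the layout of the other bond examples in this
section. The choice of `eno1` as the primary port is an assumption made for
illustration.

----
# Sketch only: eno1 is assumed to be the preferred (primary) port.
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode active-backup
        bond-primary eno1
----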

The following bond configuration can be used as a distributed/shared
storage network. The benefit is that you get more speed and the
network is fault-tolerant.

.Example: Use bond with fixed IP address
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

auto bond0
iface bond0 inet static
        bond-slaves eno1 eno2
        address 192.168.1.2/24
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno3
        bridge-stp off
        bridge-fd 0
----

[thumbnail="default-network-setup-bond.svg"]
Another possibility is to use the bond directly as a bridge port.
This can be used to make the guest network fault-tolerant.

.Example: Use a bond as bridge port
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
----
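
To check which mode and member ports a bond is actually using, the Linux
bonding driver exposes a status file under `/proc/net/bonding/`. The helper
below is a minimal sketch that prints the most relevant lines of that file.
The function name `bond_summary` is hypothetical, and `bond0` is only assumed
as the default bond name.

[source,bash]
----
# Hypothetical helper: print the bonding mode, the member interfaces and
# the link status lines from a bonding status file.
# The optional argument is the status file to read.
bond_summary() {
    status_file="${1:-/proc/net/bonding/bond0}"
    grep -E '^(Bonding Mode|Slave Interface|MII Status):' "$status_file"
}
----

Running `bond_summary` on a node with the configuration above should list the
bonding mode followed by one entry per member port.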

[[sysadmin_network_vlan]]
VLAN 802.1Q
~~~~~~~~~~~

A virtual LAN (VLAN) is a broadcast domain that is partitioned and
isolated in the network at layer two. So it is possible to have
multiple networks (4096) in a physical network, each independent of
the other ones.

Each VLAN network is identified by a number often called 'tag'.
Network packets are then 'tagged' to identify which virtual network
they belong to.

VLAN for Guest Networks
^^^^^^^^^^^^^^^^^^^^^^^

{pve} supports this setup out of the box. You can specify the VLAN tag
when you create a VM. The VLAN tag is part of the guest network
configuration. The networking layer supports different modes to
implement VLANs, depending on the bridge configuration:

* *VLAN awareness on the Linux bridge:*
In this case, each guest's virtual network card is assigned to a VLAN tag,
which is transparently supported by the Linux bridge.
Trunk mode is also possible, but that makes configuration
in the guest necessary.

* *"traditional" VLAN on the Linux bridge:*
In contrast to the VLAN awareness method, this method is not transparent
and creates a VLAN device with an associated bridge for each VLAN.
That is, creating a guest on VLAN 5, for example, would create two
interfaces, eno1.5 and vmbr0v5, which would remain until a reboot occurs.

* *Open vSwitch VLAN:*
This mode uses the OVS VLAN feature.

* *Guest configured VLAN:*
VLANs are assigned inside the guest. In this case, the setup is
completely done inside the guest and cannot be influenced from the
outside. The benefit is that you can use more than one VLAN on a
single virtual NIC.
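
As an illustration of the guest network configuration, the VLAN tag appears
as the `tag` option of a virtual network device. The excerpt below sketches
such a line from a VM configuration file. The MAC address, the bridge name
and the tag value are hypothetical.

----
# Excerpt of a VM configuration (all values are illustrative)
net0: virtio=BC:24:11:00:00:01,bridge=vmbr0,tag=5
----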

VLAN on the Host
^^^^^^^^^^^^^^^^

To allow host communication with an isolated network, it is possible
to apply VLAN tags to any network device (NIC, Bond, Bridge). In
general, you should configure the VLAN on the interface with the least
abstraction layers between itself and the physical NIC.

For example, in a default configuration, you may want to place
the host management address on a separate VLAN.

.Example: Use VLAN 5 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno1.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno1.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

.Example: Use VLAN 5 for the {pve} management IP with VLAN aware Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0.5
iface vmbr0.5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
----

The next example is the same setup but a bond is used to
make this network fail-safe.

.Example: Use VLAN 5 with bond0 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

iface bond0.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
----

Disabling IPv6 on the Node
~~~~~~~~~~~~~~~~~~~~~~~~~~

{pve} works correctly in all environments, irrespective of whether IPv6 is
deployed or not. We recommend leaving all settings at the provided defaults.

Should you still need to disable support for IPv6 on your node, do so by
creating an appropriate `sysctl.conf(5)` snippet file and setting the proper
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt[sysctls],
for example adding `/etc/sysctl.d/disable-ipv6.conf` with content:

----
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
----
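
Files in `/etc/sysctl.d` are read at boot. To load them without a reboot,
`sysctl --system` can be used. Whether the setting took effect can then be
checked with a small helper like the following sketch. The function name
`ipv6_disabled` is hypothetical, and the default path assumes the `all`
setting from the snippet above.

[source,bash]
----
# Hypothetical helper: succeed (exit status 0) if the given sysctl value
# file contains "1", meaning IPv6 is disabled for that scope.
ipv6_disabled() {
    value_file="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
    [ "$(cat "$value_file")" = "1" ]
}
----

For example, `ipv6_disabled && echo "IPv6 is disabled"` prints the message
only when the kernel reports the setting as active.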

This method is preferred to disabling the loading of the IPv6 module on the
https://www.kernel.org/doc/Documentation/networking/ipv6.rst[kernel command line].

////
TODO: explain IPv6 support?
TODO: explain OVS
////