[[sysadmin_network_configuration]]
Network Configuration
---------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

{pve} uses the Linux network stack. This provides a lot of flexibility in how
to set up the network on the {pve} nodes. The configuration can be done either
via the GUI, or by manually editing the file `/etc/network/interfaces`, which
contains the whole network configuration. The `interfaces(5)` manual page
contains the complete format description. All {pve} tools try hard to preserve
direct user modifications, but using the GUI is still preferable, because it
protects you from errors.

A 'vmbr' interface is needed to connect guests to the underlying physical
network. It is a Linux bridge, which can be thought of as a virtual switch
to which the guests and physical interfaces are connected. This section
provides some examples on how the network can be set up to accommodate
different use cases, like redundancy with a xref:sysadmin_network_bond['bond'],
xref:sysadmin_network_vlan['vlans'] or
xref:sysadmin_network_routed['routed'] and
xref:sysadmin_network_masquerading['NAT'] setups.

The xref:chapter_pvesdn[Software Defined Network] is an option for more complex
virtual networks in {pve} clusters.

WARNING: It's discouraged to use the traditional Debian tools `ifup` and
`ifdown` if you are unsure, as they have pitfalls like interrupting all guest
traffic on `ifdown vmbrX`, but not reconnecting those guests again when doing
`ifup` on the same bridge later.

Apply Network Changes
~~~~~~~~~~~~~~~~~~~~~

{pve} does not write changes directly to `/etc/network/interfaces`. Instead,
we write into a temporary file called `/etc/network/interfaces.new`; this way
you can do many related changes at once. This also allows you to ensure your
changes are correct before applying them, as a wrong network configuration may
render a node inaccessible.
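
For example, you can review the staged changes before applying them (via the
methods described in the following subsections) by diffing the pending file
against the live configuration:

----
diff -u /etc/network/interfaces /etc/network/interfaces.new
----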

Reboot Node to apply
^^^^^^^^^^^^^^^^^^^^

One way to apply a new network configuration is to reboot the node.

Reload Network with ifupdown2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

With the 'ifupdown2' package (default since {pve} 7), it is possible to apply
network configuration changes without a reboot. If you change the network
configuration via the GUI, you can click the 'Apply Configuration' button. Run
the following command if you make changes directly to the
`/etc/network/interfaces` file:

----
ifreload -a
----

NOTE: If you installed {pve} on top of Debian, make sure 'ifupdown2' is
installed: 'apt install ifupdown2'

Naming Conventions
~~~~~~~~~~~~~~~~~~

We currently use the following naming conventions for device names:

* Ethernet devices: en*, systemd network interface names. This naming scheme is
used for new {pve} installations since version 5.0.

* Ethernet devices: eth[N], where 0 ≤ N (`eth0`, `eth1`, ...) This naming
scheme is used for {pve} hosts which were installed before the 5.0
release. When upgrading to 5.0, the names are kept as-is.

* Bridge names: vmbr[N], where 0 ≤ N ≤ 4094 (`vmbr0` - `vmbr4094`)

* Bonds: bond[N], where 0 ≤ N (`bond0`, `bond1`, ...)

* VLANs: Simply add the VLAN number to the device name,
separated by a period (`eno1.50`, `bond1.30`)

This makes it easier to debug network problems, because the device
name implies the device type.
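
To see which device names are currently in use on a node, you can list all
network interfaces in brief form, for example:

----
ip -br link
----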

Systemd Network Interface Names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Systemd uses the two-character prefix 'en' for Ethernet network
devices. The next characters depend on the device driver and on which
schema matches first.

* o<index>[n<phys_port_name>|d<dev_port>] — devices on board

* s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — device by hotplug id

* [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — devices by bus id

* x<MAC> — device by MAC address

The most common patterns are:

* eno1 — the first on-board NIC

* enp3s0f1 — the NIC on PCI bus 3, slot 0, using NIC function 1

For more information see https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/[Predictable Network Interface Names].
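
If you need a name that stays stable across hardware or driver changes, one
option is to pin it to the NIC's MAC address with a systemd link file. The
following is a sketch; the file name, the MAC address, and the interface name
`enwan0` are placeholders you need to adapt:

----
# /etc/systemd/network/10-enwan0.link
[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=enwan0
----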

Choosing a network configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Depending on your current network organization and your resources, you can
choose either a bridged, routed, or masquerading networking setup.

{pve} server in a private LAN, using an external gateway to reach the internet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The *Bridged* model makes the most sense in this case, and this is also
the default mode on new {pve} installations.
Each of your guest systems will have a virtual interface attached to the
{pve} bridge. This is similar in effect to having the guest network card
directly connected to a new switch on your LAN, with the {pve} host playing
the role of the switch.

{pve} server at hosting provider, with public IP ranges for Guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For this setup, you can use either a *Bridged* or *Routed* model, depending on
what your provider allows.

{pve} server at hosting provider, with a single public IP address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In that case, the only way to get outgoing network access for your guest
systems is to use *Masquerading*. For incoming network access to your guests,
you will need to configure *Port Forwarding*.
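
A port forward is typically implemented as a DNAT rule. As a sketch, assuming
a masquerading setup like the one shown later (guests on `10.10.10.0/24`
behind `vmbr0`, uplink `eno1`), forwarding TCP port 2222 on the host to SSH on
a guest at `10.10.10.10` could look like this:

----
iptables -t nat -A PREROUTING -i eno1 -p tcp --dport 2222 -j DNAT --to-destination 10.10.10.10:22
----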

For further flexibility, you can configure
VLANs (IEEE 802.1q) and network bonding, also known as "link
aggregation". That way it is possible to build complex and flexible
virtual networks.

Default Configuration using a Bridge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[thumbnail="default-network-setup-bridge.svg"]
Bridges are like physical network switches implemented in software.
All virtual guests can share a single bridge, or you can create multiple
bridges to separate network domains. Each host can have up to 4094 bridges.

The installation program creates a single bridge named `vmbr0`, which
is connected to the first Ethernet card. The corresponding
configuration in `/etc/network/interfaces` might look like this:

----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.2/24
        gateway 192.168.10.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

Virtual machines behave as if they were directly connected to the
physical network. The network, in turn, sees each virtual machine as
having its own MAC, even though there is only one network cable
connecting all of these VMs to the network.

[[sysadmin_network_routed]]
Routed Configuration
~~~~~~~~~~~~~~~~~~~~

Most hosting providers do not support the above setup. For security
reasons, they disable networking as soon as they detect multiple MAC
addresses on a single interface.

TIP: Some providers allow you to register additional MACs through their
management interface. This avoids the problem, but can be clumsy to
configure because you need to register a MAC for each of your VMs.

You can avoid the problem by ``routing'' all traffic via a single
interface. This makes sure that all network packets use the same MAC
address.

[thumbnail="default-network-setup-routed.svg"]
A common scenario is that you have a public IP (assume `198.51.100.5`
for this example), and an additional IP block for your VMs
(`203.0.113.16/28`). We recommend the following setup for such
situations:

----
auto lo
iface lo inet loopback

auto eno0
iface eno0 inet static
        address 198.51.100.5/29
        gateway 198.51.100.1
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up echo 1 > /proc/sys/net/ipv4/conf/eno0/proxy_arp

auto vmbr0
iface vmbr0 inet static
        address 203.0.113.17/28
        bridge-ports none
        bridge-stp off
        bridge-fd 0
----


[[sysadmin_network_masquerading]]
Masquerading (NAT) with `iptables`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Masquerading allows guests that only have a private IP address to access the
network by using the host IP address for outgoing traffic. Each outgoing
packet is rewritten by `iptables` to appear as originating from the host,
and responses are rewritten accordingly to be routed to the original sender.

----
auto lo
iface lo inet loopback

auto eno1
#real IP address
iface eno1 inet static
        address 198.51.100.5/24
        gateway 198.51.100.1

auto vmbr0
#private sub network
iface vmbr0 inet static
        address 10.10.10.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0

        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
----

NOTE: In some masquerade setups with the firewall enabled, conntrack zones
might be needed for outgoing connections. Otherwise, the firewall could block
outgoing connections, since they will prefer the `POSTROUTING` of the VM
bridge (and not `MASQUERADE`).

Adding these lines to `/etc/network/interfaces` can fix this problem:

----
post-up   iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
----

For more information about this, refer to the following links:

https://commons.wikimedia.org/wiki/File:Netfilter-packet-flow.svg[Netfilter Packet Flow]

https://lwn.net/Articles/370152/[Patch on netdev-list introducing conntrack zones]

https://blog.lobraun.de/2019/05/19/prox/[Blog post with a good explanation by using TRACE in the raw table]


[[sysadmin_network_bond]]
Linux Bond
~~~~~~~~~~

Bonding (also called NIC teaming or Link Aggregation) is a technique
for binding multiple NICs to a single network device. It makes it
possible to achieve different goals, like making the network
fault-tolerant, increasing performance, or both together.

High-speed hardware like Fibre Channel and the associated switching
hardware can be quite expensive. By doing link aggregation, two NICs
can appear as one logical interface, resulting in double speed. This
is a native Linux kernel feature that is supported by most
switches. If your nodes have multiple Ethernet ports, you can
distribute your points of failure by running network cables to
different switches, and the bonded connection will fail over to one
cable or the other in case of network trouble.

Aggregated links can reduce live-migration delays and improve the
speed of data replication between Proxmox VE cluster nodes.

There are 7 modes for bonding:

* *Round-robin (balance-rr):* Transmit network packets in sequential
order from the first available network interface (NIC) slave through
the last. This mode provides load balancing and fault tolerance.

* *Active-backup (active-backup):* Only one NIC slave in the bond is
active. A different slave becomes active if, and only if, the active
slave fails. The single logical bonded interface's MAC address is
externally visible on only one NIC (port) to avoid distortion in the
network switch. This mode provides fault tolerance.

* *XOR (balance-xor):* Transmit network packets based on [(source MAC
address XOR'd with destination MAC address) modulo NIC slave
count]. This selects the same NIC slave for each destination MAC
address. This mode provides load balancing and fault tolerance.

* *Broadcast (broadcast):* Transmit network packets on all slave
network interfaces. This mode provides fault tolerance.

* *IEEE 802.3ad Dynamic link aggregation (802.3ad, LACP):* Creates
aggregation groups that share the same speed and duplex
settings. Utilizes all slave network interfaces in the active
aggregator group according to the 802.3ad specification.

* *Adaptive transmit load balancing (balance-tlb):* Linux bonding
driver mode that does not require any special network-switch
support. The outgoing network packet traffic is distributed according
to the current load (computed relative to the speed) on each network
interface slave. Incoming traffic is received by one currently
designated slave network interface. If this receiving slave fails,
another slave takes over the MAC address of the failed receiving
slave.

* *Adaptive load balancing (balance-alb):* Includes balance-tlb plus receive
load balancing (rlb) for IPv4 traffic, and does not require any
special network switch support. The receive load balancing is achieved
by ARP negotiation. The bonding driver intercepts the ARP Replies sent
by the local system on their way out and overwrites the source
hardware address with the unique hardware address of one of the NIC
slaves in the single logical bonded interface, such that different
network peers use different MAC addresses for their network packet
traffic.

If your switch supports the LACP (IEEE 802.3ad) protocol, then we recommend
using the corresponding bonding mode (802.3ad). Otherwise you should generally
use the active-backup mode. +
// http://lists.linux-ha.org/pipermail/linux-ha/2013-January/046295.html
If you intend to run your cluster network on the bonding interfaces, then you
have to use active-backup mode on the bonding interfaces; other modes are
unsupported.

The following bond configuration can be used as a distributed/shared
storage network. The benefit is that you get more speed and the
network will be fault-tolerant.

.Example: Use bond with fixed IP address
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

auto bond0
iface bond0 inet static
        bond-slaves eno1 eno2
        address 192.168.1.2/24
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno3
        bridge-stp off
        bridge-fd 0
----


[thumbnail="default-network-setup-bond.svg"]
Another possibility is to use the bond directly as the bridge port.
This can be used to make the guest network fault-tolerant.

.Example: Use a bond as bridge port
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
----
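
To verify that a bond actually came up in the intended mode with all of its
slaves, you can inspect the kernel's bonding status file at runtime, for
example:

----
cat /proc/net/bonding/bond0
----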


[[sysadmin_network_vlan]]
VLAN 802.1Q
~~~~~~~~~~~

A virtual LAN (VLAN) is a broadcast domain that is partitioned and
isolated in the network at layer two. So it is possible to have
multiple networks (up to 4096) in a physical network, each independent
of the others.

Each VLAN network is identified by a number, often called a 'tag'.
Network packets are then 'tagged' to identify which virtual network
they belong to.


VLAN for Guest Networks
^^^^^^^^^^^^^^^^^^^^^^^

{pve} supports this setup out of the box. You can specify the VLAN tag
when you create a VM; the VLAN tag is part of the guest network
configuration (see the example after this list). The networking layer
supports different modes to implement VLANs, depending on the bridge
configuration:

* *VLAN awareness on the Linux bridge:*
In this case, each guest's virtual network card is assigned to a VLAN tag,
which is transparently supported by the Linux bridge.
Trunk mode is also possible, but that makes configuration
in the guest necessary.

* *"traditional" VLAN on the Linux bridge:*
In contrast to the VLAN awareness method, this method is not transparent
and creates a VLAN device with an associated bridge for each VLAN.
That is, creating a guest on VLAN 5, for example, would create two
interfaces eno1.5 and vmbr0v5, which would remain until a reboot occurs.

* *Open vSwitch VLAN:*
This mode uses the OVS VLAN feature.

* *Guest configured VLAN:*
VLANs are assigned inside the guest. In this case, the setup is
completely done inside the guest and cannot be influenced from the
outside. The benefit is that you can use more than one VLAN on a
single virtual NIC.
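
For example, with a VLAN-aware bridge, the tag is simply set on the guest's
virtual NIC. For a VM with the hypothetical ID 100, assigning its first
network device to VLAN 5 on `vmbr0` could look like this:

----
qm set 100 --net0 virtio,bridge=vmbr0,tag=5
----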


VLAN on the Host
^^^^^^^^^^^^^^^^

To allow host communication with an isolated network, it is possible
to apply VLAN tags to any network device (NIC, bond, bridge). In
general, you should configure the VLAN on the interface with the least
abstraction layers between itself and the physical NIC.

For example, in a default configuration, you may want to place
the host management address on a separate VLAN.


.Example: Use VLAN 5 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno1.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports eno1.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
----

.Example: Use VLAN 5 for the {pve} management IP with VLAN aware Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0.5
iface vmbr0.5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
----

The next example is the same setup, but it uses a bond to
make this network fail-safe.

.Example: Use VLAN 5 with bond0 for the {pve} management IP with traditional Linux bridge
----
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3

iface bond0.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        address 10.10.10.2/24
        gateway 10.10.10.1
        bridge-ports bond0.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
----

Disabling IPv6 on the Node
~~~~~~~~~~~~~~~~~~~~~~~~~~

{pve} works correctly in all environments, irrespective of whether IPv6 is
deployed or not. We recommend leaving all settings at the provided defaults.

Should you still need to disable support for IPv6 on your node, do so by
creating an appropriate `sysctl.conf(5)` snippet file and setting the proper
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt[sysctls],
for example adding `/etc/sysctl.d/disable-ipv6.conf` with content:

----
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
----
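
The settings take effect at the next boot; to apply them immediately, you can,
for example, load the snippet with `sysctl -p /etc/sysctl.d/disable-ipv6.conf`.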

This method is preferred to disabling the loading of the IPv6 module on the
https://www.kernel.org/doc/Documentation/networking/ipv6.rst[kernel command line].

////
TODO: explain IPv6 support?
TODO: explain OVS
////