1 [[chapter_pvesdn]]
2 Software Defined Network
3 ========================
4 ifndef::manvolnum[]
5 :pve-toplevel:
6 endif::manvolnum[]
7
8 The **S**oftware **D**efined **N**etwork (SDN) feature allows you to create
9 virtual networks (VNets) at the datacenter level.
10
11 WARNING: SDN is currently an **experimental feature** in {pve}. This
12 documentation for it is also still under development. Ask on our
13 xref:getting_help[mailing lists or in the forum] for questions and feedback.
14
15
16 [[pvesdn_installation]]
17 Installation
18 ------------
19
20 To enable the experimental Software Defined Network (SDN) integration, you need
21 to install the `libpve-network-perl` and `ifupdown2` packages on every node:
22
23 ----
24 apt update
25 apt install libpve-network-perl ifupdown2
26 ----
27
NOTE: {pve} version 7 and above have ifupdown2 installed by default.
29
30 After this, you need to add the following line to the end of the
31 `/etc/network/interfaces` configuration file, so that the SDN configuration gets
32 included and activated.
33
34 ----
35 source /etc/network/interfaces.d/*
36 ----
37
38
39 Basic Overview
40 --------------
41
42 The {pve} SDN allows for separation and fine-grained control of virtual guest
43 networks, using flexible, software-controlled configurations.
44
Separation is managed through zones, where a zone is its own virtually separated
network area. A 'VNet' is a virtual network that belongs to a zone. Depending on
which type or plugin the zone uses, it can behave differently and offer
different features, advantages, and disadvantages. Normally, a 'VNet' appears as
a common Linux bridge with either a VLAN or 'VXLAN' tag; however, some can also
use layer 3 routing for control. 'VNets' are deployed locally on each node,
after being configured from the cluster-wide datacenter SDN administration
interface.
53
54
55 Main Configuration
56 ~~~~~~~~~~~~~~~~~~
57
58 Configuration is done at the datacenter (cluster-wide) level and is saved in
59 files located in the shared configuration file system:
60 `/etc/pve/sdn`
61
62 On the web-interface, SDN features 3 main sections:
63
64 * SDN: An overview of the SDN state
65
66 * Zones: Create and manage the virtually separated network zones
67
68 * VNets: Create virtual network bridges and manage subnets
69
70 In addition to this, the following options are offered:
71
72 * Controller: For controlling layer 3 routing in complex setups
73
* Subnets: Used to define IP networks on VNets
75
76 * IPAM: Enables the use of external tools for IP address management (guest
77 IPs)
78
* DNS: Define a DNS server API for registering virtual guests' hostnames and IP
addresses
81
[[pvesdn_config_main_sdn]]
84 SDN
85 ~~~
86
87 This is the main status panel. Here you can see the deployment status of zones
88 on different nodes.
89
90 The 'Apply' button is used to push and reload local configuration on all cluster
91 nodes.
92
93
94 [[pvesdn_local_deployment_monitoring]]
95 Local Deployment Monitoring
96 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
97
98 After applying the configuration through the main SDN panel,
99 the local network configuration is generated locally on each node in
100 the file `/etc/network/interfaces.d/sdn`, and reloaded with ifupdown2.
101
102 You can monitor the status of local zones and VNets through the main tree.
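
For troubleshooting, it can also help to look at the generated file directly on
a node and, if needed, trigger a reload by hand. A minimal sketch, using the
file path mentioned above and the `ifreload` command shipped with ifupdown2:

----
# show the network configuration generated by the SDN module
cat /etc/network/interfaces.d/sdn

# re-apply the local network configuration
ifreload -a
----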
103
104
105 [[pvesdn_config_zone]]
106 Zones
107 -----
108
109 A zone defines a virtually separated network. Zones can be restricted to
110 specific nodes and assigned permissions, in order to restrict users to a certain
111 zone and its contained VNets.
112
113 Different technologies can be used for separation:
114
115 * VLAN: Virtual LANs are the classic method of subdividing a LAN
116
117 * QinQ: Stacked VLAN (formally known as `IEEE 802.1ad`)
118
119 * VXLAN: Layer2 VXLAN
120
121 * Simple: Isolated Bridge. A simple layer 3 routing bridge (NAT)
122
123 * EVPN (BGP EVPN): VXLAN using layer 3 border gateway protocol (BGP) routing
124
125 Common options
126 ~~~~~~~~~~~~~~
127
128 The following options are available for all zone types:
129
nodes:: The nodes on which the zone and associated VNets should be deployed
131
132 ipam:: Optional. Use an IP Address Management (IPAM) tool to manage IPs in the
133 zone.
134
135 dns:: Optional. DNS API server.
136
137 reversedns:: Optional. Reverse DNS API server.
138
139 dnszone:: Optional. DNS domain name. Used to register hostnames, such as
140 `<hostname>.<domain>`. The DNS zone must already exist on the DNS server.
141
142
143 [[pvesdn_zone_plugin_simple]]
144 Simple Zones
145 ~~~~~~~~~~~~
146
147 This is the simplest plugin. It will create an isolated VNet bridge.
This bridge is not linked to a physical interface, and VM traffic is only
local to the node(s).
150 It can also be used in NAT or routed setups.
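
Besides the web-interface, zones can also be managed through the cluster API,
for example with `pvesh`. The following is only a sketch: the option names
mirror the web-interface fields and may differ between versions, so check the
API viewer for your installation.

----
# create an isolated simple zone, restricted to two (assumed) nodes
pvesh create /cluster/sdn/zones --zone simple1 --type simple --nodes node1,node2

# apply the pending SDN configuration, like the 'Apply' button in the GUI
pvesh set /cluster/sdn
----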
151
152 [[pvesdn_zone_plugin_vlan]]
153 VLAN Zones
154 ~~~~~~~~~~
155
156 This plugin reuses an existing local Linux or OVS bridge, and manages the VLANs
157 on it. The benefit of using the SDN module is that you can create different
158 zones with specific VNet VLAN tags, and restrict virtual machines to separated
159 zones.
160
161 Specific `VLAN` configuration options:
162
163 bridge:: Reuse this local bridge or OVS switch, already configured on *each*
164 local node.
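
As a hedged example, a VLAN zone backed by an existing bridge `vmbr0` could be
created through the API as follows (option names are illustrative and may vary
by version):

----
pvesh create /cluster/sdn/zones --zone myvlanzone --type vlan --bridge vmbr0
----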
165
166 [[pvesdn_zone_plugin_qinq]]
167 QinQ Zones
168 ~~~~~~~~~~
169
QinQ, also known as VLAN stacking, is a technique wherein the first VLAN tag is
defined for the zone (the 'service-vlan'), and the second VLAN tag is defined
for the VNets.
173
174 NOTE: Your physical network switches must support stacked VLANs for this
175 configuration!
176
177 Below are the configuration options specific to QinQ:
178
179 bridge:: A local, VLAN-aware bridge that is already configured on each local
180 node
181
182 service vlan:: The main VLAN tag of this zone
183
184 service vlan protocol:: Allows you to choose between an 802.1q (default) or
185 802.1ad service VLAN type.
186
mtu:: Due to the double stacking of tags, you need 4 more bytes for QinQ VLANs.
For example, you must reduce the MTU to `1496` if your physical interface MTU is
`1500`.
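
Putting the options above together, a QinQ zone with service VLAN 20 on a
1500-byte uplink could be created like this (a sketch with illustrative option
names; verify them against your version):

----
# 1500 bytes physical MTU - 4 bytes for the additional 802.1ad tag = 1496
pvesh create /cluster/sdn/zones --zone qinqzone1 --type qinq --bridge vmbr0 \
  --tag 20 --mtu 1496
----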
190
191 [[pvesdn_zone_plugin_vxlan]]
192 VXLAN Zones
193 ~~~~~~~~~~~
194
195 The VXLAN plugin establishes a tunnel (overlay) on top of an existing
196 network (underlay). This encapsulates layer 2 Ethernet frames within layer
197 4 UDP datagrams, using `4789` as the default destination port. You can, for
198 example, create a private IPv4 VXLAN network on top of public internet network
199 nodes.
200
201 This is a layer 2 tunnel only, so no routing between different VNets is
202 possible.
203
204 Each VNet will have a specific VXLAN ID in the range 1 - 16777215.
205
Specific VXLAN configuration options:

peers address list:: A list of IP addresses of all nodes through which you want
to communicate. These can also be external nodes.
210
mtu:: Because VXLAN encapsulation uses 50 bytes, the MTU needs to be 50 bytes
lower than the MTU of the outgoing physical interface.
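
A corresponding API call could look like the following sketch (option names
mirror the web-interface fields and may differ between versions):

----
# VXLAN zone over three peers; 1500 - 50 bytes VXLAN overhead = 1450
pvesh create /cluster/sdn/zones --zone myvxlanzone --type vxlan \
  --peers 192.168.0.1,192.168.0.2,192.168.0.3 --mtu 1450
----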
213
214 [[pvesdn_zone_plugin_evpn]]
215 EVPN Zones
216 ~~~~~~~~~~
217
218 This is the most complex of all the supported plugins.
219
BGP-EVPN allows you to create a routable layer 3 network. An EVPN VNet can have
an anycast IP address and/or MAC address. The bridge IP is the same on each
node, meaning a virtual guest can use this address as its gateway.
223
224 Routing can work across VNets from different zones through a VRF (Virtual
225 Routing and Forwarding) interface.
226
227 The configuration options specific to EVPN are as follows:
228
VRF VXLAN tag:: This is a VXLAN-ID used for routing interconnect between VNets.
It must be different from the VXLAN-IDs of the VNets.
231
controller:: An EVPN-controller must be defined first (see the controller
plugins section).
234
235 VNet MAC address:: A unique, anycast MAC address for all VNets in this zone.
236 Will be auto-generated if not defined.
237
238 Exit Nodes:: Optional. This is used if you want to define some {pve} nodes as
239 exit gateways from the EVPN network, through the real network. The configured
240 nodes will announce a default route in the EVPN network.
241
242 Primary Exit Node:: Optional. If you use multiple exit nodes, this forces
243 traffic to a primary exit node, instead of load-balancing on all nodes. This
244 is required if you want to use SNAT or if your upstream router doesn't support
245 ECMP.
246
Exit Nodes local routing:: Optional. This is a special option if you need to
reach a VM/CT service from an exit node. (By default, the exit nodes only
allow forwarding traffic between the real network and the EVPN network.)
250
Advertise Subnets:: Optional. Announce the full subnet in the EVPN network. Use
this if you have silent VMs/CTs (for example, if a guest has multiple IP
addresses and the anycast gateway doesn't see traffic from some of them, those
addresses won't be reachable inside the EVPN network).
255
Disable Arp-Nd Suppression:: Optional. Don't suppress ARP or ND packets.
This is required if you use floating IPs in your guest VMs
(IP and MAC addresses are moved between systems).
259
260 Route-target import:: Optional. Allows you to import a list of external EVPN
261 route targets. Used for cross-DC or different EVPN network interconnects.
262
263 MTU:: Because VXLAN encapsulation uses 50 bytes, the MTU needs to be 50 bytes
264 less than the maximal MTU of the outgoing physical interface.
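
As with the other zone types, an EVPN zone can also be created via the API. A
sketch, assuming an EVPN controller named `myevpnctl` already exists and using
illustrative option names:

----
pvesh create /cluster/sdn/zones --zone myevpnzone --type evpn \
  --controller myevpnctl --vrf-vxlan 10000 --mtu 1450 --exitnodes node1,node2
----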
265
266
267 [[pvesdn_config_vnet]]
268 VNets
269 -----
270
271 A `VNet` is, in its basic form, a Linux bridge that will be deployed locally on
272 the node and used for virtual machine communication.
273
274 The VNet configuration properties are:
275
ID:: An 8-character ID to name and identify a VNet
277
278 Alias:: Optional longer name, if the ID isn't enough
279
280 Zone:: The associated zone for this VNet
281
282 Tag:: The unique VLAN or VXLAN ID
283
284 VLAN Aware:: Enable adding an extra VLAN tag in the virtual machine or
285 container's vNIC configuration, to allow the guest OS to manage the VLAN's tag.
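
A VNet can likewise be created through the API. The following sketch uses
illustrative option names, matching the properties listed above:

----
# VLAN-aware VNet with tag 10 in the zone 'myvlanzone'
pvesh create /cluster/sdn/vnets --vnet myvnet1 --zone myvlanzone --tag 10 \
  --vlanaware 1
----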
286
287 [[pvesdn_config_subnet]]
288 Subnets
289 ~~~~~~~~
290
291 A subnetwork (subnet) allows you to define a specific IP network
292 (IPv4 or IPv6). For each VNet, you can define one or more subnets.
293
294 A subnet can be used to:
295
296 * Restrict the IP addresses you can define on a specific VNet
297 * Assign routes/gateways on a VNet in layer 3 zones
298 * Enable SNAT on a VNet in layer 3 zones
299 * Auto assign IPs on virtual guests (VM or CT) through IPAM plugins
300 * DNS registration through DNS plugins
301
302 If an IPAM server is associated with the subnet zone, the subnet prefix will be
303 automatically registered in the IPAM.
304
305 Subnet properties are:
306
307 ID:: A CIDR network address, for example 10.0.0.0/8
308
309 Gateway:: The IP address of the network's default gateway. On layer 3 zones
310 (Simple/EVPN plugins), it will be deployed on the VNet.
311
SNAT:: Optional. Enable SNAT for layer 3 zones (Simple/EVPN plugins) for this
subnet. The subnet's source IPs will be NATted to the server's outgoing
interface/IP. On EVPN zones, this is only done on EVPN gateway-nodes.
315
316 Dnszoneprefix:: Optional. Add a prefix to the domain registration, like
317 <hostname>.prefix.<domain>
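
Since subnets belong to a VNet, they are created underneath it in the API. A
sketch with illustrative option names:

----
pvesh create /cluster/sdn/vnets/myvnet1/subnets --subnet 10.0.1.0/24 \
  --type subnet --gateway 10.0.1.1 --snat 1
----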
318
319 [[pvesdn_config_controllers]]
320 Controllers
321 -----------
322
323 Some zone types need an external controller to manage the VNet control-plane.
324 Currently this is only required for the `bgp-evpn` zone plugin.
325
326 [[pvesdn_controller_plugin_evpn]]
327 EVPN Controller
328 ~~~~~~~~~~~~~~~
329
330 For `BGP-EVPN`, we need a controller to manage the control plane.
331 The currently supported software controller is the "frr" router.
332 You may need to install it on each node where you want to deploy EVPN zones.
333
334 ----
335 apt install frr frr-pythontools
336 ----
337
338 Configuration options:
339
340 asn:: A unique BGP ASN number. It's highly recommended to use a private ASN
341 number (64512 – 65534, 4200000000 – 4294967294), as otherwise you could end up
342 breaking global routing by mistake.
343
peers:: A list of the IP addresses of all nodes with which you want to
communicate for EVPN (these could also be external nodes or route reflector
servers)
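
For example, an EVPN controller for a three-node cluster could be created as
follows (a sketch; option names may differ between versions):

----
pvesh create /cluster/sdn/controllers --controller myevpnctl --type evpn \
  --asn 65000 --peers 192.168.0.1,192.168.0.2,192.168.0.3
----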
346
347
348 [[pvesdn_controller_plugin_BGP]]
349 BGP Controller
350 ~~~~~~~~~~~~~~~
351
352 The BGP controller is not used directly by a zone.
353 You can use it to configure FRR to manage BGP peers.
354
For BGP-EVPN, it can be used to define a different ASN per node, thus using
EBGP. It can also be used to export EVPN routes to an external BGP peer.

NOTE: By default, for a simple full-mesh EVPN setup, you don't need to define an
extra BGP controller.
360
361 Configuration options:
362
363 node:: The node of this BGP controller
364
365 asn:: A unique BGP ASN number. It's highly recommended to use a private ASN
366 number in the range (64512 - 65534) or (4200000000 - 4294967294), as otherwise
367 you could break global routing by mistake.
368
369 peers:: A list of peer IP addresses you want to communicate with using the
370 underlying BGP network.
371
372 ebgp:: If your peer's remote-AS is different, this enables EBGP.
373
374 loopback:: Use a loopback or dummy interface as the source of the EVPN network
375 (for multipath).
376
ebgp-multihop:: Increase the number of hops to reach peers, in case they are
not directly connected or they use loopback.
379
bgp-multipath-as-path-relax:: Allow ECMP if your peers have different ASNs.
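
Since the controllers only generate FRR configuration, you can inspect the
resulting BGP sessions directly with FRR's `vtysh` on the node:

----
# show the state of the BGP sessions configured by the controller
vtysh -c "show bgp summary"
----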
381
382 [[pvesdn_config_ipam]]
383 IPAMs
384 -----
385
IPAM (IP Address Management) tools are used to manage/assign the IP addresses
of guests on the network. They can be used, for example, to find free IP
addresses when you create a VM/CT (not yet implemented).
389
390 An IPAM can be associated with one or more zones, to provide IP addresses
391 for all subnets defined in those zones.
392
393 [[pvesdn_ipam_plugin_pveipam]]
394 {pve} IPAM Plugin
395 ~~~~~~~~~~~~~~~~~
396
397 This is the default internal IPAM for your {pve} cluster, if you don't have
398 external IPAM software.
399
400 [[pvesdn_ipam_plugin_phpipam]]
401 phpIPAM Plugin
402 ~~~~~~~~~~~~~~
403 https://phpipam.net/
404
405 You need to create an application in phpIPAM and add an API token with admin
406 privileges.
407
408 The phpIPAM configuration properties are:
409
410 url:: The REST-API endpoint: `http://phpipam.domain.com/api/<appname>/`
411
412 token:: An API access token
413
414 section:: An integer ID. Sections are a group of subnets in phpIPAM. Default
415 installations use `sectionid=1` for customers.
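
Before adding the token to {pve}, you can check it with a plain HTTP request.
The `token` header and the `sections` endpoint are part of the phpIPAM REST
API, but verify the exact form against your phpIPAM version:

----
curl -H "token: <yourapitoken>" http://phpipam.domain.com/api/<appname>/sections/
----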
416
417 [[pvesdn_ipam_plugin_netbox]]
418 NetBox IPAM Plugin
419 ~~~~~~~~~~~~~~~~~~
420
421 NetBox is an IP address management (IPAM) and datacenter infrastructure
422 management (DCIM) tool. See the source code repository for details:
423 https://github.com/netbox-community/netbox
424
425 You need to create an API token in NetBox to use it:
426 https://docs.netbox.dev/en/stable/integrations/rest-api/#tokens
427
428 The NetBox configuration properties are:
429
430 url:: The REST API endpoint: `http://yournetbox.domain.com/api`
431
432 token:: An API access token
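
To verify the token before configuring it in {pve}, you can query the NetBox
API directly; NetBox expects the token in an `Authorization` header:

----
curl -H "Authorization: Token <yourapitoken>" http://yournetbox.domain.com/api/ipam/prefixes/
----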
433
434 [[pvesdn_config_dns]]
435 DNS
436 ---
437
438 The DNS plugin in {pve} SDN is used to define a DNS API server for registration
439 of your hostname and IP address. A DNS configuration is associated with one or
440 more zones, to provide DNS registration for all the subnet IPs configured for
441 a zone.
442
443 [[pvesdn_dns_plugin_powerdns]]
444 PowerDNS Plugin
445 ~~~~~~~~~~~~~~~
446 https://doc.powerdns.com/authoritative/http-api/index.html
447
448 You need to enable the web server and the API in your PowerDNS config:
449
450 ----
451 api=yes
452 api-key=arandomgeneratedstring
453 webserver=yes
454 webserver-port=8081
455 ----
456
457 The PowerDNS configuration options are:
458
url:: The REST API endpoint: `http://yourpowerdnserver.domain.com:8081/api/v1/servers/localhost`
460
461 key:: An API access key
462
463 ttl:: The default TTL for records
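
You can test the key and endpoint with a simple request; PowerDNS expects the
key in the `X-API-Key` header:

----
curl -H "X-API-Key: arandomgeneratedstring" \
  http://yourpowerdnserver.domain.com:8081/api/v1/servers/localhost
----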
464
465
466 Examples
467 --------
468
469 [[pvesdn_setup_example_vlan]]
470 VLAN Setup Example
471 ~~~~~~~~~~~~~~~~~~
472
473 TIP: While we show plaintext configuration content here, almost everything
474 should be configurable using the web-interface only.
475
476 Node1: /etc/network/interfaces
477
478 ----
479 auto vmbr0
480 iface vmbr0 inet manual
481 bridge-ports eno1
482 bridge-stp off
483 bridge-fd 0
484 bridge-vlan-aware yes
485 bridge-vids 2-4094
486
487 #management ip on vlan100
488 auto vmbr0.100
489 iface vmbr0.100 inet static
490 address 192.168.0.1/24
491
492 source /etc/network/interfaces.d/*
493 ----
494
495 Node2: /etc/network/interfaces
496
497 ----
498 auto vmbr0
499 iface vmbr0 inet manual
500 bridge-ports eno1
501 bridge-stp off
502 bridge-fd 0
503 bridge-vlan-aware yes
504 bridge-vids 2-4094
505
506 #management ip on vlan100
507 auto vmbr0.100
508 iface vmbr0.100 inet static
509 address 192.168.0.2/24
510
511 source /etc/network/interfaces.d/*
512 ----
513
514 Create a VLAN zone named `myvlanzone':
515
516 ----
517 id: myvlanzone
518 bridge: vmbr0
519 ----
520
521 Create a VNet named `myvnet1' with `vlan-id` `10' and the previously created
522 `myvlanzone' as its zone.
523
524 ----
525 id: myvnet1
526 zone: myvlanzone
527 tag: 10
528 ----
529
530 Apply the configuration through the main SDN panel, to create VNets locally on
531 each node.
532
533 Create a Debian-based virtual machine (vm1) on node1, with a vNIC on `myvnet1'.
534
535 Use the following network configuration for this VM:
536
537 ----
538 auto eth0
539 iface eth0 inet static
540 address 10.0.3.100/24
541 ----
542
543 Create a second virtual machine (vm2) on node2, with a vNIC on the same VNet
544 `myvnet1' as vm1.
545
546 Use the following network configuration for this VM:
547
548 ----
549 auto eth0
550 iface eth0 inet static
551 address 10.0.3.101/24
552 ----
553
554 Following this, you should be able to ping between both VMs over that network.
555
556
557 [[pvesdn_setup_example_qinq]]
558 QinQ Setup Example
559 ~~~~~~~~~~~~~~~~~~
560
561 TIP: While we show plaintext configuration content here, almost everything
562 should be configurable using the web-interface only.
563
564 Node1: /etc/network/interfaces
565
566 ----
567 auto vmbr0
568 iface vmbr0 inet manual
569 bridge-ports eno1
570 bridge-stp off
571 bridge-fd 0
572 bridge-vlan-aware yes
573 bridge-vids 2-4094
574
575 #management ip on vlan100
576 auto vmbr0.100
577 iface vmbr0.100 inet static
578 address 192.168.0.1/24
579
580 source /etc/network/interfaces.d/*
581 ----
582
583 Node2: /etc/network/interfaces
584
585 ----
586 auto vmbr0
587 iface vmbr0 inet manual
588 bridge-ports eno1
589 bridge-stp off
590 bridge-fd 0
591 bridge-vlan-aware yes
592 bridge-vids 2-4094
593
594 #management ip on vlan100
595 auto vmbr0.100
596 iface vmbr0.100 inet static
597 address 192.168.0.2/24
598
599 source /etc/network/interfaces.d/*
600 ----
601
602 Create a QinQ zone named `qinqzone1' with service VLAN 20
603
604 ----
605 id: qinqzone1
606 bridge: vmbr0
607 service vlan: 20
608 ----
609
610 Create another QinQ zone named `qinqzone2' with service VLAN 30
611
612 ----
613 id: qinqzone2
614 bridge: vmbr0
615 service vlan: 30
616 ----
617
618 Create a VNet named `myvnet1' with customer VLAN-ID 100 on the previously
619 created `qinqzone1' zone.
620
621 ----
622 id: myvnet1
623 zone: qinqzone1
624 tag: 100
625 ----
626
Create a second VNet named `myvnet2' with customer VLAN-ID 100 on the
previously created `qinqzone2' zone.
629
630 ----
631 id: myvnet2
632 zone: qinqzone2
633 tag: 100
634 ----
635
Apply the configuration on the main SDN web-interface panel to create VNets
locally on each node.
638
639 Create a Debian-based virtual machine (vm1) on node1, with a vNIC on `myvnet1'.
640
641 Use the following network configuration for this VM:
642
643 ----
644 auto eth0
645 iface eth0 inet static
646 address 10.0.3.100/24
647 ----
648
649 Create a second virtual machine (vm2) on node2, with a vNIC on the same VNet
650 `myvnet1' as vm1.
651
652 Use the following network configuration for this VM:
653
654 ----
655 auto eth0
656 iface eth0 inet static
657 address 10.0.3.101/24
658 ----
659
660 Create a third virtual machine (vm3) on node1, with a vNIC on the other VNet
661 `myvnet2'.
662
663 Use the following network configuration for this VM:
664
665 ----
666 auto eth0
667 iface eth0 inet static
668 address 10.0.3.102/24
669 ----
670
671 Create another virtual machine (vm4) on node2, with a vNIC on the same VNet
672 `myvnet2' as vm3.
673
674 Use the following network configuration for this VM:
675
676 ----
677 auto eth0
678 iface eth0 inet static
679 address 10.0.3.103/24
680 ----
681
Then, you should be able to ping between the VMs 'vm1' and 'vm2', as well as
between 'vm3' and 'vm4'. However, neither 'vm1' nor 'vm2' can ping 'vm3' or
'vm4', as they are in a different zone with a different service-VLAN.
685
686
687 [[pvesdn_setup_example_vxlan]]
688 VXLAN Setup Example
689 ~~~~~~~~~~~~~~~~~~~
690
691 TIP: While we show plaintext configuration content here, almost everything
692 is configurable through the web-interface.
693
694 node1: /etc/network/interfaces
695
696 ----
697 auto vmbr0
698 iface vmbr0 inet static
699 address 192.168.0.1/24
700 gateway 192.168.0.254
701 bridge-ports eno1
702 bridge-stp off
703 bridge-fd 0
704 mtu 1500
705
706 source /etc/network/interfaces.d/*
707 ----
708
709 node2: /etc/network/interfaces
710
711 ----
712 auto vmbr0
713 iface vmbr0 inet static
714 address 192.168.0.2/24
715 gateway 192.168.0.254
716 bridge-ports eno1
717 bridge-stp off
718 bridge-fd 0
719 mtu 1500
720
721 source /etc/network/interfaces.d/*
722 ----
723
724 node3: /etc/network/interfaces
725
726 ----
727 auto vmbr0
728 iface vmbr0 inet static
729 address 192.168.0.3/24
730 gateway 192.168.0.254
731 bridge-ports eno1
732 bridge-stp off
733 bridge-fd 0
734 mtu 1500
735
736 source /etc/network/interfaces.d/*
737 ----
738
739 Create a VXLAN zone named `myvxlanzone', using a lower MTU to ensure the extra
740 50 bytes of the VXLAN header can fit. Add all previously configured IPs from
741 the nodes to the peer address list.
742
743 ----
744 id: myvxlanzone
745 peers address list: 192.168.0.1,192.168.0.2,192.168.0.3
746 mtu: 1450
747 ----
748
749 Create a VNet named `myvnet1' using the VXLAN zone `myvxlanzone' created
750 previously.
751
752 ----
753 id: myvnet1
754 zone: myvxlanzone
755 tag: 100000
756 ----
757
Apply the configuration on the main SDN web-interface panel to create VNets
locally on each node.
760
761 Create a Debian-based virtual machine (vm1) on node1, with a vNIC on `myvnet1'.
762
763 Use the following network configuration for this VM (note the lower MTU).
764
765 ----
766 auto eth0
767 iface eth0 inet static
768 address 10.0.3.100/24
769 mtu 1450
770 ----
771
772 Create a second virtual machine (vm2) on node3, with a vNIC on the same VNet
773 `myvnet1' as vm1.
774
775 Use the following network configuration for this VM:
776
777 ----
778 auto eth0
779 iface eth0 inet static
780 address 10.0.3.101/24
781 mtu 1450
782 ----
783
Then, you should be able to ping between 'vm1' and 'vm2'.
785
786
787 [[pvesdn_setup_example_evpn]]
788 EVPN Setup Example
789 ~~~~~~~~~~~~~~~~~~
790
791 node1: /etc/network/interfaces
792
793 ----
794 auto vmbr0
795 iface vmbr0 inet static
796 address 192.168.0.1/24
797 gateway 192.168.0.254
798 bridge-ports eno1
799 bridge-stp off
800 bridge-fd 0
801 mtu 1500
802
803 source /etc/network/interfaces.d/*
804 ----
805
806 node2: /etc/network/interfaces
807
808 ----
809 auto vmbr0
810 iface vmbr0 inet static
811 address 192.168.0.2/24
812 gateway 192.168.0.254
813 bridge-ports eno1
814 bridge-stp off
815 bridge-fd 0
816 mtu 1500
817
818 source /etc/network/interfaces.d/*
819 ----
820
821 node3: /etc/network/interfaces
822
823 ----
824 auto vmbr0
825 iface vmbr0 inet static
826 address 192.168.0.3/24
827 gateway 192.168.0.254
828 bridge-ports eno1
829 bridge-stp off
830 bridge-fd 0
831 mtu 1500
832
833 source /etc/network/interfaces.d/*
834 ----
835
836 Create an EVPN controller, using a private ASN number and the above node
837 addresses as peers.
838
839 ----
840 id: myevpnctl
841 asn: 65000
842 peers: 192.168.0.1,192.168.0.2,192.168.0.3
843 ----
844
845 Create an EVPN zone named `myevpnzone', using the previously created
846 EVPN-controller. Define 'node1' and 'node2' as exit nodes.
847
848 ----
849 id: myevpnzone
850 vrf vxlan tag: 10000
851 controller: myevpnctl
852 mtu: 1450
853 vnet mac address: 32:F4:05:FE:6C:0A
854 exitnodes: node1,node2
855 ----
856
Create the first VNet named `myvnet1' using the EVPN zone `myevpnzone'.

----
859 id: myvnet1
860 zone: myevpnzone
861 tag: 11000
862 ----
863
864 Create a subnet 10.0.1.0/24 with 10.0.1.1 as gateway on `myvnet1`.
865
866 ----
867 subnet: 10.0.1.0/24
868 gateway: 10.0.1.1
869 ----
870
Create the second VNet named `myvnet2' using the same EVPN zone `myevpnzone',
but with a different IPv4 CIDR network.
873
874 ----
875 id: myvnet2
876 zone: myevpnzone
877 tag: 12000
878 ----
879
Create a different subnet 10.0.2.0/24 with 10.0.2.1 as gateway on `myvnet2'.
881
882 ----
883 subnet: 10.0.2.0/24
884 gateway: 10.0.2.1
885 ----
886
887
888 Apply the configuration from the main SDN web-interface panel to create VNets
889 locally on each node and generate the FRR config.
890
891 Create a Debian-based virtual machine (vm1) on node1, with a vNIC on `myvnet1'.
892
893 Use the following network configuration for this VM:
894
895 ----
896 auto eth0
897 iface eth0 inet static
898 address 10.0.1.100/24
899 gateway 10.0.1.1 #this is the ip of the vnet1
900 mtu 1450
901 ----
902
903 Create a second virtual machine (vm2) on node2, with a vNIC on the other VNet
904 `myvnet2'.
905
906 Use the following network configuration for this VM:
907
908 ----
909 auto eth0
910 iface eth0 inet static
911 address 10.0.2.100/24
912 gateway 10.0.2.1 #this is the ip of the myvnet2
913 mtu 1450
914 ----
915
916
917 Then, you should be able to ping vm2 from vm1, and vm1 from vm2.
918
If you ping an external IP from 'vm2' on the non-gateway 'node3', the packet
will go to the configured 'myvnet2' gateway, then be routed to the exit nodes
('node1' or 'node2'), and from there it will leave those nodes over the default
gateway configured on node1 or node2.
923
924 NOTE: You need to add reverse routes for the '10.0.1.0/24' and '10.0.2.0/24'
925 networks to node1 and node2 on your external gateway, so that the public network
926 can reply back.
927
If you have configured an external BGP router, the BGP-EVPN routes (10.0.1.0/24
and 10.0.2.0/24 in this example) will be announced dynamically.
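
To inspect what is actually announced, you can query FRR on any node with its
standard show commands:

----
# EVPN routes known to this node
vtysh -c "show bgp l2vpn evpn"

# locally configured VNIs and their state
vtysh -c "show evpn vni"
----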
930
931
932 Notes
933 -----
934
935 Multiple EVPN Exit Nodes
936 ~~~~~~~~~~~~~~~~~~~~~~~~
937
938 If you have multiple gateway nodes, you should disable the `rp_filter` (Strict
939 Reverse Path Filter) option, because packets can arrive at one node but go out
940 from another node.
941
.sysctl.conf disabling `rp_filter`
----
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.all.rp_filter=0
----
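
After editing `/etc/sysctl.conf`, you can load the new values without a reboot:

----
sysctl -p
----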
947
948 VXLAN IPSEC Encryption
949 ~~~~~~~~~~~~~~~~~~~~~~
950
951 If you need to add encryption on top of a VXLAN, it's possible to do so with
952 IPSEC, through `strongswan`. You'll need to reduce the 'MTU' by 60 bytes (IPv4)
953 or 80 bytes (IPv6) to handle encryption.
954
So with the default real MTU of 1500, you need to use an MTU of 1370 (1370 + 80
(IPSEC) + 50 (VXLAN) == 1500).
957
958 .Install strongswan
959 ----
960 apt install strongswan
961 ----
962
963 Add configuration to `/etc/ipsec.conf'. We only need to encrypt traffic from
964 the VXLAN UDP port '4789'.
965
966 ----
967 conn %default
968 ike=aes256-sha1-modp1024! # the fastest, but reasonably secure cipher on modern HW
969 esp=aes256-sha1!
970 leftfirewall=yes # this is necessary when using Proxmox VE firewall rules
971
972 conn output
973 rightsubnet=%dynamic[udp/4789]
974 right=%any
975 type=transport
976 authby=psk
977 auto=route
978
979 conn input
980 leftsubnet=%dynamic[udp/4789]
981 type=transport
982 authby=psk
983 auto=route
984 ----
985
986 Then generate a pre-shared key with:
987
988 ----
989 openssl rand -base64 128
990 ----
991
and add the key to `/etc/ipsec.secrets', so that the file contents look like:
993
994 ----
995 : PSK <generatedbase64key>
996 ----
997
998 You need to copy the PSK and the configuration onto the other nodes.
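
A sketch of distributing the files and restarting strongswan, assuming root SSH
access between the nodes (the exact service name can differ between Debian
releases):

----
scp /etc/ipsec.conf /etc/ipsec.secrets root@node2:/etc/
ssh root@node2 systemctl restart strongswan-starter
----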