Each overlay network is known as a VXLAN Segment and identified by a unique
24-bit segment ID called a VXLAN Network Identifier (VNI).
+VXLAN encapsulation adds 50 bytes of overhead, so you need to increase the MTU on your hosts'
+physical interfaces to at least 1550 (or decrease the MTU inside your VMs to 1450).
+
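The 50-byte figure can be checked with a little arithmetic. The sketch below assumes an IPv4 underlay without a VLAN tag or IP options; the function names are illustrative and not part of any tool.

```python
# Per-packet overhead of VXLAN over an IPv4 underlay, in bytes.
# Assumption: no 802.1Q tag on the underlay and no IPv4 options;
# an IPv6 underlay would need a larger outer header.
VXLAN_OVERHEAD = {
    "inner_ethernet": 14,  # Ethernet header of the encapsulated frame
    "vxlan": 8,            # VXLAN header carrying the 24-bit VNI
    "udp": 8,              # outer UDP header
    "outer_ipv4": 20,      # outer IPv4 header
}


def underlay_mtu(vm_mtu: int) -> int:
    """MTU the physical interface needs so VMs can keep vm_mtu."""
    return vm_mtu + sum(VXLAN_OVERHEAD.values())


def overlay_mtu(host_mtu: int) -> int:
    """Largest MTU usable inside the VMs for a given physical MTU."""
    return host_mtu - sum(VXLAN_OVERHEAD.values())


print(underlay_mtu(1500))  # 1550
print(overlay_mtu(1500))   # 1450
```
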
For BUM traffic (broadcast, unknown unicast, multicast),
there are three different VXLAN setup modes: multicast, unicast, and BGP-EVPN.
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-svcnodeip 225.20.1.1
vxlan-physdev eno1
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-svcnodeip 225.20.1.1
vxlan-physdev eno1
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-svcnodeip 225.20.1.1
vxlan-physdev eno1
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-svcnodeip 225.20.1.1
vxlan-physdev eno1
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-svcnodeip 225.20.1.1
vxlan-physdev eno1
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-svcnodeip 225.20.1.1
vxlan-physdev eno1
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan_remoteip 192.168.0.2
vxlan_remoteip 192.168.0.3
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan_remoteip 192.168.0.2
vxlan_remoteip 192.168.0.3
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan_remoteip 192.168.0.1
vxlan_remoteip 192.168.0.3
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan_remoteip 192.168.0.1
vxlan_remoteip 192.168.0.3
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan_remoteip 192.168.0.2
vxlan_remoteip 192.168.0.3
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan_remoteip 192.168.0.2
vxlan_remoteip 192.168.0.3
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.1
bridge-learning off
bridge-arp-nd-suppress on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.1
bridge-learning off
bridge-arp-nd-suppress on
----
router bgp 1234
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.2 remote-as 1234
neighbor 192.168.0.3 remote-as 1234
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.2
bridge-learning off
bridge-arp-nd-suppress on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.2
bridge-learning off
bridge-arp-nd-suppress on
----
router bgp 1234
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.1 remote-as 1234
neighbor 192.168.0.3 remote-as 1234
----
auto eno1
iface eno1 inet manual
+ mtu 1550
auto vmbr0
iface vmbr0 inet static
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.3
bridge-learning off
bridge-arp-nd-suppress on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.3
bridge-learning off
bridge-arp-nd-suppress on
----
router bgp 1234
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.1 remote-as 1234
neighbor 192.168.0.2 remote-as 1234
The same vmbr on different nodes will have the same IP address and the same MAC address,
so that VM live migration works without network disruption.
-VXLAN layer3 routing only work with FRR and non-aware bridge.
+VXLAN layer3 routing only works with FRR and a non-VLAN-aware bridge
(VLAN-aware bridge support is currently buggy).
asymmetric model
This is the simplest mode. For it to work, all VXLANs need to be defined on all nodes.
-The asymmetric model allows routing and bridging on the VXLAN tunnel ingress,
-but only bridging on the egress.
-This results in bi-directional VXLAN traffic traveling on different VNIs
+The asymmetric model allows routing and bridging on the VXLAN tunnel ingress,
+but only bridging on the egress.
+This results in bi-directional VXLAN traffic traveling on different VNIs
in each direction (always the destination VNI) across the routed infrastructure.
image::images/vxlan-l3-asymmetric.svg["vxlan l3 asymmetric",align="center"]
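
As a rough illustration of the "always the destination VNI" rule, the following sketch models which VNI a packet is tunneled on between two nodes. The function name is hypothetical and the model ignores traffic that stays on a single node.

```python
def asymmetric_wire_vni(src_vni: int, dst_vni: int) -> int:
    """Return the VNI a packet is tunneled on in the asymmetric model.

    The ingress node routes the packet into the destination segment (if the
    segments differ) and then bridges it over that segment's VXLAN, so the
    VNI on the wire is always the destination VNI.
    """
    return dst_vni


# A VM on VNI 2 talking to a VM on VNI 3 hosted on another node:
request_vni = asymmetric_wire_vni(src_vni=2, dst_vni=3)
reply_vni = asymmetric_wire_vni(src_vni=3, dst_vni=2)
print(request_vni, reply_vni)  # 3 2
```

The request and the reply travel on different VNIs, which is why every node has to know every VXLAN in this model.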
-
-sysctl.conf tuning
-
-----
-#enable routing
-net.ipv4.ip_forward=1
-net.ipv6.conf.all.forwarding=1
-----
-
* node1
----
auto eno1
iface eno1 inet manual
-
+ mtu 1550
+
auto vmbr0
iface vmbr0 inet static
address 192.168.0.1
bridge_ports eno1
bridge_stp off
bridge_fd 0
-
+
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.1
bridge-learning off
bridge-arp-nd-suppress on
bridge_ports vxlan2
bridge_stp off
bridge_fd 0
-
+ ip-forward on
+ ip6-forward on
+ arp-accept on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.1
bridge-learning off
bridge-arp-nd-suppress on
bridge_ports vxlan3
bridge_stp off
bridge_fd 0
+ ip-forward on
+ ip6-forward on
+ arp-accept on
----
router bgp 1234
bgp router-id 192.168.0.1
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.2 remote-as 1234
neighbor 192.168.0.3 remote-as 1234
address-family l2vpn evpn
neighbor 192.168.0.2 activate
neighbor 192.168.0.3 activate
- advertise-all-vni
+ advertise-all-vni
exit-address-family
!
line vty
----
auto eno1
iface eno1 inet manual
-
+ mtu 1550
+
auto vmbr0
iface vmbr0 inet static
address 192.168.0.2
bridge_ports eno1
bridge_stp off
bridge_fd 0
-
+
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.2
bridge-learning off
bridge-arp-nd-suppress on
bridge_ports vxlan2
bridge_stp off
bridge_fd 0
+ ip-forward on
+ ip6-forward on
+ arp-accept on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.2
bridge-learning off
bridge-arp-nd-suppress on
bridge_ports vxlan3
bridge_stp off
bridge_fd 0
+ ip-forward on
+ ip6-forward on
+ arp-accept on
----
router bgp 1234
bgp router-id 192.168.0.2
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.1 remote-as 1234
neighbor 192.168.0.3 remote-as 1234
address-family l2vpn evpn
neighbor 192.168.0.1 activate
neighbor 192.168.0.3 activate
- advertise-all-vni
+ advertise-all-vni
exit-address-family
!
line vty
----
auto eno1
iface eno1 inet manual
-
+ mtu 1550
+
auto vmbr0
iface vmbr0 inet static
address 192.168.0.3
bridge_ports eno1
bridge_stp off
bridge_fd 0
-
+
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.3
bridge-learning off
bridge-arp-nd-suppress on
bridge_ports vxlan2
bridge_stp off
bridge_fd 0
-
+ ip-forward on
+ ip6-forward on
+ arp-accept on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.3
bridge-learning off
bridge-arp-nd-suppress on
bridge-unicast-flood off
bridge-multicast-flood off
-
auto vmbr3
iface vmbr3 inet static
address 10.0.3.254
bridge_ports vxlan3
bridge_stp off
bridge_fd 0
+ ip-forward on
+ ip6-forward on
+ arp-accept on
----
router bgp 1234
bgp router-id 192.168.0.3
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.1 remote-as 1234
neighbor 192.168.0.2 remote-as 1234
address-family l2vpn evpn
neighbor 192.168.0.1 activate
neighbor 192.168.0.2 activate
- advertise-all-vni
+ advertise-all-vni
exit-address-family
!
line vty
^^^^^^^^^^^^^^^
With this model, you don't need to define all VXLANs on all nodes.
-This model will also be needed to route traffic to an external router.
+This model will also be needed to route traffic to an external router.
-The symmetric model routes and bridges on both the ingress and the egress leafs.
-This results in bi-directional traffic being able to travel on the same VNI, hence the symmetric name.
-However, a new specialty transit VNI is used for all routed VXLAN traffic, called the L3VNI.
-All traffic that needs to be routed will be routed onto the L3VNI, tunneled across the layer 3 Infrastructure,
+The symmetric model routes and bridges on both the ingress and the egress leafs.
+This results in bi-directional traffic being able to travel on the same VNI, hence the symmetric name.
+However, a new specialty transit VNI is used for all routed VXLAN traffic, called the L3VNI.
+All traffic that needs to be routed will be routed onto the L3VNI, tunneled across the layer 3 Infrastructure,
routed off the L3VNI to the appropriate VLAN and ultimately bridged to the destination.
A VRF is needed for the L3VNI, so all vmbr bridges need to be in the VRF if they are to reach each other.
image::images/vxlan-l3-symmetric.svg["vxlan l3 symmetric",align="center"]
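
A minimal sketch of the VNI selection in the symmetric model, assuming traffic between two different nodes and using 4000 as the L3VNI as in the configuration examples of this section. The function name is hypothetical.

```python
L3VNI = 4000  # transit VNI attached to the VRF (value taken from the examples)


def symmetric_wire_vni(src_vni: int, dst_vni: int, l3vni: int = L3VNI) -> int:
    """Return the VNI a packet is tunneled on in the symmetric model."""
    if src_vni == dst_vni:
        # Same segment: the frame is only bridged, on the segment's own VNI.
        return src_vni
    # Different segments: the ingress node routes onto the L3VNI and the
    # egress node routes off it again before bridging to the destination.
    return l3vni


print(symmetric_wire_vni(2, 3), symmetric_wire_vni(3, 2))  # 4000 4000
print(symmetric_wire_vni(2, 2))                            # 2
```

Routed traffic uses the same VNI in both directions, so a node only needs the L3VNI plus the VXLANs of its own VMs.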
-sysctl.conf tuning
-
-----
-#enable routing
-net.ipv4.ip_forward=1
-net.ipv6.conf.all.forwarding=1
-#disable reverse path filtering
-net.ipv4.conf.default.rp_filter=0
-net.ipv4.conf.all.rp_filter=0
-#allow frr to work with vrf (kernel >4.14 bug)
-net.ipv4.tcp_l3mdev_accept=1
-----
-
* node1
----
auto eno1
iface eno1 inet manual
-
+ mtu 1550
+
auto vmbr0
iface vmbr0 inet static
address 192.168.0.1
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.1
bridge-learning off
bridge-arp-nd-suppress on
netmask 255.255.255.0
hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.1
bridge-learning off
bridge-arp-nd-suppress on
netmask 255.255.255.0
hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
#interconnect vxlan-vrf l3vni
auto vxlan4000
iface vxlan4000 inet manual
+ vxlan-id 4000
vxlan-local-tunnelip 192.168.0.1
bridge-learning off
bridge-arp-nd-suppress on
bridge_ports vxlan4000
bridge_stp off
bridge_fd 0
- hwaddress 44:39:39:FF:40:90 #must be different on each node
vrf vrf1
----
----
vrf vrf1
vni 4000
+ exit-vrf
!
router bgp 1234
bgp router-id 192.168.0.1
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.2 remote-as 1234
neighbor 192.168.0.3 remote-as 1234
advertise-all-vni
exit-address-family
!
-router bgp 1234 vrf vrf1
-!
- bgp router-id 192.168.0.1
- !
- address-family ipv4 unicast
- redistribute connected
- exit-address-family
- !
- address-family l2vpn evpn
- advertise ipv4 unicast
- exit-address-family
-!
line vty
!
----
auto eno1
iface eno1 inet manual
-
+ mtu 1550
+
auto vmbr0
iface vmbr0 inet static
address 192.168.0.2
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.2
bridge-learning off
bridge-arp-nd-suppress on
netmask 255.255.255.0
hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.2
bridge-learning off
bridge-arp-nd-suppress on
netmask 255.255.255.0
hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
#interconnect vxlan-vrf l3vni
auto vxlan4000
iface vxlan4000 inet manual
+ vxlan-id 4000
vxlan-local-tunnelip 192.168.0.2
bridge-learning off
bridge-arp-nd-suppress on
bridge_ports vxlan4000
bridge_stp off
bridge_fd 0
- hwaddress 44:39:39:FF:40:91 #must be different on each node
vrf vrf1
----
----
vrf vrf1
vni 4000
+ exit-vrf
!
router bgp 1234
bgp router-id 192.168.0.2
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.1 remote-as 1234
neighbor 192.168.0.3 remote-as 1234
advertise-all-vni
exit-address-family
!
-router bgp 1234 vrf vrf1
-!
- bgp router-id 192.168.0.2
- !
- address-family ipv4 unicast
- redistribute connected
- exit-address-family
- !
- address-family l2vpn evpn
- advertise ipv4 unicast
- exit-address-family
-!
line vty
!
----
auto eno1
iface eno1 inet manual
-
+ mtu 1550
+
auto vmbr0
iface vmbr0 inet static
address 192.168.0.3
auto vxlan2
iface vxlan2 inet manual
+ vxlan-id 2
vxlan-local-tunnelip 192.168.0.3
bridge-learning off
bridge-arp-nd-suppress on
netmask 255.255.255.0
hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
auto vxlan3
iface vxlan3 inet manual
+ vxlan-id 3
vxlan-local-tunnelip 192.168.0.3
bridge-learning off
bridge-arp-nd-suppress on
netmask 255.255.255.0
hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
#interconnect vxlan-vrf l3vni
auto vxlan4000
iface vxlan4000 inet manual
+ vxlan-id 4000
vxlan-local-tunnelip 192.168.0.3
bridge-learning off
bridge-arp-nd-suppress on
bridge_ports vxlan4000
bridge_stp off
bridge_fd 0
- hwaddress 44:39:39:FF:40:92 #must be different on each node
vrf vrf1
----
----
vrf vrf1
vni 4000
+ exit-vrf
!
router bgp 1234
bgp router-id 192.168.0.3
no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
coalesce-time 1000
neighbor 192.168.0.1 remote-as 1234
neighbor 192.168.0.2 remote-as 1234
advertise-all-vni
exit-address-family
!
+line vty
+!
+----
+
+VXLAN layer3 routing with anycast gateway + routing to outside with external router
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Routing to the outside needs the symmetric model.
+
+1 gateway node
+^^^^^^^^^^^^^^
+In this example, we'll use only one Proxmox node (node1) as the exit gateway.
+This node announces the default gateway in vrf1 (default-originate) and forwards traffic to its own default gateway (192.168.0.254). There is no BGP between the router and node1.
+
+
+* node1
+
+----
+auto vrf1
+iface vrf1
+ vrf-table auto
+
+auto eno1
+iface eno1 inet manual
+ mtu 1550
+
+auto vmbr0
+iface vmbr0 inet static
+ address 192.168.0.1
+ netmask 255.255.255.0
+ gateway 192.168.0.254
+ bridge_ports eno1
+ bridge_stp off
+ bridge_fd 0
+ ip-forward on
+ ip6-forward on
+
+auto vxlan2
+iface vxlan2 inet manual
+ vxlan-id 2
+ vxlan-local-tunnelip 192.168.0.1
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr2
+iface vmbr2 inet static
+ bridge_ports vxlan2
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.2.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+auto vxlan3
+iface vxlan3 inet manual
+ vxlan-id 3
+ vxlan-local-tunnelip 192.168.0.1
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr3
+iface vmbr3 inet static
+ bridge_ports vxlan3
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.3.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+#interconnect vxlan-vrf l3vni
+auto vxlan4000
+iface vxlan4000 inet manual
+ vxlan-id 4000
+ vxlan-local-tunnelip 192.168.0.1
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr4000
+iface vmbr4000 inet manual
+ bridge_ports vxlan4000
+ bridge_stp off
+ bridge_fd 0
+ vrf vrf1
+----
+
+
+frr.conf
+
+----
+vrf vrf1
+ vni 4000
+ exit-vrf
+!
+router bgp 1234
+ bgp router-id 192.168.0.1
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ coalesce-time 1000
+ neighbor 192.168.0.2 remote-as 1234
+ neighbor 192.168.0.3 remote-as 1234
+ !
+ address-family ipv4 unicast
+ import vrf vrf1
+ exit-address-family
+ !
+ address-family ipv6 unicast
+ import vrf vrf1
+ exit-address-family
+ !
+ address-family l2vpn evpn
+ neighbor 192.168.0.2 activate
+ neighbor 192.168.0.3 activate
+ advertise-all-vni
+ exit-address-family
+!
router bgp 1234 vrf vrf1
-!
- bgp router-id 192.168.0.3
- !
+!
address-family ipv4 unicast
redistribute connected
exit-address-family
!
+ address-family ipv6 unicast
+ redistribute connected
+ exit-address-family
+ !
+ address-family l2vpn evpn
+ default-originate ipv4
+ default-originate ipv6
+ exit-address-family
+!
+line vty
+!
+----
+
+
+* node2
+
+----
+auto vrf1
+iface vrf1
+ vrf-table auto
+
+auto eno1
+iface eno1 inet manual
+ mtu 1550
+
+auto vmbr0
+iface vmbr0 inet static
+ address 192.168.0.2
+ netmask 255.255.255.0
+ bridge_ports eno1
+ bridge_stp off
+ bridge_fd 0
+
+auto vxlan2
+iface vxlan2 inet manual
+ vxlan-id 2
+ vxlan-local-tunnelip 192.168.0.2
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr2
+iface vmbr2 inet static
+ bridge_ports vxlan2
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.2.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+auto vxlan3
+iface vxlan3 inet manual
+ vxlan-id 3
+ vxlan-local-tunnelip 192.168.0.2
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr3
+iface vmbr3 inet static
+ bridge_ports vxlan3
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.3.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+#interconnect vxlan-vrf l3vni
+auto vxlan4000
+iface vxlan4000 inet manual
+ vxlan-id 4000
+ vxlan-local-tunnelip 192.168.0.2
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+
+auto vmbr4000
+iface vmbr4000 inet manual
+ bridge_ports vxlan4000
+ bridge_stp off
+ bridge_fd 0
+ vrf vrf1
+----
+
+
+frr.conf
+
+----
+vrf vrf1
+ vni 4000
+ exit-vrf
+!
+router bgp 1234
+ bgp router-id 192.168.0.2
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ coalesce-time 1000
+ neighbor 192.168.0.1 remote-as 1234
+ neighbor 192.168.0.3 remote-as 1234
+ !
address-family l2vpn evpn
- advertise ipv4 unicast
+ neighbor 192.168.0.1 activate
+ neighbor 192.168.0.3 activate
+ advertise-all-vni
exit-address-family
!
line vty
!
----
+
+
+* node3
+
+----
+auto vrf1
+iface vrf1
+ vrf-table auto
+
+auto eno1
+iface eno1 inet manual
+ mtu 1550
+
+auto vmbr0
+iface vmbr0 inet static
+ address 192.168.0.3
+ netmask 255.255.255.0
+ bridge_ports eno1
+ bridge_stp off
+ bridge_fd 0
+
+auto vxlan2
+iface vxlan2 inet manual
+ vxlan-id 2
+ vxlan-local-tunnelip 192.168.0.3
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr2
+iface vmbr2 inet static
+ bridge_ports vxlan2
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.2.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+auto vxlan3
+iface vxlan3 inet manual
+ vxlan-id 3
+ vxlan-local-tunnelip 192.168.0.3
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr3
+iface vmbr3 inet static
+ bridge_ports vxlan3
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.3.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+#interconnect vxlan-vrf l3vni
+auto vxlan4000
+iface vxlan4000 inet manual
+ vxlan-id 4000
+ vxlan-local-tunnelip 192.168.0.3
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+
+auto vmbr4000
+iface vmbr4000 inet manual
+ bridge_ports vxlan4000
+ bridge_stp off
+ bridge_fd 0
+ vrf vrf1
+----
+
+
+frr.conf
+
+----
+vrf vrf1
+ vni 4000
+ exit-vrf
+!
+router bgp 1234
+ bgp router-id 192.168.0.3
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ coalesce-time 1000
+ neighbor 192.168.0.1 remote-as 1234
+ neighbor 192.168.0.2 remote-as 1234
+ !
+ address-family l2vpn evpn
+ neighbor 192.168.0.1 activate
+ neighbor 192.168.0.2 activate
+ advertise-all-vni
+ exit-address-family
+!
+line vty
+!
+----
+
+multiple gateway nodes
+^^^^^^^^^^^^^^^^^^^^^^
+In this example, all nodes are used as exit gateways (you can also use only two nodes if you want).
+All nodes have a default gateway to the external router (192.168.0.254), with no BGP between the router and the nodes,
+and announce this default gateway in the VRF (default-originate).
+The external router has ECMP routes to all Proxmox nodes (load balancing).
+If the router sends a packet to the wrong node (the VM is not on this node), this node routes the packet through
+VXLAN to its final destination.
+
+* node1
+
+----
+auto vrf1
+iface vrf1
+ vrf-table auto
+
+auto eno1
+iface eno1 inet manual
+ mtu 1550
+
+auto vmbr0
+iface vmbr0 inet static
+ address 192.168.0.1
+ netmask 255.255.255.0
+ gateway 192.168.0.254
+ bridge_ports eno1
+ bridge_stp off
+ bridge_fd 0
+ ip-forward on
+ ip6-forward on
+
+auto vxlan2
+iface vxlan2 inet manual
+ vxlan-id 2
+ vxlan-local-tunnelip 192.168.0.1
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr2
+iface vmbr2 inet static
+ bridge_ports vxlan2
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.2.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+auto vxlan3
+iface vxlan3 inet manual
+ vxlan-id 3
+ vxlan-local-tunnelip 192.168.0.1
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr3
+iface vmbr3 inet static
+ bridge_ports vxlan3
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.3.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+#interconnect vxlan-vrf l3vni
+auto vxlan4000
+iface vxlan4000 inet manual
+ vxlan-id 4000
+ vxlan-local-tunnelip 192.168.0.1
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr4000
+iface vmbr4000 inet manual
+ bridge_ports vxlan4000
+ bridge_stp off
+ bridge_fd 0
+ vrf vrf1
+----
+
+
+frr.conf
+
+----
+vrf vrf1
+ vni 4000
+ exit-vrf
+!
+router bgp 1234
+ bgp router-id 192.168.0.1
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ coalesce-time 1000
+ neighbor 192.168.0.2 remote-as 1234
+ neighbor 192.168.0.3 remote-as 1234
+ !
+ address-family ipv4 unicast
+ import vrf vrf1
+ exit-address-family
+ !
+ address-family ipv6 unicast
+ import vrf vrf1
+ exit-address-family
+ !
+ address-family l2vpn evpn
+ neighbor 192.168.0.2 activate
+ neighbor 192.168.0.3 activate
+ advertise-all-vni
+ exit-address-family
+!
+router bgp 1234 vrf vrf1
+!
+ address-family ipv4 unicast
+ redistribute connected
+ exit-address-family
+ !
+ address-family ipv6 unicast
+ redistribute connected
+ exit-address-family
+ !
+ address-family l2vpn evpn
+ default-originate ipv4
+ default-originate ipv6
+ exit-address-family
+!
+line vty
+!
+----
+
+
+* node2
+
+----
+auto vrf1
+iface vrf1
+ vrf-table auto
+
+auto eno1
+iface eno1 inet manual
+ mtu 1550
+
+auto vmbr0
+iface vmbr0 inet static
+ address 192.168.0.2
+ netmask 255.255.255.0
+ gateway 192.168.0.254
+ bridge_ports eno1
+ bridge_stp off
+ bridge_fd 0
+ ip-forward on
+ ip6-forward on
+
+auto vxlan2
+iface vxlan2 inet manual
+ vxlan-id 2
+ vxlan-local-tunnelip 192.168.0.2
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr2
+iface vmbr2 inet static
+ bridge_ports vxlan2
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.2.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+auto vxlan3
+iface vxlan3 inet manual
+ vxlan-id 3
+ vxlan-local-tunnelip 192.168.0.2
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr3
+iface vmbr3 inet static
+ bridge_ports vxlan3
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.3.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+#interconnect vxlan-vrf l3vni
+auto vxlan4000
+iface vxlan4000 inet manual
+ vxlan-id 4000
+ vxlan-local-tunnelip 192.168.0.2
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+
+auto vmbr4000
+iface vmbr4000 inet manual
+ bridge_ports vxlan4000
+ bridge_stp off
+ bridge_fd 0
+ vrf vrf1
+----
+
+
+frr.conf
+
+----
+vrf vrf1
+ vni 4000
+ exit-vrf
+!
+router bgp 1234
+ bgp router-id 192.168.0.2
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ coalesce-time 1000
+ neighbor 192.168.0.1 remote-as 1234
+ neighbor 192.168.0.3 remote-as 1234
+ !
+ address-family ipv4 unicast
+ import vrf vrf1
+ exit-address-family
+ !
+ address-family ipv6 unicast
+ import vrf vrf1
+ exit-address-family
+ !
+ address-family l2vpn evpn
+ neighbor 192.168.0.1 activate
+ neighbor 192.168.0.3 activate
+ advertise-all-vni
+ exit-address-family
+!
+router bgp 1234 vrf vrf1
+!
+ address-family ipv4 unicast
+ redistribute connected
+ exit-address-family
+ !
+ address-family ipv6 unicast
+ redistribute connected
+ exit-address-family
+ !
+ address-family l2vpn evpn
+ default-originate ipv4
+ default-originate ipv6
+ exit-address-family
+!
+line vty
+!
+----
+
+
+* node3
+
+----
+auto vrf1
+iface vrf1
+ vrf-table auto
+
+auto eno1
+iface eno1 inet manual
+ mtu 1550
+
+auto vmbr0
+iface vmbr0 inet static
+ address 192.168.0.3
+ netmask 255.255.255.0
+ gateway 192.168.0.254
+ bridge_ports eno1
+ bridge_stp off
+ bridge_fd 0
+ ip-forward on
+ ip6-forward on
+
+auto vxlan2
+iface vxlan2 inet manual
+ vxlan-id 2
+ vxlan-local-tunnelip 192.168.0.3
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr2
+iface vmbr2 inet static
+ bridge_ports vxlan2
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.2.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr2
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+auto vxlan3
+iface vxlan3 inet manual
+ vxlan-id 3
+ vxlan-local-tunnelip 192.168.0.3
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+auto vmbr3
+iface vmbr3 inet static
+ bridge_ports vxlan3
+ bridge_stp off
+ bridge_fd 0
+ address 10.0.3.254
+ netmask 255.255.255.0
+ hwaddress 44:39:39:FF:40:94 #must be same on each node vmbr3
+ vrf vrf1
+ ip-forward on
+ ip6-forward on
+ arp-accept on
+
+#interconnect vxlan-vrf l3vni
+auto vxlan4000
+iface vxlan4000 inet manual
+ vxlan-id 4000
+ vxlan-local-tunnelip 192.168.0.3
+ bridge-learning off
+ bridge-arp-nd-suppress on
+ bridge-unicast-flood off
+ bridge-multicast-flood off
+
+
+auto vmbr4000
+iface vmbr4000 inet manual
+ bridge_ports vxlan4000
+ bridge_stp off
+ bridge_fd 0
+ vrf vrf1
+----
+
+
+frr.conf
+
+----
+vrf vrf1
+ vni 4000
+ exit-vrf
+!
+router bgp 1234
+ bgp router-id 192.168.0.3
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ coalesce-time 1000
+ neighbor 192.168.0.1 remote-as 1234
+ neighbor 192.168.0.2 remote-as 1234
+ !
+ address-family ipv4 unicast
+ import vrf vrf1
+ exit-address-family
+ !
+ address-family ipv6 unicast
+ import vrf vrf1
+ exit-address-family
+ !
+ address-family l2vpn evpn
+ neighbor 192.168.0.1 activate
+ neighbor 192.168.0.2 activate
+ advertise-all-vni
+ exit-address-family
+!
+router bgp 1234 vrf vrf1
+!
+ address-family ipv4 unicast
+ redistribute connected
+ exit-address-family
+ !
+ address-family ipv6 unicast
+ redistribute connected
+ exit-address-family
+ !
+ address-family l2vpn evpn
+ default-originate ipv4
+ default-originate ipv6
+ exit-address-family
+!
+line vty
+!
+----
+
+Note
+^^^^
+
+If your external router doesn't support 'ECMP static routes' to reach multiple
+{pve} nodes, you can set up an HA floating VIP on the Proxmox nodes by using the
+Virtual Router Redundancy Protocol (VRRP).
+
+In this example, we will set up a floating IP, 192.168.0.10, on node1 and node2.
+Node1 is the primary, with failover to node2 in case of an outage.
+
+This setup currently needs the 'vrrpd' package (`apt install vrrpd`).
+#TODO : It should be possible to do it with frr directly with last version.
+
+* node1
+
+----
+auto vmbr0
+iface vmbr0 inet static
+ address 192.168.0.1
+ netmask 255.255.255.0
+ gateway 192.168.0.254
+ bridge_ports eno1
+ bridge_stp off
+ bridge_fd 0
+ vrrp-id 1
+ vrrp-priority 2
+ vrrp-virtual-ip 192.168.0.10
+----
+
+* node2
+
+----
+auto vmbr0
+iface vmbr0 inet static
+ address 192.168.0.2
+ netmask 255.255.255.0
+ gateway 192.168.0.254
+ bridge_ports eno1
+ bridge_stp off
+ bridge_fd 0
+ vrrp-id 1
+ vrrp-priority 1
+ vrrp-virtual-ip 192.168.0.10
+----
+
+
+Route Reflectors
+^^^^^^^^^^^^^^^^
+If you have a lot of Proxmox nodes, or multiple Proxmox clusters, you may want
+to avoid having every node peer with every other node.
+For this, you can create dedicated route reflector (RR) servers. As a single RR is a
+single point of failure, a minimum of two servers acting as RRs is highly
+recommended for redundancy.
+
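To get a feel for when route reflectors pay off, the sketch below compares the number of iBGP sessions in a full mesh with a design where each node peers only with the reflectors. It is a simplified count under stated assumptions: sessions between the reflectors themselves are ignored, and the function names are illustrative.

```python
def full_mesh_sessions(nodes: int) -> int:
    """iBGP sessions when every node peers with every other node."""
    return nodes * (nodes - 1) // 2


def route_reflector_sessions(nodes: int, reflectors: int = 2) -> int:
    """iBGP sessions when every node peers only with the route reflectors.

    Assumption: sessions between the reflectors themselves are not counted.
    """
    return nodes * reflectors


for n in (3, 10, 30):
    print(n, full_mesh_sessions(n), route_reflector_sessions(n))
# 3 3 6
# 10 45 20
# 30 435 60
```
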
+Below is an example configuration with 'frr', using `rrserver1 (192.168.0.200)`
+and `rrserver2 (192.168.0.201)`.
+
+rrserver1
+----
+router bgp 1234
+ bgp router-id 192.168.0.200
+ bgp cluster-id 1.1.1.1 #cluster-id must be the same on each route reflector
+ bgp log-neighbor-changes
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ neighbor fabric peer-group
+ neighbor fabric remote-as 1234
+ neighbor fabric capability extended-nexthop
+ neighbor fabric update-source 192.168.0.200
+ bgp listen range 192.168.0.0/24 peer-group fabric #allow any proxmoxnode client in the network range
+ !
+ address-family l2vpn evpn
+ neighbor fabric activate
+ neighbor fabric route-reflector-client
+ neighbor fabric allowas-in
+ exit-address-family
+ !
+ exit
+!
+----
+
+rrserver2
+----
+router bgp 1234
+ bgp router-id 192.168.0.201
+ bgp cluster-id 1.1.1.1
+ bgp log-neighbor-changes
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ neighbor fabric peer-group
+ neighbor fabric remote-as 1234
+ neighbor fabric capability extended-nexthop
+ neighbor fabric update-source 192.168.0.201
+ bgp listen range 192.168.0.0/24 peer-group fabric
+ !
+ address-family l2vpn evpn
+ neighbor fabric activate
+ neighbor fabric route-reflector-client
+ neighbor fabric allowas-in
+ exit-address-family
+ !
+ exit
+!
+----
+
+proxmoxnode(s)
+----
+router bgp 1234
+ bgp router-id 192.168.0.x
+ no bgp default ipv4-unicast
+ no bgp default ipv6-unicast
+ coalesce-time 1000
+ neighbor 192.168.0.200 remote-as 1234
+ neighbor 192.168.0.201 remote-as 1234
+ !
+ address-family l2vpn evpn
+ neighbor 192.168.0.200 activate
+ neighbor 192.168.0.201 activate
+ advertise-all-vni
+ exit-address-family
+!
+----
+
+#TODO : Documentation with bgp upstream router.