[[chapter_pvesdn]]
Software Defined Network
========================
ifndef::manvolnum[]
:pve-toplevel:
endif::manvolnum[]

The **S**oftware **D**efined **N**etwork (SDN) feature allows one to create
virtual networks (vnets) at datacenter level.

WARNING: SDN is currently an **experimental feature** in {pve}. The
documentation for it is also still under development. Ask on our
xref:getting_help[mailing lists or in the forum] for questions and feedback.


[[pvesdn_installation]]
Installation
------------

To enable the experimental SDN integration, you need to install the
`libpve-network-perl` package:

----
apt install libpve-network-perl
----

You need to have the `ifupdown2` package installed on each node to manage local
configuration reloading without a reboot:

----
apt install ifupdown2
----

Basic Overview
--------------

The {pve} SDN allows separation and fine-grained control of virtual guest
networks, using flexible software-controlled configurations.

Separation consists of zones; a zone is its own virtually separated network
area.
A 'VNet' is a type of virtual network connected to a zone. Depending on which
type or plugin the zone uses, it can behave differently and offer different
features, advantages or disadvantages.
Normally a 'VNet' shows up as a common Linux bridge with either a VLAN or
'VXLAN' tag, but some can also use layer 3 routing for control.
The 'VNets' are deployed locally on each node, after the configuration has been
committed from the cluster-wide datacenter SDN administration interface.


Main configuration
------------------

The configuration is done at datacenter (cluster-wide) level. It is saved in
configuration files located in the shared configuration file system:
`/etc/pve/sdn`

On the web-interface, the SDN feature has 4 main sections for the
configuration:

* SDN: an overview of the SDN state

* Zones: Create and manage the virtually separated network zones

* VNets: The per-node building block to provide a zone for VMs

* Controller: For complex setups to control layer 3 routing


[[pvesdn_config_main_sdn]]
SDN
~~~

This is the main status panel. Here you can see the deployment status of zones
on the different nodes.

There is an 'Apply' button to push and reload the local configuration on all
cluster nodes.


[[pvesdn_config_zone]]
Zones
~~~~~

A zone defines a virtually separated network.

It can use different technologies for separation:

* VLAN: Virtual LANs are the classic method to sub-divide a LAN

* QinQ: stacked VLAN (formally known as `IEEE 802.1ad`)

* VXLAN: (layer2 vxlan)

* bgp-evpn: vxlan using layer3 border gateway protocol routing

You can restrict a zone to specific nodes.

It's also possible to add permissions on a zone, to restrict users to a
specific zone and only the VNets in that zone.


[[pvesdn_config_vnet]]
VNets
~~~~~

A `VNet` is in its basic form just a Linux bridge that will be deployed locally
on the node and used for virtual machine communication.

VNet properties are:

* ID: an 8-character ID to name and identify a VNet

* Alias: Optional longer name, if the ID isn't enough

* Zone: The associated zone for this VNet

* Tag: The unique VLAN or VXLAN id

* VLAN Aware: Allows adding an extra VLAN tag in the virtual machine or
  container vNIC configuration, or allows the guest OS to manage the VLAN tag.

* IPv4: an anycast IPv4 address. It will be configured on the underlying bridge
  on each node that is part of the zone. It's only useful for `bgp-evpn`
  routing.

* IPv6: an anycast IPv6 address. It will be configured on the underlying bridge
  on each node that is part of the zone. It's only useful for `bgp-evpn`
  routing.


[[pvesdn_config_controllers]]
Controllers
~~~~~~~~~~~

Some zone types need an external controller to manage the VNet control-plane.
Currently this is only required for the `bgp-evpn` zone plugin.


[[pvesdn_zone_plugins]]
Zones Plugins
-------------

Common options
~~~~~~~~~~~~~~

nodes:: Deploy and allow the use of VNets configured for this zone only on
these nodes.

[[pvesdn_zone_plugin_vlan]]
VLAN Zones
~~~~~~~~~~

This is the simplest plugin. It reuses an existing local Linux or OVS bridge,
and manages VLANs on it.
The benefit of using the SDN module is that you can create different zones with
specific VNet VLAN tags, and restrict virtual machines to separated zones.

Specific `VLAN` configuration options:

bridge:: Reuse this local bridge or OVS switch, already configured on *each*
local node.

[[pvesdn_zone_plugin_qinq]]
QinQ Zones
~~~~~~~~~~

QinQ is stacked VLAN. The first VLAN tag is defined for the zone (the so-called
'service-vlan'), and the second VLAN tag is defined for the vnets.

NOTE: Your physical network switches must support stacked VLANs!

Specific QinQ configuration options:

bridge:: A local VLAN-aware bridge already configured on each local node

service vlan:: The main VLAN tag of this zone

mtu:: Due to the double stacking of tags you need 4 more bytes for QinQ VLANs.
For example, you reduce the MTU to `1496` if your physical interface MTU is
`1500`.

[[pvesdn_zone_plugin_vxlan]]
VXLAN Zones
~~~~~~~~~~~

The VXLAN plugin will establish a tunnel (named overlay) on top of an existing
network (named underlay). It encapsulates layer 2 Ethernet frames within layer
4 UDP datagrams, using `4789` as the default destination port. You can, for
example, create a private IPv4 VXLAN network on top of public internet network
nodes.
This is a layer2 tunnel only, no routing between different VNets is possible.

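To make the encapsulation described above more concrete, the following Python
sketch builds the 8-byte VXLAN header as laid out in RFC 7348 and adds up the
per-packet overhead. It is an illustration only, not part of {pve}: the
function name `vxlan_header` is hypothetical, and the overhead sum assumes an
IPv4 underlay without an outer VLAN tag.

```python
import struct

VXLAN_PORT = 4789             # default destination UDP port
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag from RFC 7348: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VXLAN ID (hypothetical helper)."""
    if not 1 <= vni <= 0xFFFFFF:
        raise ValueError("VXLAN ID must be in the range 1 - 16777215 (24 bits)")
    # First word: flags (8 bits) + reserved (24 bits)
    # Second word: VXLAN ID (24 bits) + reserved (8 bits)
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


# Assumption: IPv4 underlay, no outer VLAN tag.
# outer Ethernet (14) + outer IPv4 (20) + outer UDP (8) + VXLAN header (8)
VXLAN_OVERHEAD = 14 + 20 + 8 + len(vxlan_header(1))

header = vxlan_header(100000)
print(header.hex(), VXLAN_OVERHEAD)
```

Because the VXLAN ID occupies 24 bits of the header, the largest usable ID is
16777215. With an IPv6 underlay or an additional outer VLAN tag the overhead
would be larger than the value computed here.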
Each VNet will use a specific VXLAN id from the range (1 - 16777215).

Specific VXLAN configuration options:

peers address list:: A list of IPs from all nodes through which you want to
communicate. Can also be external nodes.

mtu:: Because VXLAN encapsulation uses 50 bytes, the MTU needs to be 50 bytes
lower than that of the outgoing physical interface.

[[pvesdn_zone_plugin_evpn]]
EVPN Zones
~~~~~~~~~~

This is the most complex of all supported plugins.

BGP-EVPN allows one to create a routable layer3 network. The VNet of EVPN can
have an anycast IP address and/or MAC address. The bridge IP is the same on
each node, so a virtual guest can use that address as its gateway.

Routing can work across VNets from different zones through a VRF (Virtual
Routing and Forwarding) interface.

Specific EVPN configuration options:

VRF VXLAN Tag:: This is a vxlan-id used for the routing interconnect between
vnets. It must be different from the VXLAN-id of the VNets.

controller:: An EVPN-controller needs to be defined first (see the controller
plugins section)

mtu:: Because VXLAN encapsulation uses 50 bytes, the MTU needs to be 50 bytes
lower than that of the outgoing physical interface.

[[pvesdn_controller_plugins]]
Controllers Plugins
-------------------

For complex zones requiring a control plane.

[[pvesdn_controller_plugin_evpn]]
EVPN Controller
~~~~~~~~~~~~~~~

For `BGP-EVPN`, we need a controller to manage the control plane.
The currently supported software controller is the "frr" router.
You may need to install it on each node where you want to deploy EVPN zones.

----
apt install frr
----

Configuration options:

asn:: A unique BGP ASN number. It's highly recommended to use a private ASN
number (64512 – 65534, 4200000000 – 4294967294), as otherwise you could end up
breaking, or getting broken by, global routing by mistake.

peers:: An IP list of all nodes with which you want to communicate (these can
also be external nodes or route reflector servers)

Additionally, if you want to route traffic from an SDN BGP-EVPN network to the
external world:

gateway-nodes:: The proxmox nodes from where the bgp-evpn traffic will exit to
the external world through the nodes' default gateway

gateway-external-peers:: If you want the gateway nodes not to use the default
gateway, but, for example, to send traffic to external BGP routers, which then
handle (reverse) routing dynamically, you can use this option. For example
`192.168.0.253,192.168.0.254'


[[pvesdn_local_deployment_monitoring]]
Local Deployment Monitoring
---------------------------

After applying the configuration through the main SDN web-interface panel,
the local network configuration is generated locally on each node in
`/etc/network/interfaces.d/sdn`, and reloaded with ifupdown2.

You need to add

----
source /etc/network/interfaces.d/*
----

at the end of `/etc/network/interfaces` to have the SDN config included.

You can monitor the status of local zones and vnets through the main tree.


[[pvesdn_setup_example_vlan]]
VLAN Setup Example
------------------

TIP: While we show plain configuration content here, almost everything should
be configurable using the web-interface only.

Node1: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet manual
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

#management ip on vlan100
auto vmbr0.100
iface vmbr0.100 inet static
    address 192.168.0.1/24

source /etc/network/interfaces.d/*
----

Node2: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet manual
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

#management ip on vlan100
auto vmbr0.100
iface vmbr0.100 inet static
    address 192.168.0.2/24

source /etc/network/interfaces.d/*
----

Create a VLAN zone named `myvlanzone':

----
id: myvlanzone
bridge: vmbr0
----

Create a VNet named `myvnet1' with `vlan-id` `10' and the previously created
`myvlanzone' as its zone.

----
id: myvnet1
zone: myvlanzone
tag: 10
----

Apply the configuration through the main SDN panel, to create the VNets locally
on each node.

Create a Debian-based Virtual Machine (vm1) on node1, with a vNIC on `myvnet1'.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.3.100/24
----

Create a second Virtual Machine (vm2) on node2, with a vNIC on the same VNet
`myvnet1' as vm1.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.3.101/24
----

Then, you should be able to ping between both VMs over that network.


[[pvesdn_setup_example_qinq]]
QinQ Setup Example
------------------

TIP: While we show plain configuration content here, almost everything should
be configurable using the web-interface only.

Node1: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet manual
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

#management ip on vlan100
auto vmbr0.100
iface vmbr0.100 inet static
    address 192.168.0.1/24

source /etc/network/interfaces.d/*
----

Node2: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet manual
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

#management ip on vlan100
auto vmbr0.100
iface vmbr0.100 inet static
    address 192.168.0.2/24

source /etc/network/interfaces.d/*
----

Create a QinQ zone named `qinqzone1' with service VLAN 20

----
id: qinqzone1
bridge: vmbr0
service vlan: 20
----

Create another QinQ zone named `qinqzone2' with service VLAN 30

----
id: qinqzone2
bridge: vmbr0
service vlan: 30
----

Create a VNet named `myvnet1' with customer VLAN-id 100 on the previously
created `qinqzone1' zone.

----
id: myvnet1
zone: qinqzone1
tag: 100
----

Create a VNet named `myvnet2' with customer VLAN-id 100 on the previously
created `qinqzone2' zone.

----
id: myvnet2
zone: qinqzone2
tag: 100
----

Apply the configuration on the main SDN web-interface panel to create the VNets
locally on each node.

Create a Debian-based Virtual Machine (vm1) on node1, with a vNIC on `myvnet1'.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.3.100/24
----

Create a second Virtual Machine (vm2) on node2, with a vNIC on the same VNet
`myvnet1' as vm1.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.3.101/24
----

Create a third Virtual Machine (vm3) on node1, with a vNIC on the other VNet
`myvnet2'.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.3.102/24
----

Create another Virtual Machine (vm4) on node2, with a vNIC on the same VNet
`myvnet2' as vm3.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.3.103/24
----

Then, you should be able to ping between the VMs 'vm1' and 'vm2', and also
between 'vm3' and 'vm4'. However, neither 'vm1' nor 'vm2' can ping 'vm3' or
'vm4', as they are in a different zone with a different service-vlan.


[[pvesdn_setup_example_vxlan]]
VXLAN Setup Example
-------------------

TIP: While we show plain configuration content here, almost everything should
be configurable using the web-interface only.

node1: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet static
    address 192.168.0.1/24
    gateway 192.168.0.254
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    mtu 1500

source /etc/network/interfaces.d/*
----

node2: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet static
    address 192.168.0.2/24
    gateway 192.168.0.254
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    mtu 1500

source /etc/network/interfaces.d/*
----

node3: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet static
    address 192.168.0.3/24
    gateway 192.168.0.254
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    mtu 1500

source /etc/network/interfaces.d/*
----

Create a VXLAN zone named `myvxlanzone'. Use the lower MTU to ensure the extra
50 bytes of the VXLAN header can fit. Add all previously configured IPs from
the nodes as the peer address list.

----
id: myvxlanzone
peers address list: 192.168.0.1,192.168.0.2,192.168.0.3
mtu: 1450
----

Create a VNet named `myvnet1' using the VXLAN zone `myvxlanzone' created
previously.

----
id: myvnet1
zone: myvxlanzone
tag: 100000
----

Apply the configuration on the main SDN web-interface panel to create the VNets
locally on each node.

Create a Debian-based Virtual Machine (vm1) on node1, with a vNIC on `myvnet1'.

Use the following network configuration for this VM, note the lower MTU here.

----
auto eth0
iface eth0 inet static
    address 10.0.3.100/24
    mtu 1450
----

Create a second Virtual Machine (vm2) on node3, with a vNIC on the same VNet
`myvnet1' as vm1.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.3.101/24
    mtu 1450
----

Then, you should be able to ping between 'vm1' and 'vm2'.


[[pvesdn_setup_example_evpn]]
EVPN Setup Example
------------------

node1: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet static
    address 192.168.0.1/24
    gateway 192.168.0.254
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    mtu 1500

source /etc/network/interfaces.d/*
----

node2: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet static
    address 192.168.0.2/24
    gateway 192.168.0.254
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    mtu 1500

source /etc/network/interfaces.d/*
----

node3: /etc/network/interfaces

----
auto vmbr0
iface vmbr0 inet static
    address 192.168.0.3/24
    gateway 192.168.0.254
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    mtu 1500

source /etc/network/interfaces.d/*
----

Create an EVPN controller, using a private ASN number and the above node
addresses as peers. Define 'node1' and 'node2' as gateway nodes.

----
id: myevpnctl
asn: 65000
peers: 192.168.0.1,192.168.0.2,192.168.0.3
gateway nodes: node1,node2
----

Create an EVPN zone named `myevpnzone' using the previously created
EVPN-controller.

----
id: myevpnzone
vrf vxlan tag: 10000
controller: myevpnctl
mtu: 1450
----

Create the first VNet named `myvnet1' using the EVPN zone `myevpnzone', an IPv4
CIDR network and a random MAC address.

----
id: myvnet1
zone: myevpnzone
tag: 11000
ipv4: 10.0.1.1/24
mac address: 8C:73:B2:7B:F9:60 #randomly generated MAC address
----

Create the second VNet named `myvnet2' using the same EVPN zone `myevpnzone', a
different IPv4 CIDR network and a different random MAC address than `myvnet1'.

----
id: myvnet2
zone: myevpnzone
tag: 12000
ipv4: 10.0.2.1/24
mac address: 8C:73:B2:7B:F9:61 #random MAC, needs to be different on each vnet
----

Apply the configuration on the main SDN web-interface panel to create the VNets
locally on each node and generate the FRR config.

Create a Debian-based Virtual Machine (vm1) on node1, with a vNIC on `myvnet1'.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.1.100/24
    # the gateway is the IP of myvnet1
    gateway 10.0.1.1
    mtu 1450
----

Create a second Virtual Machine (vm2) on node2, with a vNIC on the other VNet
`myvnet2'.

Use the following network configuration for this VM:

----
auto eth0
iface eth0 inet static
    address 10.0.2.100/24
    # the gateway is the IP of myvnet2
    gateway 10.0.2.1
    mtu 1450
----

Then, you should be able to ping vm2 from vm1, and vm1 from vm2.

If you ping an external IP from 'vm2' on the non-gateway 'node3', the packet
will go to the configured 'myvnet2' gateway, then will be routed to the gateway
nodes ('node1' or 'node2') and from there it will leave those nodes over the
default gateway configured on node1 or node2.

NOTE: Of course you need to add reverse routes for the '10.0.1.0/24' and
'10.0.2.0/24' networks to node1 and node2 on your external gateway, so that the
public network can reply back.

If you have configured an external BGP router, the BGP-EVPN routes (10.0.1.0/24
and 10.0.2.0/24 in this example) will be announced dynamically.
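
The EVPN example above relies on a few numeric constraints stated earlier in
this chapter: a private ASN, VXLAN IDs within the valid range, a VRF VXLAN tag
that differs from every VNet tag, and a zone MTU that is 50 bytes below the
physical MTU. As a rough sketch, they could be checked like this in Python
before entering the values in the web-interface. The helper names
`is_private_asn` and `check_evpn_plan` are hypothetical and not part of {pve};
the function only inspects the values passed to it and does not read any SDN
configuration.

```python
# Ranges and overhead as stated in this chapter.
PRIVATE_ASN_RANGES = ((64512, 65534), (4200000000, 4294967294))
VXLAN_ID_MIN, VXLAN_ID_MAX = 1, 16777215
VXLAN_OVERHEAD = 50  # bytes


def is_private_asn(asn):
    """Return True if the ASN lies in one of the private ranges."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


def check_evpn_plan(asn, vrf_tag, vnet_tags, zone_mtu, physical_mtu):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not is_private_asn(asn):
        problems.append(f"ASN {asn} is not in a private range")
    for name, tag in [("vrf", vrf_tag), *vnet_tags.items()]:
        if not VXLAN_ID_MIN <= tag <= VXLAN_ID_MAX:
            problems.append(f"tag {tag} of {name} is outside the VXLAN ID range")
    if vrf_tag in vnet_tags.values():
        problems.append("VRF VXLAN tag must differ from all VNet tags")
    if len(set(vnet_tags.values())) != len(vnet_tags):
        problems.append("VNet tags must be unique")
    if zone_mtu > physical_mtu - VXLAN_OVERHEAD:
        problems.append(f"zone MTU must be at most {physical_mtu - VXLAN_OVERHEAD}")
    return problems


# Values taken from the example above.
problems = check_evpn_plan(
    asn=65000,
    vrf_tag=10000,
    vnet_tags={"myvnet1": 11000, "myvnet2": 12000},
    zone_mtu=1450,
    physical_mtu=1500,
)
print(problems or "no problems found")
```

An empty result only means that these particular checks passed. It says nothing
about whether the BGP peers are reachable or whether the external gateway has
the reverse routes mentioned above.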