================================

1. [Overview](#overview)
2. [Building and Installation](#build)
3. [Setup OVS with DPDK datapath](#ovssetup)
4. [DPDK in the VM](#builddpdk)
5. [OVS Testcases](#ovstc)
6. [Limitations](#ovslimits)

## <a name="overview"></a> 1. Overview

Open vSwitch can use the DPDK library to operate entirely in userspace.
This file provides information on the installation and use of Open vSwitch
with the DPDK datapath. This version of Open vSwitch should be built manually
with `configure` and `make`.

The DPDK support of Open vSwitch is considered 'experimental'.

* Required: DPDK 16.07
* Hardware: [DPDK Supported NICs] when physical ports are in use

## <a name="build"></a> 2. Building and Installation

### 2.1 Configure & build the Linux kernel

On Linux distributions running kernel version >= 3.0, a kernel rebuild is not
required; only the GRUB command line needs to be updated to enable the IOMMU
(see section 3.2 for VFIO support). For older kernels, check that the kernel is
built with UIO, HUGETLBFS, PROC_PAGE_MONITOR, HPET and HPET_MMAP support.

Detailed system requirements can be found at [DPDK requirements]; see also the
advanced install guide [INSTALL.DPDK-ADVANCED.md].

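For older kernels, the check described above could be scripted roughly as follows. This is a minimal sketch, not part of the OVS or DPDK tooling: the function name `check_dpdk_kernel_config` is hypothetical, and the default path `/boot/config-$(uname -r)` is an assumption that does not hold on every distribution (some expose `/proc/config.gz` instead).

```shell
#!/bin/sh
# Report which of the kernel options named above are absent from a kernel
# config file. The function name and the default config path are assumptions.
check_dpdk_kernel_config() {
    config_file=${1:-/boot/config-$(uname -r)}
    missing=""
    for opt in UIO HUGETLBFS PROC_PAGE_MONITOR HPET HPET_MMAP; do
        # Accept both built-in (=y) and modular (=m) options.
        if ! grep -Eq "^CONFIG_${opt}=(y|m)$" "$config_file" 2>/dev/null; then
            missing="$missing $opt"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Usage (reads the running kernel's config if no path is given):
#   check_dpdk_kernel_config /boot/config-3.10.0
```

The function prints `ok` when every option is present and otherwise lists the missing ones, so it can be used as a pre-flight check before building DPDK.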
1. [Download DPDK] and extract the file, for example into /usr/src

        wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.07.zip
        export DPDK_DIR=/usr/src/dpdk-16.07

2. Configure and install DPDK

   Build and install the DPDK library.

        export DPDK_TARGET=x86_64-native-linuxapp-gcc
        export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
        make install T=$DPDK_TARGET DESTDIR=install

   Note: For IVSHMEM, set `export DPDK_TARGET=x86_64-ivshmem-linuxapp-gcc`.

OVS can be installed using different methods. For OVS to use the DPDK datapath,
it has to be configured with DPDK support, which is done with
`./configure --with-dpdk`. This section focuses on a generic recipe that suits
most cases; for distribution-specific instructions, refer to
[INSTALL.Fedora.md], [INSTALL.RHEL.md] and [INSTALL.Debian.md].

The OVS sources can be downloaded in different ways; skip this section if you
already have the correct sources. Otherwise, download the correct version using
one of the methods suggested below and follow the documentation of that specific
version.

- OVS stable releases can be downloaded in compressed format from [Download OVS]

        wget http://openvswitch.org/releases/openvswitch-<version>.tar.gz
        tar -zxvf openvswitch-<version>.tar.gz
        export OVS_DIR=/usr/src/openvswitch-<version>

- OVS current development can be cloned using the 'git' tool

        git clone https://github.com/openvswitch/ovs.git
        export OVS_DIR=/usr/src/ovs

- Install OVS dependencies

  GNU make, GCC 4.x (or) Clang 3.4, libnuma (Mandatory)
  libssl, libcap-ng, Python 2.7 (Optional)
  More information can be found at [Build Requirements]

- Configure, Install OVS

        ./configure --with-dpdk=$DPDK_BUILD

  Note: Passing DPDK_BUILD can be skipped if the DPDK library is installed in
  standard locations, i.e. `./configure --with-dpdk` should suffice.

Additional information can be found in [INSTALL.md].

## <a name="ovssetup"></a> 3. Setup OVS with DPDK datapath

### 3.1 Setup Hugepages

Allocate and mount 2M huge pages:

- For persistent allocation of huge pages, write to the hugepages.conf file

  `echo 'vm.nr_hugepages=2048' > /etc/sysctl.d/hugepages.conf`

- For run-time allocation of huge pages

  `sysctl -w vm.nr_hugepages=N` where N = number of 2M huge pages to allocate

- To verify the hugepage configuration

  `grep HugePages_ /proc/meminfo`

- Mount hugepages, if not already mounted by default

  `mount -t hugetlbfs none /dev/hugepages`

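To pick a value for N, the page count can be derived from the amount of memory to reserve. The helpers below are a rough sketch; `hugepages_for_mb` and `hugepages_free` are hypothetical names, not OVS or DPDK commands, and the page size defaults to the 2M pages used in this section.

```shell
#!/bin/sh
# Number of hugepages needed to back a given amount of memory (in MB),
# rounded up. The page size defaults to 2 MB.
hugepages_for_mb() {
    mem_mb=$1
    page_mb=${2:-2}
    echo $(( (mem_mb + page_mb - 1) / page_mb ))
}

# Print the HugePages_Free counter from a meminfo-style file
# (defaults to /proc/meminfo).
hugepages_free() {
    awk '/^HugePages_Free:/ {print $2}' "${1:-/proc/meminfo}"
}

# Usage:
#   sysctl -w vm.nr_hugepages="$(hugepages_for_mb 4096)"
#   hugepages_free
```

With 2M pages, reserving 4096 MB corresponds to the `vm.nr_hugepages=2048` value used in the persistent example above.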
### 3.2 Setup DPDK devices using VFIO

- Supported with kernel version >= 3.6
- VFIO needs support from both the BIOS and the kernel.
- BIOS: enable VT-d; this can be verified from the `dmesg | grep -e DMAR -e IOMMU` output
- Kernel command line (GRUB): add `iommu=pt intel_iommu=on`; this can be verified from the `cat /proc/cmdline` output
- Load modules and bind the NIC to the VFIO driver

        sudo /usr/bin/chmod a+x /dev/vfio
        sudo /usr/bin/chmod 0666 /dev/vfio/*
        $DPDK_DIR/tools/dpdk-devbind.py --bind=vfio-pci eth1
        $DPDK_DIR/tools/dpdk-devbind.py --status

Note: If running a kernel < 3.6, UIO drivers must be used instead;
see [DPDK in the VM], "DPDK devices using UIO" section, for the steps.

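The kernel version rule above can be expressed as a small helper. This is only a sketch: `kernel_at_least` and `pick_dpdk_driver` are hypothetical names, `sort -V` is assumed to be available (GNU coreutils), and the check covers the kernel version only, not the BIOS and IOMMU settings that VFIO also needs.

```shell
#!/bin/sh
# Succeed if the kernel version (second argument, default: running kernel)
# is at least the required version (first argument).
kernel_at_least() {
    required=$1
    current=${2:-$(uname -r)}
    # Drop any distribution suffix such as "-327.el7.x86_64".
    current=${current%%-*}
    lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Suggest a driver based on the kernel version rule from this section.
pick_dpdk_driver() {
    if kernel_at_least 3.6 "$1"; then
        echo vfio-pci
    else
        echo igb_uio
    fi
}

# Usage:
#   pick_dpdk_driver              # uses the running kernel
#   pick_dpdk_driver 3.2.0-4-amd64
```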
1. DB creation (One time step)

        mkdir -p /usr/local/etc/openvswitch
        mkdir -p /usr/local/var/run/openvswitch
        rm -f /usr/local/etc/openvswitch/conf.db
        ovsdb-tool create /usr/local/etc/openvswitch/conf.db \
            /usr/local/share/openvswitch/vswitch.ovsschema

2. Start ovsdb-server

   Without SSL support:

        ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
            --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
            --pidfile --detach

   With SSL support:

        ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
            --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
            --private-key=db:Open_vSwitch,SSL,private_key \
            --certificate=db:Open_vSwitch,SSL,certificate \
            --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach

3. Initialize DB (One time step)

        ovs-vsctl --no-wait init

DPDK configuration arguments can be passed to vswitchd via the Open_vSwitch
'other_config' column. The important configuration options are listed below.
Defaults will be provided for all values not explicitly set. Refer to
ovs-vswitchd.conf.db(5) for additional information on configuration options.

- Specifies whether OVS should initialize and support DPDK ports. This is
  a boolean, and defaults to false.

- Specifies the CPU cores on which dpdk lcore threads should be spawned and
  expects a hex string (e.g. '0x123').

- Comma separated list of memory to pre-allocate from hugepages on specific
  sockets.

- Directory where hugetlbfs is mounted.

- Option to set the path to the vhost_user unix socket files.

NOTE: Changing any of these options requires restarting ovs-vswitchd.

Open vSwitch can be started as normal. DPDK will be initialized as long
as the dpdk-init option has been set to 'true'.

    export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
    ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
    ovs-vswitchd unix:$DB_SOCK --pidfile --detach

If more than one GB hugepage has been allocated (as for IVSHMEM), set the
amount and use NUMA node 0 memory. For details on using ivshmem with DPDK,
refer to [OVS Testcases].

    ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,0"
    ovs-vswitchd unix:$DB_SOCK --pidfile --detach

To better scale the workloads across cores, multiple pmd threads can be
created and pinned to CPU cores by explicitly specifying pmd-cpu-mask.
For example, to spawn 2 pmd threads and pin them to cores 1 and 2:

    ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6

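The mask value is a hex bitmap in which bit N selects core N, so cores 1 and 2 give binary 110, i.e. 6. A small sketch of that arithmetic follows; `cores_to_mask` is a hypothetical helper name, not an OVS command.

```shell
#!/bin/sh
# Build a hex CPU mask from a list of core ids: bit N is set for core N.
cores_to_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '%x\n' "$mask"
}

# Usage:
#   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask="$(cores_to_mask 1 2)"
```

The same arithmetic applies to other hex core masks in this guide, such as the `taskset` mask used when pinning QEMU.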
5. Create bridge & add DPDK devices

   Create a bridge with datapath_type "netdev" in the configuration database

   `ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev`

   Now you can add DPDK devices. OVS expects DPDK device names to start with
   "dpdk" and end with a portid. vswitchd should print (in the log file) the
   number of dpdk devices found.

        ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
        ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

   After the DPDK ports are added to the switch, a polling thread continuously
   polls the DPDK devices and consumes 100% of the core, as can be checked with
   the 'top' and 'ps' commands.

        ps -eLo pid,psr,comm | grep pmd

   Note: creating bonds of DPDK interfaces is slightly different to creating
   bonds of system interfaces. For DPDK, the interface type must be explicitly
   set to 'dpdk'.

        ovs-vsctl add-bond br0 dpdkbond dpdk0 dpdk1 -- set Interface dpdk0 type=dpdk -- set Interface dpdk1 type=dpdk

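The naming rule for DPDK devices stated in this step ("dpdk" followed by a portid) can be checked before calling `ovs-vsctl add-port`. The sketch below assumes the portid is a plain decimal number directly after the prefix; `is_dpdk_port_name` is a hypothetical helper, not part of OVS.

```shell
#!/bin/sh
# Succeed if the name is "dpdk" followed only by decimal digits,
# e.g. dpdk0 or dpdk12. Assumes the portid is a decimal integer.
is_dpdk_port_name() {
    case $1 in
        dpdk*[!0-9]*|dpdk) return 1 ;;  # non-digit after prefix, or no portid
        dpdk[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   is_dpdk_port_name dpdk0 && ovs-vsctl add-port br0 dpdk0 \
#       -- set Interface dpdk0 type=dpdk
```

Names such as `dpdkvhostuser0` are rejected by this check because they belong to a different interface type.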
6. PMD thread statistics

        # Check current stats
        ovs-appctl dpif-netdev/pmd-stats-show

        # Clear previous stats
        ovs-appctl dpif-netdev/pmd-stats-clear

7. Port/rxq assignment to PMD threads

        # Show port/rxq assignment
        ovs-appctl dpif-netdev/pmd-rxq-show

   To change the default rxq assignment to pmd threads, rxqs may be manually
   pinned to desired cores using:

        ovs-vsctl set Interface <iface> \
            other_config:pmd-rxq-affinity=<rxq-affinity-list>

   where:

        <rxq-affinity-list> ::= NULL | <non-empty-list>
        <non-empty-list> ::= <affinity-pair> |
                             <affinity-pair> , <non-empty-list>
        <affinity-pair> ::= <queue-id> : <core-id>

   Example:

        ovs-vsctl set interface dpdk0 options:n_rxq=4 \
            other_config:pmd-rxq-affinity="0:3,1:7,3:8"

        Queue #0 pinned to core 3;
        Queue #1 pinned to core 7;
        Queue #2 not pinned;
        Queue #3 pinned to core 8;

   After that, PMD threads on cores where RX queues were pinned will become
   `isolated`. This means that such a thread will poll only pinned RX queues.

   WARNING: If there are no `non-isolated` PMD threads, `non-pinned` RX queues
   will not be polled. Also, if the provided `core_id` is not available (e.g.
   this `core_id` is not in `pmd-cpu-mask`), the RX queue will not be polled by
   any PMD thread.

   Isolation of PMD threads can also be checked using the
   `ovs-appctl dpif-netdev/pmd-rxq-show` command.

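To see how an affinity list maps onto queues before applying it, the list can be expanded queue by queue. This is an illustrative sketch only: `describe_rxq_affinity` is a hypothetical helper that interprets the `<queue-id>:<core-id>` pairs locally and does not query OVS.

```shell
#!/bin/sh
# Print, for each of n_rxq queues, the core it is pinned to according to
# an affinity list of the form "<queue-id>:<core-id>,...".
describe_rxq_affinity() {
    affinity=$1
    n_rxq=$2
    q=0
    while [ "$q" -lt "$n_rxq" ]; do
        core=""
        # Split the list on commas, then each pair on the colon.
        for pair in $(printf '%s' "$affinity" | tr ',' ' '); do
            if [ "${pair%%:*}" = "$q" ]; then
                core=${pair##*:}
            fi
        done
        if [ -n "$core" ]; then
            echo "queue $q: pinned to core $core"
        else
            echo "queue $q: not pinned"
        fi
        q=$(( q + 1 ))
    done
}

# Usage, with the affinity list from the example in this step:
#   describe_rxq_affinity "0:3,1:7,3:8" 4
```

Queues reported as not pinned are the ones that depend on a `non-isolated` PMD thread being available, as described in the warning above.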
8. Stop vswitchd & Delete bridge

        ovs-appctl -t ovs-vswitchd exit
        ovs-appctl -t ovsdb-server exit

## <a name="builddpdk"></a> 4. DPDK in the VM

The DPDK 'testpmd' application can be run in the guest VM for high speed
packet forwarding between vhostuser ports. DPDK and the testpmd application
have to be compiled on the guest VM. Below are the steps for setting up
the testpmd application in the VM. More information on the vhostuser ports
can be found in [Vhost Walkthrough].

* Instantiate the Guest

  Requires QEMU version >= 2.2.0.

        export VM_NAME=Centos-vm
        export GUEST_MEM=3072M
        export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
        export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch

        qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm -m $GUEST_MEM \
            -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
            -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
            -drive file=$QCOW2_IMAGE \
            -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
            -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
            -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
            -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
            -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
            -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off \
            --nographic -snapshot

* Download the DPDK sources to the VM and build DPDK

        wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.07.zip
        export DPDK_DIR=/root/dpdk/dpdk-16.07
        export DPDK_TARGET=x86_64-native-linuxapp-gcc
        export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
        make install T=$DPDK_TARGET DESTDIR=install

* Build the test-pmd application

        export RTE_SDK=$DPDK_DIR
        export RTE_TARGET=$DPDK_TARGET

* Setup huge pages and DPDK devices using UIO

        sysctl vm.nr_hugepages=1024
        mkdir -p /dev/hugepages
        mount -t hugetlbfs hugetlbfs /dev/hugepages  # only if not already mounted
        insmod $DPDK_BUILD/kmod/igb_uio.ko
        $DPDK_DIR/tools/dpdk-devbind.py --status
        $DPDK_DIR/tools/dpdk-devbind.py -b igb_uio 00:03.0 00:04.0

  The PCI IDs of the vhost ports can be retrieved with `lspci | grep Ethernet`.

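The PCI IDs for the bind command can be pulled out of the `lspci` output rather than copied by hand. A minimal sketch, assuming the usual `lspci` layout in which the PCI address is the first field of each line; `ethernet_pci_ids` is a hypothetical helper name.

```shell
#!/bin/sh
# Read lspci-style output on stdin and print the PCI address (first field)
# of every line that mentions an Ethernet device.
ethernet_pci_ids() {
    awk '/Ethernet/ {print $1}'
}

# Usage inside the guest:
#   lspci | ethernet_pci_ids
#   $DPDK_DIR/tools/dpdk-devbind.py -b igb_uio $(lspci | ethernet_pci_ids)
```

Note that this selects every Ethernet device in the guest, so any interface that should stay with the kernel (for example a management NIC) needs to be filtered out first.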
## <a name="ovstc"></a> 5. OVS Testcases

Below are a few test cases and the steps to follow for each.

Steps 1-5 in section 3.3 create and initialize the DB, start vswitchd and
add DPDK devices to bridge 'br0'.

1. Add test flows to forward packets between DPDK port 0 and port 1

        # Clear current flows
        ovs-ofctl del-flows br0

        # Add flows between port 1 (dpdk0) and port 2 (dpdk1)
        ovs-ofctl add-flow br0 in_port=1,action=output:2
        ovs-ofctl add-flow br0 in_port=2,action=output:1

### 5.2 PHY-VM-PHY [VHOST LOOPBACK]

Steps 1-5 in section 3.3 create and initialize the DB, start vswitchd and
add DPDK devices to bridge 'br0'.

1. Add dpdkvhostuser ports to bridge 'br0'. More information on the
   dpdkvhostuser ports can be found in [Vhost Walkthrough].

        ovs-vsctl add-port br0 dpdkvhostuser0 -- set Interface dpdkvhostuser0 type=dpdkvhostuser
        ovs-vsctl add-port br0 dpdkvhostuser1 -- set Interface dpdkvhostuser1 type=dpdkvhostuser

2. Add test flows to forward packets between DPDK devices and VM ports

        # Clear current flows
        ovs-ofctl del-flows br0

        ovs-ofctl add-flow br0 in_port=1,action=output:3
        ovs-ofctl add-flow br0 in_port=3,action=output:1
        ovs-ofctl add-flow br0 in_port=4,action=output:2
        ovs-ofctl add-flow br0 in_port=2,action=output:4

        ovs-ofctl dump-flows br0

3. Instantiate Guest VM using Qemu cmdline

   | configuration        | values  | comments        |
   |----------------------|---------|-----------------|
   | qemu version         | 2.2.0   |                 |
   | qemu thread affinity | core 5  | taskset 0x20    |
   | Qcow2 image          | CentOS7 | -               |
   | mrg_rxbuf            | off     | -               |

        export VM_NAME=vhost-vm
        export GUEST_MEM=3072M
        export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
        export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch

        taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm -m $GUEST_MEM \
            -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
            -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
            -drive file=$QCOW2_IMAGE \
            -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
            -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
            -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
            -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
            -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
            -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off \
            --nographic -snapshot

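The `taskset 0x20` mask in this step is a hex bitmap of cores: bit 5 is set, which is why the table lists core 5. The sketch below decodes such a mask back into core ids; `mask_to_cores` is a hypothetical helper, handy for checking that the QEMU core does not collide with the cores selected for pmd threads.

```shell
#!/bin/sh
# Decode a CPU mask (hex with 0x prefix, or decimal) into the list of
# core ids whose bits are set.
mask_to_cores() {
    mask=$(( $1 ))
    core=0
    cores=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            cores="$cores $core"
        fi
        mask=$(( mask >> 1 ))
        core=$(( core + 1 ))
    done
    # Unquoted on purpose: collapses the leading space in the list.
    echo $cores
}

# Usage:
#   mask_to_cores 0x20    # QEMU affinity from this step
#   mask_to_cores 0x6     # a pmd-cpu-mask of 6
```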
4. Guest VM using libvirt

   Below is a simple XML configuration of the 'demovm' guest that can be
   instantiated using 'virsh'. The guest uses a pair of vhostuser ports and
   boots with 4GB RAM and 2 cores. More information can be found in
   [Vhost Walkthrough].

        <uuid>4a9b3f53-fa2a-47f3-a757-dd87720d9d1d</uuid>
        <memory unit='KiB'>4194304</memory>
        <currentMemory unit='KiB'>4194304</currentMemory>
        <page size='2' unit='M' nodeset='0'/>
        <vcpu placement='static'>2</vcpu>
        <shares>4096</shares>
        <vcpupin vcpu='0' cpuset='4'/>
        <vcpupin vcpu='1' cpuset='5'/>
        <emulatorpin cpuset='4,5'/>
        <type arch='x86_64' machine='pc'>hvm</type>
        <cpu mode='host-model'>
        <model fallback='allow'/>
        <topology sockets='2' cores='1' threads='1'/>
        <cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
        <on_poweroff>destroy</on_poweroff>
        <on_reboot>restart</on_reboot>
        <on_crash>destroy</on_crash>
        <emulator>/usr/bin/qemu-kvm</emulator>
        <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2' cache='none'/>
        <source file='/root/CentOS7_x86_64.qcow2'/>
        <target dev='vda' bus='virtio'/>
        <disk type='dir' device='disk'>
        <driver name='qemu' type='fat'/>
        <source dir='/usr/src/dpdk-16.07'/>
        <target dev='vdb' bus='virtio'/>
        <interface type='vhostuser'>
        <mac address='00:00:00:00:00:01'/>
        <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser0' mode='client'/>
        <model type='virtio'/>
        <host mrg_rxbuf='off'/>
        <interface type='vhostuser'>
        <mac address='00:00:00:00:00:02'/>
        <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser1' mode='client'/>
        <model type='virtio'/>
        <host mrg_rxbuf='off'/>
        <target type='serial' port='0'/>

5. DPDK Packet forwarding in Guest VM

   To accomplish this, DPDK and the testpmd application first have to be
   compiled on the VM; the steps are listed in [DPDK in the VM].

   * Run test-pmd application

            cd $DPDK_DIR/app/test-pmd;
            ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- --burst=64 -i --txqflags=0xf00 --disable-hw-vlan

   * Bind vNIC back to kernel once the test is completed.

            $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:03.0
            $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:04.0

     Note: Pass the appropriate PCI IDs in the example above. The PCI IDs can
     be retrieved using the `$DPDK_DIR/tools/dpdk-devbind.py --status` command.

### 5.3 PHY-VM-PHY [IVSHMEM]

The steps for setting up IVSHMEM are covered in section 5.2 (PVP - IVSHMEM)
of [OVS Testcases] in the advanced install guide.

## <a name="ovslimits"></a> 6. Limitations

- Only MTU size 1500 is supported; MTU configuration for DPDK netdevs is
  planned for a future OVS release.
- Currently DPDK ports do not use HW offload functionality.
- Network Interface Firmware requirements:
  Each release of DPDK is validated against a specific firmware version for
  a supported Network Interface. New firmware versions introduce bug fixes,
  performance improvements and new functionality that DPDK leverages. The
  validated firmware versions are available as part of the release notes for
  DPDK. It is recommended that users update Network Interface firmware to
  match what has been validated for the DPDK release.

  For DPDK 16.07, the list of validated firmware versions can be found at:

  http://dpdk.org/doc/guides/rel_notes/release_16.07.html

Please report problems to bugs@openvswitch.org.

[DPDK requirements]: http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html
[Download DPDK]: http://dpdk.org/browse/dpdk/refs/
[Download OVS]: http://openvswitch.org/releases/
[DPDK Supported NICs]: http://dpdk.org/doc/nics
[Build Requirements]: https://github.com/openvswitch/ovs/blob/master/INSTALL.md#build-requirements
[INSTALL.DPDK-ADVANCED.md]: INSTALL.DPDK-ADVANCED.md
[OVS Testcases]: INSTALL.DPDK-ADVANCED.md#ovstc
[Vhost Walkthrough]: INSTALL.DPDK-ADVANCED.md#vhost
[DPDK in the VM]: INSTALL.DPDK.md#builddpdk
[INSTALL.md]: INSTALL.md
[INSTALL.Fedora.md]: INSTALL.Fedora.md
[INSTALL.RHEL.md]: INSTALL.RHEL.md
[INSTALL.Debian.md]: INSTALL.Debian.md