2 Licensed under the Apache License, Version 2.0 (the "License"); you may
3 not use this file except in compliance with the License. You may obtain
4 a copy of the License at
6 http://www.apache.org/licenses/LICENSE-2.0
8 Unless required by applicable law or agreed to in writing, software
9 distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
10 WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
11 License for the specific language governing permissions and limitations
14 Convention for heading levels in Open vSwitch documentation:
16 ======= Heading 0 (reserved for the title in a document)
22 Avoid deeper levels because they do not render well.
24 ======================
25 Open vSwitch with DPDK
26 ======================
28 This document describes how to build and install Open vSwitch using a DPDK
datapath. Open vSwitch can use the DPDK library to operate entirely in
userspace.
33 The DPDK support of Open vSwitch is considered 'experimental'.
38 In addition to the requirements described in the `installation guide
39 <INSTALL.rst>`__, building Open vSwitch with DPDK will require the following:
- DPDK 16.07

- A `DPDK supported NIC`_

  Only required when physical ports are in use
- A suitable kernel

  On Linux distros running kernel version >= 3.0, only `IOMMU` needs to be
  enabled via the grub cmdline, assuming you are using **VFIO** (see the
  example below). For older kernels, ensure the kernel is built with ``UIO``,
  ``HUGETLBFS``, ``PROC_PAGE_MONITOR``, ``HPET`` and ``HPET_MMAP`` support. If
  these are not present, it will be necessary to upgrade your kernel or build
  a custom kernel with these flags enabled.
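
For example, on a GRUB 2 based distribution this typically means adding the
IOMMU options to the kernel command line and regenerating the GRUB
configuration. The file location and update command below are illustrative
and vary between distributions::

    # /etc/default/grub (location varies by distribution); append to any existing options
    GRUB_CMDLINE_LINUX="iommu=pt intel_iommu=on"

    # Regenerate the GRUB configuration, then reboot for the change to take effect
    $ grub2-mkconfig -o /boot/grub2/grub.cfg    # or 'update-grub' on Debian/Ubuntu
    $ reboot
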
56 Detailed system requirements can be found at `DPDK requirements`_, while more
57 detailed install information can be found in the `advanced installation guide
<../../INSTALL.DPDK-ADVANCED.md>`__.
60 .. _DPDK supported NIC: http://dpdk.org/doc/nics
61 .. _DPDK requirements: http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html
69 1. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``:::

    $ cd /usr/src
    $ wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.07.zip
73 $ unzip dpdk-16.07.zip
74 $ export DPDK_DIR=/usr/src/dpdk-16.07
77 2. Configure and install DPDK
79 Build and install the DPDK library:::
81 $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
82 $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
    $ cd $DPDK_DIR
    $ make install T=$DPDK_TARGET DESTDIR=install
85 If IVSHMEM support is required, use a different target:::
87 $ export DPDK_TARGET=x86_64-ivshmem-linuxapp-gcc
89 .. _DPDK sources: http://dpdk.org/browse/dpdk/refs/

OVS can be installed using different methods. For OVS to use the DPDK
datapath, it has to be configured with DPDK support (``--with-dpdk``).

This section focuses on a generic recipe that suits most cases. For
distribution-specific instructions, refer to one of the more relevant guides.
101 .. _OVS sources: http://openvswitch.org/releases/
103 1. Ensure the standard OVS requirements, described in the `installation guide
104 <INSTALL.rst>`__, are installed.

2. Bootstrap, if required, as described in the `installation guide
   <INSTALL.rst>`__.
109 3. Configure the package using the ``--with-dpdk`` flag:::
111 $ ./configure --with-dpdk=$DPDK_BUILD

   where ``DPDK_BUILD`` is the path to the built DPDK library. This can be
   skipped if the DPDK library is installed in its default location.
117 While ``--with-dpdk`` is required, you can pass any other configuration
118 option described in the `installation guide <INSTALL.rst>`__.

4. Build and install OVS, as described in the `installation guide
   <INSTALL.rst>`__.

Additional information can be found in the `installation guide
<INSTALL.rst>`__.
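
For reference, a complete configure/build/install sequence might look like
the following; this is only a sketch and the parallel job count is
illustrative::

    $ ./configure --with-dpdk=$DPDK_BUILD
    $ make -j4
    $ make install
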
132 Allocate a number of 2M Huge pages:

- For persistent allocation of huge pages, write to the hugepages.conf file
  in ``/etc/sysctl.d``:::
137 $ echo 'vm.nr_hugepages=2048' > /etc/sysctl.d/hugepages.conf
139 - For run-time allocation of huge pages, use the ``sysctl`` utility:::
141 $ sysctl -w vm.nr_hugepages=N # where N = No. of 2M huge pages
143 To verify hugepage configuration:::
145 $ grep HugePages_ /proc/meminfo
147 Mount the hugepages, if not already mounted by default:::

    $ mount -t hugetlbfs none /dev/hugepages
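
To make the hugetlbfs mount persistent across reboots, an entry along the
following lines can be added to ``/etc/fstab`` (a sketch; adjust the mount
point and options to your system)::

    none /dev/hugepages hugetlbfs defaults 0 0
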
153 Setup DPDK devices using VFIO
154 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

VFIO is preferred to the UIO driver when using recent versions of DPDK. VFIO
support requires support from both the kernel and BIOS. For the former, kernel
version > 3.6 must be used. For the latter, you must enable VT-d in the BIOS
and ensure this is configured via grub. To ensure VT-d is enabled via the
BIOS, run:::

162 $ dmesg | grep -e DMAR -e IOMMU
164 If VT-d is not enabled in the BIOS, enable it now.
166 To ensure VT-d is enabled in the kernel, run:::
168 $ cat /proc/cmdline | grep iommu=pt
169 $ cat /proc/cmdline | grep intel_iommu=on
171 If VT-d is not enabled in the kernel, enable it now.
173 Once VT-d is correctly configured, load the required modules and bind the NIC
174 to the VFIO driver:::

    $ modprobe vfio-pci
    $ /usr/bin/chmod a+x /dev/vfio
178 $ /usr/bin/chmod 0666 /dev/vfio/*
179 $ $DPDK_DIR/tools/dpdk-devbind.py --bind=vfio-pci eth1
180 $ $DPDK_DIR/tools/dpdk-devbind.py --status
185 Open vSwitch should be started as described in the `installation guide
186 <INSTALL.rst>`__ with the exception of ovs-vswitchd, which requires some
187 special configuration to enable DPDK functionality. DPDK configuration
188 arguments can be passed to ovs-vswitchd via the ``other_config`` column of the
189 ``Open_vSwitch`` table. At a minimum, the ``dpdk-init`` option must be set to
190 ``true``. For example:::
192 $ export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
193 $ ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
194 $ ovs-vswitchd unix:$DB_SOCK --pidfile --detach
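
To confirm that DPDK was initialised successfully, the ovs-vswitchd log can be
checked for DPDK messages. The log path below assumes a default install from
source and may differ on your system::

    $ grep -i dpdk /usr/local/var/log/openvswitch/ovs-vswitchd.log
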
196 There are many other configuration options, the most important of which are
197 listed below. Defaults will be provided for all values not explicitly set.

``dpdk-init``
  Specifies whether OVS should initialize and support DPDK ports. This is a
  boolean, and defaults to false.

``dpdk-lcore-mask``
  Specifies the CPU cores on which dpdk lcore threads should be spawned and
  expects a hex string (e.g. '0x123').

``dpdk-socket-mem``
  Comma separated list of memory to pre-allocate from hugepages on specific
  sockets.

``dpdk-hugepage-dir``
  Directory where hugetlbfs is mounted.

``vhost-sock-dir``
  Option to set the path to the vhost-user unix socket files.
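
Each of these options is set through the ``other_config`` column in the same
way as ``dpdk-init``. For example (the values shown are purely illustrative,
not recommendations)::

    $ ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
    $ ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-hugepage-dir=/dev/hugepages
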

If allocating more than one GB hugepage (as for IVSHMEM), you can configure the
amount of memory used from any given NUMA node. For example, to use 1GB from
NUMA node 0, run:::
221 $ ovs-vsctl --no-wait set Open_vSwitch . \
222 other_config:dpdk-socket-mem="1024,0"
224 Similarly, if you wish to better scale the workloads across cores, then
multiple pmd threads can be created and pinned to CPU cores by explicitly
226 specifying ``pmd-cpu-mask``. For example, to spawn two pmd threads and pin
227 them to cores 1,2, run:::
229 $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6
231 For details on using ivshmem with DPDK, refer to `the advanced installation
232 guide <../../INSTALL.DPDK-ADVANCED.md>`__.

Refer to ovs-vswitchd.conf.db(5) for additional information on configuration
options.

Changing any of these options requires restarting the ovs-vswitchd daemon.
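
For example, one way to restart the daemon after changing an option is to stop
it with ovs-appctl and start it again, reusing the commands shown above::

    $ ovs-appctl -t ovs-vswitchd exit
    $ ovs-vswitchd unix:$DB_SOCK --pidfile --detach
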
244 Creating bridges and ports
245 ~~~~~~~~~~~~~~~~~~~~~~~~~~
247 You can now use ovs-vsctl to set up bridges and other Open vSwitch features.
248 Bridges should be created with a ``datapath_type=netdev``:::
250 $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
252 Now you can add DPDK devices. OVS expects DPDK device names to start with
253 ``dpdk`` and end with a portid. ovs-vswitchd should print the number of dpdk
254 devices found in the log file:::
256 $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
257 $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

After the DPDK ports get added to the switch, a polling thread continuously
polls DPDK devices and consumes 100% of the core, as can be checked from the
'top' and 'ps' commands:::

    $ top -H
    $ ps -eLo pid,psr,comm | grep pmd

Creating bonds of DPDK interfaces is slightly different to creating bonds of
system interfaces. For DPDK, the interface type must be explicitly set. For
example:::

270 $ ovs-vsctl add-bond br0 dpdkbond dpdk0 dpdk1 \
271 -- set Interface dpdk0 type=dpdk \
272 -- set Interface dpdk1 type=dpdk

To stop ovs-vswitchd and delete the bridge, run:::

276 $ ovs-appctl -t ovs-vswitchd exit
277 $ ovs-appctl -t ovsdb-server exit
278 $ ovs-vsctl del-br br0
280 PMD thread statistics
281 ~~~~~~~~~~~~~~~~~~~~~
283 To show current stats:::
285 $ ovs-appctl dpif-netdev/pmd-stats-show
287 To clear previous stats:::
289 $ ovs-appctl dpif-netdev/pmd-stats-clear

Port/rxq assignment to PMD threads
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
294 To show port/rxq assignment:::
296 $ ovs-appctl dpif-netdev/pmd-rxq-show

To change the default rxq assignment to pmd threads, rxqs may be manually pinned to
299 desired cores using:::
301 $ ovs-vsctl set Interface <iface> \
302 other_config:pmd-rxq-affinity=<rxq-affinity-list>
306 - ``<rxq-affinity-list>`` ::= ``NULL`` | ``<non-empty-list>``
307 - ``<non-empty-list>`` ::= ``<affinity-pair>`` |
308 ``<affinity-pair>`` , ``<non-empty-list>``
309 - ``<affinity-pair>`` ::= ``<queue-id>`` : ``<core-id>``

For example:::

    $ ovs-vsctl set interface dpdk0 options:n_rxq=4 \
        other_config:pmd-rxq-affinity="0:3,1:7,3:8"

This will ensure:

318 - Queue #0 pinned to core 3
319 - Queue #1 pinned to core 7
320 - Queue #2 not pinned
321 - Queue #3 pinned to core 8

After that, PMD threads on cores where RX queues were pinned will become
``isolated``. This means that these threads will poll only pinned RX queues.

If there are no ``non-isolated`` PMD threads, ``non-pinned`` RX queues will
not be polled. Also, if the provided ``core_id`` is not available (e.g. the
``core_id`` is not in ``pmd-cpu-mask``), the RX queue will not be polled by
any PMD thread.
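
After changing the affinity, the resulting queue-to-thread mapping, including
which PMD threads are now isolated, can be checked by re-running::

    $ ovs-appctl dpif-netdev/pmd-rxq-show
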
332 .. _dpdk-guest-setup:

The DPDK 'testpmd' application can be run in the guest VM for high speed packet
forwarding between vhostuser ports. DPDK and the testpmd application have to be
compiled on the guest VM. Below are the steps for setting up the testpmd
application in the VM. More information on the vhostuser ports can be found in
341 the `advanced install guide <../../INSTALL.DPDK-ADVANCED.md>`__.
344 Support for DPDK in the guest requires QEMU >= 2.2.0.
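
The version of the installed QEMU binary can be checked with::

    $ qemu-system-x86_64 --version
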

To begin, instantiate the guest:::

    $ export VM_NAME=Centos-vm
    $ export GUEST_MEM=3072M
349 $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
350 $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
352 $ qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
353 -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
354 -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
355 -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
356 -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
357 -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
358 -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
359 -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
360 -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
      -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off

Download the DPDK sources to the VM and build DPDK:::

    $ mkdir -p /root/dpdk && cd /root/dpdk
    $ wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.07.zip
367 $ unzip dpdk-16.07.zip
368 $ export DPDK_DIR=/root/dpdk/dpdk-16.07
369 $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
370 $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
    $ cd $DPDK_DIR
    $ make install T=$DPDK_TARGET DESTDIR=install
374 Build the test-pmd application:::

    $ cd $DPDK_DIR/app/test-pmd
    $ export RTE_SDK=$DPDK_DIR
    $ export RTE_TARGET=$DPDK_TARGET
    $ make
381 Setup huge pages and DPDK devices using UIO:::
383 $ sysctl vm.nr_hugepages=1024
384 $ mkdir -p /dev/hugepages
385 $ mount -t hugetlbfs hugetlbfs /dev/hugepages # only if not already mounted

    $ modprobe uio
    $ insmod $DPDK_BUILD/kmod/igb_uio.ko
388 $ $DPDK_DIR/tools/dpdk-devbind.py --status
389 $ $DPDK_DIR/tools/dpdk-devbind.py -b igb_uio 00:03.0 00:04.0

The PCI IDs of the vhost ports can be retrieved using::

395 lspci | grep Ethernet

Below are a few test cases and the list of steps to be followed. Before
beginning, ensure a userspace bridge has been created and two DPDK ports
added:::

403 $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
404 $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
405 $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

PHY-PHY
~~~~~~~

Add test flows to forward packets between DPDK port 0 and port 1:::

412 # Clear current flows
413 $ ovs-ofctl del-flows br0
415 # Add flows between port 1 (dpdk0) to port 2 (dpdk1)
416 $ ovs-ofctl add-flow br0 in_port=1,action=output:2
417 $ ovs-ofctl add-flow br0 in_port=2,action=output:1
419 Transmit traffic into either port. You should see it returned via the other.
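
The flow statistics can be used to confirm that the loopback is working; the
``n_packets`` and ``n_bytes`` counters for each flow should increase as
traffic passes::

    $ ovs-ofctl dump-flows br0
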
421 PHY-VM-PHY (vhost loopback)
422 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
424 Add two ``dpdkvhostuser`` ports to bridge ``br0``:::
426 $ ovs-vsctl add-port br0 dpdkvhostuser0 \
427 -- set Interface dpdkvhostuser0 type=dpdkvhostuser
428 $ ovs-vsctl add-port br0 dpdkvhostuser1 \
429 -- set Interface dpdkvhostuser1 type=dpdkvhostuser

Add test flows to forward packets between DPDK devices and VM ports:::

433 # Clear current flows
434 $ ovs-ofctl del-flows br0
437 $ ovs-ofctl add-flow br0 in_port=1,action=output:3
438 $ ovs-ofctl add-flow br0 in_port=3,action=output:1
439 $ ovs-ofctl add-flow br0 in_port=4,action=output:2
440 $ ovs-ofctl add-flow br0 in_port=2,action=output:4
443 $ ovs-ofctl dump-flows br0
445 Create a VM using the following configuration:

+-----------------------+---------+-----------------+
| configuration         | values  | comments        |
+-----------------------+---------+-----------------+
| qemu version          | 2.2.0   | n/a             |
+-----------------------+---------+-----------------+
| qemu thread affinity  | core 5  | taskset 0x20    |
+-----------------------+---------+-----------------+
| memory                | 4GB     | n/a             |
+-----------------------+---------+-----------------+
| Qcow2 image           | CentOS7 | n/a             |
+-----------------------+---------+-----------------+
| mrg_rxbuf             | off     | n/a             |
+-----------------------+---------+-----------------+

You can do this directly with QEMU via the ``qemu-system-x86_64``
application:::

461 $ export VM_NAME=vhost-vm
462 $ export GUEST_MEM=3072M
463 $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
464 $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
466 $ taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
467 -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
468 -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
469 -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
470 -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
471 -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
472 -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
473 -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
474 -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
475 -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off
477 Alternatively, you can configure the guest using libvirt. Below is an XML
configuration for a 'demovm' guest that can be instantiated using ``virsh``:::
482 <uuid>4a9b3f53-fa2a-47f3-a757-dd87720d9d1d</uuid>
483 <memory unit='KiB'>4194304</memory>
484 <currentMemory unit='KiB'>4194304</currentMemory>
487 <page size='2' unit='M' nodeset='0'/>
490 <vcpu placement='static'>2</vcpu>
492 <shares>4096</shares>
493 <vcpupin vcpu='0' cpuset='4'/>
494 <vcpupin vcpu='1' cpuset='5'/>
495 <emulatorpin cpuset='4,5'/>
498 <type arch='x86_64' machine='pc'>hvm</type>
505 <cpu mode='host-model'>
506 <model fallback='allow'/>
507 <topology sockets='2' cores='1' threads='1'/>
509 <cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
512 <on_poweroff>destroy</on_poweroff>
513 <on_reboot>restart</on_reboot>
514 <on_crash>destroy</on_crash>
516 <emulator>/usr/bin/qemu-kvm</emulator>
517 <disk type='file' device='disk'>
518 <driver name='qemu' type='qcow2' cache='none'/>
519 <source file='/root/CentOS7_x86_64.qcow2'/>
520 <target dev='vda' bus='virtio'/>
522 <disk type='dir' device='disk'>
523 <driver name='qemu' type='fat'/>
524 <source dir='/usr/src/dpdk-16.07'/>
525 <target dev='vdb' bus='virtio'/>
528 <interface type='vhostuser'>
529 <mac address='00:00:00:00:00:01'/>
530 <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser0' mode='client'/>
531 <model type='virtio'/>
533 <host mrg_rxbuf='off'/>
536 <interface type='vhostuser'>
537 <mac address='00:00:00:00:00:02'/>
538 <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser1' mode='client'/>
539 <model type='virtio'/>
541 <host mrg_rxbuf='off'/>
548 <target type='serial' port='0'/>

Once the guest is configured and booted, configure DPDK packet forwarding
within the guest. To accomplish this, DPDK and the testpmd application must
first be compiled on the VM as described in **Guest Setup**. Once compiled, run
the ``test-pmd`` application:::

558 $ cd $DPDK_DIR/app/test-pmd;
559 $ ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- \
560 --burst=64 -i --txqflags=0xf00 --disable-hw-vlan
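
Once testpmd presents its interactive prompt, packet forwarding can be
started. A minimal sketch of the testpmd commands is::

    testpmd> set fwd mac retry
    testpmd> start
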

When you finish testing, bind the vNICs back to the kernel:::

566 $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:03.0
567 $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:04.0

The appropriate PCI IDs must be passed in the above example. The PCI IDs can be
retrieved using:::

573 $ $DPDK_DIR/tools/dpdk-devbind.py --status
576 More information on the dpdkvhostuser ports can be found in the `advanced
577 installation guide <../../INSTALL.DPDK-ADVANCED.md>`__.
579 PHY-VM-PHY (IVSHMEM loopback)
580 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
582 Refer to the `advanced installation guide <../../INSTALL.DPDK-ADVANCED.md>`__.

Limitations
-----------

- Currently DPDK ports do not use HW offload functionality.
588 - Network Interface Firmware requirements: Each release of DPDK is
589 validated against a specific firmware version for a supported Network
590 Interface. New firmware versions introduce bug fixes, performance
591 improvements and new functionality that DPDK leverages. The validated
592 firmware versions are available as part of the release notes for
593 DPDK. It is recommended that users update Network Interface firmware
594 to match what has been validated for the DPDK release.

The latest list of validated firmware versions can be found in the `DPDK
release notes`_.
599 .. _DPDK release notes: http://dpdk.org/doc/guides/rel_notes/release_16.07.html

Reporting Bugs
--------------

Please report problems to bugs@openvswitch.org.