..
      Licensed under the Apache License, Version 2.0 (the "License"); you may
      not use this file except in compliance with the License. You may obtain
      a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      License for the specific language governing permissions and limitations
      under the License.

      Convention for heading levels in Open vSwitch documentation:

      =======  Heading 0 (reserved for the title in a document)

      Avoid deeper levels because they do not render well.

============================
Using Open vSwitch with DPDK
============================

This document describes how to use Open vSwitch with DPDK.

Using DPDK with OVS requires configuring OVS at build time to use
the DPDK library. The version of DPDK that OVS supports varies
from one OVS release to another, as described in the :doc:`releases
FAQ </faq/releases>`. For build instructions refer to
:doc:`/intro/install/dpdk`.

ovs-vsctl can be used to set up bridges and other Open vSwitch features.
Bridges should be created with ``datapath_type=netdev``::

    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

ovs-vsctl can also be used to add DPDK devices. ovs-vswitchd should print the
number of DPDK devices found in the log file::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
        options:dpdk-devargs=0000:01:00.0
    $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
        options:dpdk-devargs=0000:01:00.1

Some NICs (e.g. Mellanox ConnectX-3) have only one PCI address associated with
multiple ports, so identifying a port by PCI address as above won't work.
Instead, identify each port by its MAC address::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
        options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55"
    $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
        options:dpdk-devargs="class=eth,mac=00:11:22:33:44:56"

Hotplugging physical interfaces is not supported using the above syntax.
This is expected to change with the release of DPDK v18.05. For information
on hotplugging physical interfaces, you should instead refer to
After the DPDK ports are added to the switch, a polling thread continuously
polls the DPDK devices and consumes 100% of the core, as can be checked with
``top`` and ``ps``::

    $ ps -eLo pid,psr,comm | grep pmd

Creating bonds of DPDK interfaces is slightly different from creating bonds of
system interfaces. For DPDK, the interface type and devargs must be explicitly
set. For example::

    $ ovs-vsctl add-bond br0 dpdkbond p0 p1 \
        -- set Interface p0 type=dpdk options:dpdk-devargs=0000:01:00.0 \
        -- set Interface p1 type=dpdk options:dpdk-devargs=0000:01:00.1

To delete the bridge and stop ovs-vswitchd, run the following. The bridge is
deleted first because ``ovs-vsctl`` needs a running ovsdb-server::

    $ ovs-vsctl del-br br0
    $ ovs-appctl -t ovs-vswitchd exit
    $ ovs-appctl -t ovsdb-server exit

.. _dpdk-ovs-in-guest:

OVS with DPDK Inside VMs
------------------------

Additional configuration is required if you want to run ovs-vswitchd with the
DPDK backend inside a QEMU virtual machine. ovs-vswitchd creates a separate
DPDK TX queue for each available CPU core. This operation fails inside a QEMU
virtual machine because, by default, the VirtIO NIC provided to the guest is
configured to support only a single TX queue and a single RX queue. To change
this behavior, turn on the ``mq`` (multiqueue) property of all
``virtio-net-pci`` devices emulated by QEMU and used by DPDK. You can do this
manually (by changing the QEMU command line) or, if you use Libvirt, by adding
the following string to the ``<interface>`` sections of all network devices
used by DPDK::

    <driver name='vhost' queues='N'/>

where ``N`` determines how many queues can be used by the guest.

This requires QEMU >= 2.2.

Add a userspace bridge and two ``dpdk`` (PHY) ports::

    # Add userspace bridge
    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

    $ ovs-vsctl add-port br0 phy0 -- set Interface phy0 type=dpdk \
        options:dpdk-devargs=0000:01:00.0 ofport_request=1

    $ ovs-vsctl add-port br0 phy1 -- set Interface phy1 type=dpdk \
        options:dpdk-devargs=0000:01:00.1 ofport_request=2

Add test flows to forward packets between DPDK port 0 and port 1::

    # Clear current flows
    $ ovs-ofctl del-flows br0

    # Add flows between port 1 (phy0) and port 2 (phy1)
    $ ovs-ofctl add-flow br0 in_port=1,action=output:2
    $ ovs-ofctl add-flow br0 in_port=2,action=output:1

Transmit traffic into either port. You should see it returned via the other.

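The test flows above always come in pairs, one per direction. As a minimal
sketch, the pair can be generated by a small shell function. The function name
``cross_connect_cmds`` is hypothetical and not part of Open vSwitch; it only
prints the ``ovs-ofctl`` commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS): print the two ovs-ofctl commands
# that cross-connect OpenFlow ports $2 and $3 on bridge $1.
cross_connect_cmds() {
    bridge=$1
    port_a=$2
    port_b=$3
    printf 'ovs-ofctl add-flow %s in_port=%s,action=output:%s\n' \
        "$bridge" "$port_a" "$port_b"
    printf 'ovs-ofctl add-flow %s in_port=%s,action=output:%s\n' \
        "$bridge" "$port_b" "$port_a"
}

# Dry run for the PHY-PHY case: OpenFlow port 1 (phy0) <-> port 2 (phy1).
cross_connect_cmds br0 1 2
```

Once the printed commands look right, they can be piped to ``sh`` to apply
them. Printing first keeps the sketch safe to try on a host without OVS.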
.. _dpdk-vhost-loopback:

PHY-VM-PHY (vHost Loopback)
---------------------------

Add a userspace bridge, two ``dpdk`` (PHY) ports, and two ``dpdkvhostuser``
ports::

    # Add userspace bridge
    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

    $ ovs-vsctl add-port br0 phy0 -- set Interface phy0 type=dpdk \
        options:dpdk-devargs=0000:01:00.0 ofport_request=1

    $ ovs-vsctl add-port br0 phy1 -- set Interface phy1 type=dpdk \
        options:dpdk-devargs=0000:01:00.1 ofport_request=2

    # Add two dpdkvhostuser ports
    $ ovs-vsctl add-port br0 dpdkvhostuser0 \
        -- set Interface dpdkvhostuser0 type=dpdkvhostuser ofport_request=3
    $ ovs-vsctl add-port br0 dpdkvhostuser1 \
        -- set Interface dpdkvhostuser1 type=dpdkvhostuser ofport_request=4

Add test flows to forward packets between DPDK devices and VM ports::

    # Clear current flows
    $ ovs-ofctl del-flows br0

    $ ovs-ofctl add-flow br0 in_port=1,action=output:3
    $ ovs-ofctl add-flow br0 in_port=3,action=output:1
    $ ovs-ofctl add-flow br0 in_port=4,action=output:2
    $ ovs-ofctl add-flow br0 in_port=2,action=output:4

    $ ovs-ofctl dump-flows br0

Create a VM using the following configuration:

===================== ======== ============
Configuration         Values   Comments
===================== ======== ============
QEMU version          2.2.0    n/a
QEMU thread affinity  core 5   taskset 0x20
Qcow2 image           CentOS7  n/a
===================== ======== ============

You can do this directly with QEMU via the ``qemu-system-x86_64`` application::

    $ export VM_NAME=vhost-vm
    $ export GUEST_MEM=3072M
    $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
    $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch

    $ taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
        -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
        -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
        -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
        -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
        -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
        -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
        -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
        -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
        -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off

For an explanation of this command, along with alternative approaches such as
booting the VM via libvirt, refer to :doc:`/topics/dpdk/vhost-user`.

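The ``taskset`` argument above is a hexadecimal CPU bitmask in which bit *n*
selects core *n*; the ``pmd-cpu-mask`` setting used elsewhere in this guide
takes the same kind of mask. The following sketch decodes such a mask into a
list of core numbers. The function name ``mask_to_cores`` is hypothetical and
exists only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the CPU core numbers selected by a hex bitmask
# such as the one passed to taskset.
mask_to_cores() {
    mask=$(( $1 ))      # shell arithmetic accepts 0x-prefixed hex values
    core=0
    cores=""
    while [ "$mask" -gt 0 ]; do
        # If the lowest bit is set, the current core is part of the mask.
        if [ $(( mask & 1 )) -eq 1 ]; then
            cores="${cores:+$cores }$core"
        fi
        mask=$(( mask >> 1 ))
        core=$(( core + 1 ))
    done
    echo "$cores"
}

mask_to_cores 0x20   # mask used for the QEMU thread above
```

For ``0x20`` this prints ``5``, which matches the "core 5" thread affinity
listed in the VM configuration.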
Once the guest is configured and booted, configure DPDK packet forwarding
within the guest. To accomplish this, build the ``testpmd`` application as
described in :ref:`dpdk-testpmd`. Once compiled, run the application::

    $ cd $DPDK_DIR/app/test-pmd
    $ ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- \
        --burst=64 -i --txqflags=0xf00 --disable-hw-vlan

When you finish testing, bind the vNICs back to the kernel::

    $ $DPDK_DIR/usertools/dpdk-devbind.py --bind=virtio-pci 0000:00:03.0
    $ $DPDK_DIR/usertools/dpdk-devbind.py --bind=virtio-pci 0000:00:04.0

Valid PCI IDs must be passed in the above example. The PCI IDs can be
retrieved like so::

    $ $DPDK_DIR/usertools/dpdk-devbind.py --status

More information on the dpdkvhostuser ports can be found in
:doc:`/topics/dpdk/vhost-user`.

PHY-VM-PHY (vHost Loopback) (Kernel Forwarding)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:ref:`dpdk-vhost-loopback` details the steps for the PHY-VM-PHY loopback
test case and packet forwarding using the DPDK testpmd application in the
guest VM. To forward packets using the guest's kernel stack instead, run the
following commands on the guest::

    $ ip addr add 1.1.1.2/24 dev eth1
    $ ip addr add 1.1.2.2/24 dev eth2
    $ ip link set eth1 up
    $ ip link set eth2 up
    $ systemctl stop firewalld.service
    $ systemctl stop iptables.service
    $ sysctl -w net.ipv4.ip_forward=1
    $ sysctl -w net.ipv4.conf.all.rp_filter=0
    $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
    $ sysctl -w net.ipv4.conf.eth2.rp_filter=0
    $ route add -net 1.1.2.0/24 eth2
    $ route add -net 1.1.1.0/24 eth1
    $ arp -s 1.1.2.99 DE:AD:BE:EF:CA:FE
    $ arp -s 1.1.1.99 DE:AD:BE:EF:CA:EE

PHY-VM-PHY (vHost Multiqueue)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

vHost Multiqueue functionality can also be validated using the PHY-VM-PHY
configuration. To begin, follow the steps described in :ref:`dpdk-phy-phy` to
create and initialize the database, start ovs-vswitchd and add ``dpdk``-type
devices to bridge ``br0``. Once complete, follow the steps below:

1. Configure PMD and RXQs.

   For example, set the number of dpdk port rx queues to at least 2. The
   number of rx queues at the vhost-user interface is configured automatically
   after the virtio device connects and doesn't need manual configuration::

       $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xc
       $ ovs-vsctl set Interface phy0 options:n_rxq=2
       $ ovs-vsctl set Interface phy1 options:n_rxq=2

2. Instantiate the guest VM using the QEMU command line.

   The guest must be configured with appropriate software versions to ensure
   this feature is supported.

   .. list-table:: VM Configuration

      * - QEMU thread affinity
        - 2 cores (taskset 0x30)

   To do this, instantiate the guest as follows::

       $ export VM_NAME=vhost-vm
       $ export GUEST_MEM=4096M
       $ export QCOW2_IMAGE=/root/Fedora22_x86_64.qcow2
       $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
       $ taskset 0x30 qemu-system-x86_64 -cpu host -smp 2,cores=2 -m 4096M \
           -drive file=$QCOW2_IMAGE --enable-kvm -name $VM_NAME \
           -nographic -numa node,memdev=mem -mem-prealloc \
           -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
           -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
           -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce,queues=2 \
           -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=6 \
           -chardev socket,id=char2,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
           -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce,queues=2 \
           -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mq=on,vectors=6

   The ``queues`` value above should match the number of queues configured in
   OVS. The ``vectors`` value should be set to "number of queues x 2 + 2".

3. Configure the guest interface.

   Assuming there are 2 interfaces in the guest named eth0 and eth1, check the
   channel configuration and set the number of combined channels to 2 for
   each of them::

       $ ethtool -L eth0 combined 2
       $ ethtool -L eth1 combined 2

   More information can be found in the vHost walkthrough section.

4. Configure kernel packet forwarding.

   Configure IP and enable interfaces::

       $ ip addr add 5.5.5.1/24 dev eth0
       $ ip addr add 90.90.90.1/24 dev eth1
       $ ip link set eth0 up
       $ ip link set eth1 up

   Configure IP forwarding and add route entries::

       $ sysctl -w net.ipv4.ip_forward=1
       $ sysctl -w net.ipv4.conf.all.rp_filter=0
       $ sysctl -w net.ipv4.conf.eth0.rp_filter=0
       $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
       $ ip route add 2.1.1.0/24 dev eth1
       $ route add default gw 2.1.1.2 eth1
       $ route add default gw 90.90.90.90 eth1
       $ arp -s 90.90.90.90 DE:AD:BE:EF:CA:FE
       $ arp -s 2.1.1.2 DE:AD:BE:EF:CA:FA

   Check traffic on multiple queues::

       $ cat /proc/interrupts | grep virtio

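The rule from step 2 for the ``vectors`` property ("number of queues x 2 + 2")
can be written as a one-line calculation. This is only a sketch of that
arithmetic; the function name ``virtio_vectors`` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compute the "vectors" value for a virtio-net-pci
# device from its queue count, using the rule queues x 2 + 2.
virtio_vectors() {
    echo $(( $1 * 2 + 2 ))
}

virtio_vectors 2   # two queues, as in the QEMU command line in step 2
```

With two queues the result is ``6``, which is the ``vectors=6`` value used in
the QEMU command line.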
.. _dpdk-flow-hardware-offload:

Flow Hardware Offload (Experimental)
------------------------------------

Flow hardware offload is disabled by default. To enable it, run::

    $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true

Matches and actions are programmed into hardware to achieve full offload of
the flow. If not all actions are supported, OVS falls back to partial flow
offload (offloading matches only). Moreover, offload only works with PMD
drivers that support the configured rte_flow actions.
Partial flow offload requires support of "MARK + RSS" actions. Full
hardware offload requires support of the actions listed below.

The validated NICs are:

- Mellanox (ConnectX-4, ConnectX-4 Lx, ConnectX-5)
- Napatech (NT200B01)

Supported protocols for hardware offload matches are:

- L4: TCP, UDP, SCTP, ICMP

Supported actions for hardware offload are:

- Modification of Ethernet (mod_dl_src/mod_dl_dst).
- Modification of IPv4 (mod_nw_src/mod_nw_dst/mod_nw_ttl).
- Modification of TCP/UDP (mod_tp_src/mod_tp_dst).
- VLAN Push/Pop (push_vlan/pop_vlan).
- Modification of IPv6 (set_field:<ADDR>->ipv6_src/ipv6_dst/mod_nw_ttl).
- Clone/output (tnl_push and output) for encapsulating over a tunnel.

More detailed information can be found in the :doc:`DPDK topics section
</topics/dpdk/index>` of the documentation. These guides are listed below.

.. NOTE(stephenfin): Remember to keep this in sync with topics/dpdk/index

.. include:: ../topics/dpdk/index.rst