..
      Licensed under the Apache License, Version 2.0 (the "License"); you may
      not use this file except in compliance with the License. You may obtain
      a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      License for the specific language governing permissions and limitations
      under the License.

      Convention for heading levels in Open vSwitch documentation:

      =======  Heading 0 (reserved for the title in a document)
      -------  Heading 1
      ~~~~~~~  Heading 2
      +++++++  Heading 3
      '''''''  Heading 4

      Avoid deeper levels because they do not render well.

============================
Using Open vSwitch with DPDK
============================

This document describes how to use Open vSwitch with DPDK.

.. important::

   Using DPDK with OVS requires configuring OVS at build time to use the DPDK
   library. The version of DPDK that OVS supports varies from one OVS release
   to another, as described in the :doc:`releases FAQ </faq/releases>`. For
   build instructions refer to :doc:`/intro/install/dpdk`.

Ports and Bridges
-----------------

ovs-vsctl can be used to set up bridges and other Open vSwitch features.
Bridges should be created with a ``datapath_type=netdev``::

    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

ovs-vsctl can also be used to add DPDK devices. ovs-vswitchd should print the
number of DPDK devices found in the log file::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
          options:dpdk-devargs=0000:01:00.0
    $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
          options:dpdk-devargs=0000:01:00.1

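To confirm that the devices were initialized, check the ovs-vswitchd log. The
log location below assumes a default from-source installation; adjust the
path to match your setup::

    $ grep dpdk /usr/local/var/log/openvswitch/ovs-vswitchd.log
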
Some NICs (e.g. Mellanox ConnectX-3) have only one PCI address associated
with multiple ports. Referencing such a device by PCI address as above will
not work. Instead, identify each port by its MAC address::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
          options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55"
    $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
          options:dpdk-devargs="class=eth,mac=00:11:22:33:44:56"

.. important::

   Hotplugging physical interfaces is not supported using the above syntax.
   This is expected to change with the release of DPDK v18.05. For information
   on hotplugging physical interfaces, you should instead refer to
   :ref:`port-hotplug`.

After the DPDK ports are added to the switch, a polling thread continuously
polls the DPDK devices and consumes 100% of its core, as can be seen with the
``top`` and ``ps`` commands::

    $ top -H
    $ ps -eLo pid,psr,comm | grep pmd

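The PMD threads can also be examined from within OVS itself. For example, a
per-thread breakdown of processing cycles is available via::

    $ ovs-appctl dpif-netdev/pmd-stats-show
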
Creating bonds of DPDK interfaces is slightly different from creating bonds
of system interfaces. For DPDK, the interface type and devargs must be
explicitly set. For example::

    $ ovs-vsctl add-bond br0 dpdkbond p0 p1 \
          -- set Interface p0 type=dpdk options:dpdk-devargs=0000:01:00.0 \
          -- set Interface p1 type=dpdk options:dpdk-devargs=0000:01:00.1

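Once created, the state of the bond and its members can be inspected with::

    $ ovs-appctl bond/show dpdkbond
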
To stop ovs-vswitchd and delete the bridge, run::

    $ ovs-appctl -t ovs-vswitchd exit
    $ ovs-appctl -t ovsdb-server exit
    $ ovs-vsctl del-br br0

.. _dpdk-ovs-in-guest:

OVS with DPDK Inside VMs
------------------------

Additional configuration is required if you want to run ovs-vswitchd with the
DPDK backend inside a QEMU virtual machine. ovs-vswitchd creates a separate
DPDK TX queue for each CPU core available. This operation fails inside a QEMU
virtual machine because, by default, the VirtIO NIC provided to the guest is
configured to support only a single TX queue and a single RX queue. To change
this behavior, you need to turn on the ``mq`` (multiqueue) property of all
``virtio-net-pci`` devices emulated by QEMU and used by DPDK. You may do this
manually (by changing the QEMU command line) or, if you use Libvirt, by
adding the following string to the ``<interface>`` sections of all network
devices used by DPDK::

    <driver name='vhost' queues='N'/>

where:

``N``
  determines how many queues can be used by the guest.

This requires QEMU >= 2.2.

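If you launch QEMU directly, a roughly equivalent configuration is sketched
below. The tap backend, the ``id`` values and the queue count of 4 are
illustrative assumptions; ``vectors`` follows the usual "number of queues x 2
+ 2" rule::

    -netdev tap,id=net0,vhost=on,queues=4 \
    -device virtio-net-pci,netdev=net0,mq=on,vectors=10
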
.. _dpdk-phy-phy:

PHY-PHY
-------

Add a userspace bridge and two ``dpdk`` (PHY) ports::

    # Add userspace bridge
    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

    # Add two dpdk ports
    $ ovs-vsctl add-port br0 phy0 -- set Interface phy0 type=dpdk \
          options:dpdk-devargs=0000:01:00.0 ofport_request=1

    $ ovs-vsctl add-port br0 phy1 -- set Interface phy1 type=dpdk \
          options:dpdk-devargs=0000:01:00.1 ofport_request=2

Add test flows to forward packets between DPDK port 0 and port 1::

    # Clear current flows
    $ ovs-ofctl del-flows br0

    # Add flows between port 1 (phy0) and port 2 (phy1)
    $ ovs-ofctl add-flow br0 in_port=1,action=output:2
    $ ovs-ofctl add-flow br0 in_port=2,action=output:1

Transmit traffic into either port. You should see it returned via the other.

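To confirm that forwarding is working, dump the flows and check that the
``n_packets`` counters increase as traffic passes through::

    $ ovs-ofctl dump-flows br0
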
.. _dpdk-vhost-loopback:

PHY-VM-PHY (vHost Loopback)
---------------------------

Add a userspace bridge, two ``dpdk`` (PHY) ports, and two ``dpdkvhostuser``
ports::

    # Add userspace bridge
    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

    # Add two dpdk ports
    $ ovs-vsctl add-port br0 phy0 -- set Interface phy0 type=dpdk \
          options:dpdk-devargs=0000:01:00.0 ofport_request=1

    $ ovs-vsctl add-port br0 phy1 -- set Interface phy1 type=dpdk \
          options:dpdk-devargs=0000:01:00.1 ofport_request=2

    # Add two dpdkvhostuser ports
    $ ovs-vsctl add-port br0 dpdkvhostuser0 \
          -- set Interface dpdkvhostuser0 type=dpdkvhostuser ofport_request=3
    $ ovs-vsctl add-port br0 dpdkvhostuser1 \
          -- set Interface dpdkvhostuser1 type=dpdkvhostuser ofport_request=4

Add test flows to forward packets between DPDK devices and VM ports::

    # Clear current flows
    $ ovs-ofctl del-flows br0

    # Add flows
    $ ovs-ofctl add-flow br0 in_port=1,action=output:3
    $ ovs-ofctl add-flow br0 in_port=3,action=output:1
    $ ovs-ofctl add-flow br0 in_port=4,action=output:2
    $ ovs-ofctl add-flow br0 in_port=2,action=output:4

    # Dump flows
    $ ovs-ofctl dump-flows br0

Create a VM using the following configuration:

.. table::

   ===================== ======== ============
   Configuration         Values   Comments
   ===================== ======== ============
   QEMU version          2.2.0    n/a
   QEMU thread affinity  core 5   taskset 0x20
   Memory                4GB      n/a
   Cores                 2        n/a
   Qcow2 image           CentOS7  n/a
   mrg_rxbuf             off      n/a
   ===================== ======== ============

You can do this directly with QEMU via the ``qemu-system-x86_64``
application::

    $ export VM_NAME=vhost-vm
    $ export GUEST_MEM=3072M
    $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
    $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch

    $ taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
        -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
        -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
        -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
        -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
        -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
        -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
        -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
        -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
        -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off

For an explanation of this command, along with alternative approaches such as
booting the VM via libvirt, refer to :doc:`/topics/dpdk/vhost-user`.

Once the guest is configured and booted, configure DPDK packet forwarding
within the guest. To accomplish this, build the ``testpmd`` application as
described in :ref:`dpdk-testpmd`. Once compiled, run the application and
start forwarding at the interactive prompt::

    $ cd $DPDK_DIR/app/test-pmd
    $ ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- \
        --burst=64 -i --txqflags=0xf00 --disable-hw-vlan
    testpmd> set fwd mac retry
    testpmd> start

When you finish testing, bind the vNICs back to the kernel driver::

    $ $DPDK_DIR/usertools/dpdk-devbind.py --bind=virtio-pci 0000:00:03.0
    $ $DPDK_DIR/usertools/dpdk-devbind.py --bind=virtio-pci 0000:00:04.0

.. note::

   Valid PCI IDs must be passed in the above example. The PCI IDs can be
   retrieved like so::

       $ $DPDK_DIR/usertools/dpdk-devbind.py --status

More information on the dpdkvhostuser ports can be found in
:doc:`/topics/dpdk/vhost-user`.

PHY-VM-PHY (vHost Loopback) (Kernel Forwarding)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:ref:`dpdk-vhost-loopback` details the steps for the PHY-VM-PHY loopback
testcase with packet forwarding done by the DPDK testpmd application in the
guest VM. To do the packet forwarding using the kernel network stack instead,
run the following commands in the guest::

    $ ip addr add 1.1.1.2/24 dev eth1
    $ ip addr add 1.1.2.2/24 dev eth2
    $ ip link set eth1 up
    $ ip link set eth2 up
    $ systemctl stop firewalld.service
    $ systemctl stop iptables.service
    $ sysctl -w net.ipv4.ip_forward=1
    $ sysctl -w net.ipv4.conf.all.rp_filter=0
    $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
    $ sysctl -w net.ipv4.conf.eth2.rp_filter=0
    $ route add -net 1.1.2.0/24 eth2
    $ route add -net 1.1.1.0/24 eth1
    $ arp -s 1.1.2.99 DE:AD:BE:EF:CA:FE
    $ arp -s 1.1.1.99 DE:AD:BE:EF:CA:EE

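The static ARP entries let the guest transmit forwarded packets without
having to resolve the (non-existent) next hops. The resulting configuration
can be double-checked from the guest with::

    $ ip route show
    $ arp -n
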
PHY-VM-PHY (vHost Multiqueue)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

vHost multiqueue functionality can also be validated using the PHY-VM-PHY
configuration. To begin, follow the steps described in :ref:`dpdk-phy-phy` to
create and initialize the database, start ovs-vswitchd and add ``dpdk``-type
devices to bridge ``br0``. Once complete, follow the steps below:

1. Configure PMD and RXQs.

   For example, set the number of DPDK port rx queues to at least 2. The
   number of rx queues at the vhost-user interface is configured
   automatically after virtio device connection and does not need manual
   configuration::

       $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xc
       $ ovs-vsctl set Interface phy0 options:n_rxq=2
       $ ovs-vsctl set Interface phy1 options:n_rxq=2

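   The resulting assignment of rx queues to PMD threads can be verified
   with::

       $ ovs-appctl dpif-netdev/pmd-rxq-show
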
2. Instantiate the guest VM using the QEMU command line.

   Multiqueue support requires appropriate software versions; the
   configuration used here is listed below.

   .. list-table:: VM Configuration
      :header-rows: 1

      * - Setting
        - Value
      * - QEMU version
        - 2.5.0
      * - QEMU thread affinity
        - 2 cores (taskset 0x30)
      * - Memory
        - 4 GB
      * - Cores
        - 2
      * - Distro
        - Fedora 22
      * - Multiqueue
        - Enabled

   To do this, instantiate the guest as follows::

       $ export VM_NAME=vhost-vm
       $ export GUEST_MEM=4096M
       $ export QCOW2_IMAGE=/root/Fedora22_x86_64.qcow2
       $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
       $ taskset 0x30 qemu-system-x86_64 -cpu host -smp 2,cores=2 -m 4096M \
           -drive file=$QCOW2_IMAGE --enable-kvm -name $VM_NAME \
           -nographic -numa node,memdev=mem -mem-prealloc \
           -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
           -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
           -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce,queues=2 \
           -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=6 \
           -chardev socket,id=char2,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
           -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce,queues=2 \
           -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mq=on,vectors=6

   .. note::

      The ``queues`` value above should match the number of queues configured
      in OVS, and the ``vectors`` value should be set to "number of queues x 2
      + 2". For example, with ``queues=2`` the commands above use
      ``vectors=6``.

3. Configure the guest interfaces.

   Assuming the guest has two interfaces, eth0 and eth1, check the channel
   configuration and set the number of combined channels to 2 for the virtio
   devices::

       $ ethtool -l eth0
       $ ethtool -L eth0 combined 2
       $ ethtool -L eth1 combined 2

   More information can be found in the vHost walkthrough section.

4. Configure kernel packet forwarding.

   Configure IP addresses and enable the interfaces::

       $ ip addr add 5.5.5.1/24 dev eth0
       $ ip addr add 90.90.90.1/24 dev eth1
       $ ip link set eth0 up
       $ ip link set eth1 up

   Enable IP forwarding and add route and static ARP entries::

       $ sysctl -w net.ipv4.ip_forward=1
       $ sysctl -w net.ipv4.conf.all.rp_filter=0
       $ sysctl -w net.ipv4.conf.eth0.rp_filter=0
       $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
       $ ip route add 2.1.1.0/24 dev eth1
       $ route add default gw 2.1.1.2 eth1
       $ route add default gw 90.90.90.90 eth1
       $ arp -s 90.90.90.90 DE:AD:BE:EF:CA:FE
       $ arp -s 2.1.1.2 DE:AD:BE:EF:CA:FA

   Check that traffic is distributed across multiple queues::

       $ cat /proc/interrupts | grep virtio

.. _dpdk-flow-hardware-offload:

Flow Hardware Offload (Experimental)
------------------------------------

The flow hardware offload is disabled by default and can be enabled by::

    $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true

So far only partial flow offload is implemented. Moreover, it only works with
PMD drivers that support the rte_flow "MARK + RSS" actions.

The validated NICs are:

- Mellanox (ConnectX-4, ConnectX-4 Lx, ConnectX-5)
- Napatech (NT200B01)

Supported protocols for hardware offload are:

- L2: Ethernet, VLAN
- L3: IPv4, IPv6
- L4: TCP, UDP, SCTP, ICMP

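Depending on the OVS version, flows that were successfully offloaded can
typically be inspected with, for example::

    $ ovs-appctl dpctl/dump-flows type=offloaded
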
Further Reading
---------------

More detailed information can be found in the :doc:`DPDK topics section
</topics/dpdk/index>` of the documentation. These guides are listed below.

.. NOTE(stephenfin): Remember to keep this in sync with topics/dpdk/index

.. include:: ../topics/dpdk/index.rst
   :start-line: 30