..
      Licensed under the Apache License, Version 2.0 (the "License"); you may
      not use this file except in compliance with the License. You may obtain
      a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      License for the specific language governing permissions and limitations
      under the License.

      Convention for heading levels in Open vSwitch documentation:

      =======  Heading 0 (reserved for the title in a document)
      -------  Heading 1
      ~~~~~~~  Heading 2
      +++++++  Heading 3
      '''''''  Heading 4

      Avoid deeper levels because they do not render well.

============================
Using Open vSwitch with DPDK
============================

This document describes how to use Open vSwitch with the DPDK datapath.

.. important::

    Using the DPDK datapath requires building OVS with DPDK support. Refer to
    :doc:`/intro/install/dpdk` for more information.

Ports and Bridges
-----------------

ovs-vsctl can be used to set up bridges and other Open vSwitch features.
Bridges should be created with a ``datapath_type=netdev``::

    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

ovs-vsctl can also be used to add DPDK devices. ovs-vswitchd should print the
number of DPDK devices found in the log file::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
        options:dpdk-devargs=0000:01:00.0
    $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
        options:dpdk-devargs=0000:01:00.1

After the DPDK ports are added to the switch, a polling thread continuously
polls the DPDK devices and consumes 100% of its core, as can be checked with
the ``top`` and ``ps`` commands::

    $ top -H
    $ ps -eLo pid,psr,comm | grep pmd

Creating bonds of DPDK interfaces is slightly different to creating bonds of
system interfaces. For DPDK, the interface type and devargs must be explicitly
set. For example::

    $ ovs-vsctl add-bond br0 dpdkbond p0 p1 \
        -- set Interface p0 type=dpdk options:dpdk-devargs=0000:01:00.0 \
        -- set Interface p1 type=dpdk options:dpdk-devargs=0000:01:00.1

To delete the bridge and stop ovs-vswitchd and ovsdb-server, run the following.
The bridge is deleted first because ovs-vsctl needs a running ovsdb-server::

    $ ovs-vsctl del-br br0
    $ ovs-appctl -t ovs-vswitchd exit
    $ ovs-appctl -t ovsdb-server exit

PMD Thread Statistics
---------------------

To show current stats::

    $ ovs-appctl dpif-netdev/pmd-stats-show

To clear previous stats::

    $ ovs-appctl dpif-netdev/pmd-stats-clear

Port/RXQ Assignment to PMD Threads
----------------------------------

To show port/rxq assignment::

    $ ovs-appctl dpif-netdev/pmd-rxq-show

To change the default rxq assignment to PMD threads, rxqs may be manually
pinned to desired cores using::

    $ ovs-vsctl set Interface <iface> \
        other_config:pmd-rxq-affinity=<rxq-affinity-list>

where:

- ``<rxq-affinity-list>`` is a CSV list of ``<queue-id>:<core-id>`` values

For example::

    $ ovs-vsctl set interface dpdk-p0 options:n_rxq=4 \
        other_config:pmd-rxq-affinity="0:3,1:7,3:8"

This will ensure:

- Queue #0 pinned to core 3
- Queue #1 pinned to core 7
- Queue #2 not pinned
- Queue #3 pinned to core 8

After that, PMD threads on cores where RX queues were pinned will become
``isolated``. This means that these threads will poll only pinned RX queues.

.. warning::
  If there are no ``non-isolated`` PMD threads, ``non-pinned`` RX queues will
  not be polled. Also, if the provided ``core_id`` is not available (e.g. this
  ``core_id`` is not in ``pmd-cpu-mask``), the RX queue will not be polled by
  any PMD thread.

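The ``<queue-id>:<core-id>`` list format can be checked offline before it is
applied. The following is a minimal Python sketch, not part of OVS; the helper
name ``parse_rxq_affinity`` is hypothetical and it only mirrors the format
described above:

```python
def parse_rxq_affinity(affinity, n_rxq):
    """Map every queue id in range(n_rxq) to its pinned core, or None.

    `affinity` is a CSV list of <queue-id>:<core-id> values, as passed to
    other_config:pmd-rxq-affinity. Hypothetical helper for illustration only.
    """
    pinned = {}
    for entry in affinity.split(","):
        queue_id, core_id = entry.split(":")
        pinned[int(queue_id)] = int(core_id)
    # Queues that do not appear in the list stay unpinned (None).
    return {queue: pinned.get(queue) for queue in range(n_rxq)}


assignment = parse_rxq_affinity("0:3,1:7,3:8", n_rxq=4)
for queue, core in assignment.items():
    state = f"pinned to core {core}" if core is not None else "not pinned"
    print(f"Queue #{queue} {state}")
```

Run against the example value, this reproduces the four-queue list shown above,
including the unpinned queue #2.
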
QoS
---

Assuming you have a vhost-user port transmitting traffic consisting of packets
of size 64 bytes, the following command would limit the egress transmission
rate of the port to ~1,000,000 packets per second::

    $ ovs-vsctl set port vhost-user0 qos=@newqos -- \
        --id=@newqos create qos type=egress-policer other-config:cir=46000000 \
        other-config:cbs=2048

To examine the QoS configuration of the port, run::

    $ ovs-appctl -t ovs-vswitchd qos/show vhost-user0

To clear the QoS configuration from the port and ovsdb, run::

    $ ovs-vsctl destroy QoS vhost-user0 -- clear Port vhost-user0 qos

Refer to vswitch.xml for more details on egress-policer.

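The ``cir`` value used above can be derived from the frame size and the target
packet rate. The sketch below assumes that ``cir`` counts bytes of IP packet
per second, that is, the frame size minus the 18B of L2 header and CRC; check
vswitch.xml before relying on it. The function name is hypothetical:

```python
L2_OVERHEAD_BYTES = 18  # assumed: Ethernet header (14B) plus CRC (4B)


def egress_policer_cir(frame_bytes, packets_per_second):
    """Return a cir value (bytes of IP packet per second) for a target rate."""
    return (frame_bytes - L2_OVERHEAD_BYTES) * packets_per_second


# 64-byte frames at 1,000,000 packets per second, as in the example above.
print(egress_policer_cir(64, 1_000_000))
```

For 64-byte frames at 1,000,000 packets per second this gives 46 x 1,000,000,
matching ``cir=46000000`` in the command.
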
Rate Limiting
-------------

Here is an example of ingress policing usage. Assuming you have a vhost-user
port receiving traffic consisting of packets of size 64 bytes, the following
command would limit the reception rate of the port to ~1,000,000 packets per
second::

    $ ovs-vsctl set interface vhost-user0 ingress_policing_rate=368000 \
        ingress_policing_burst=1000

To examine the ingress policer configuration of the port::

    $ ovs-vsctl list interface vhost-user0

To clear the ingress policer configuration from the port::

    $ ovs-vsctl set interface vhost-user0 ingress_policing_rate=0

Refer to vswitch.xml for more details on ingress-policer.

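The ``ingress_policing_rate`` above can be worked out the same way. This is a
hedged sketch that assumes the rate is expressed in kbps and covers the IP
packet only (frame size minus 18B of L2 header and CRC); the function name is
hypothetical:

```python
def ingress_policing_rate_kbps(frame_bytes, packets_per_second,
                               l2_overhead_bytes=18):
    """Return a policing rate in kbps for the given frame size and packet rate.

    Assumes only the IP packet (frame minus L2 header and CRC) is counted.
    """
    ip_packet_bits = (frame_bytes - l2_overhead_bytes) * 8
    return ip_packet_bits * packets_per_second // 1000


# 64-byte frames at 1,000,000 packets per second, as in the example above.
print(ingress_policing_rate_kbps(64, 1_000_000))
```

With 64-byte frames, 46 bytes x 8 bits x 1,000,000 packets per second is
368,000 kbps, which is the value used in the command.
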
Flow Control
------------

Flow control can be enabled only on DPDK physical ports. To enable flow control
support at the tx side while adding a port, run::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
        options:dpdk-devargs=0000:01:00.0 options:tx-flow-ctrl=true

Similarly, to enable rx flow control, run::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
        options:dpdk-devargs=0000:01:00.0 options:rx-flow-ctrl=true

To enable flow control auto-negotiation, run::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
        options:dpdk-devargs=0000:01:00.0 options:flow-ctrl-autoneg=true

To turn on tx flow control at run time for an existing port, run::

    $ ovs-vsctl set Interface dpdk-p0 options:tx-flow-ctrl=true

The flow control parameters can be turned off by setting the respective
parameter to ``false``. To disable flow control at the tx side, run::

    $ ovs-vsctl set Interface dpdk-p0 options:tx-flow-ctrl=false

pdump
-----

pdump allows you to listen on DPDK ports and view the traffic that is passing
on them. To use this utility, one must have libpcap installed on the system.
Furthermore, DPDK must be built with ``CONFIG_RTE_LIBRTE_PDUMP=y`` and
``CONFIG_RTE_LIBRTE_PMD_PCAP=y``.

.. warning::
  A performance decrease is expected when using a monitoring application like
  the DPDK pdump app.

To use pdump, simply launch OVS as usual, then navigate to the ``app/pdump``
directory in DPDK, ``make`` the application and run like so::

    $ sudo ./build/app/dpdk-pdump -- \
        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap \
        --server-socket-path=/usr/local/var/run/openvswitch

The above command captures traffic received on queue 0 of port 0 and stores it
in ``/tmp/pkts.pcap``. Other combinations of port numbers, queue numbers and
pcap locations are of course also available to use. For example, to capture all
packets that traverse port 0 in a single pcap file::

    $ sudo ./build/app/dpdk-pdump -- \
        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap' \
        --server-socket-path=/usr/local/var/run/openvswitch

``server-socket-path`` must be set to the value of ``ovs_rundir()`` which
typically resolves to ``/usr/local/var/run/openvswitch``.

Many tools are available to view the contents of the pcap file. One example is
tcpdump. Issue the following command to view the contents of ``pkts.pcap``::

    $ tcpdump -r pkts.pcap

More information on the pdump app and its usage can be found in the `DPDK docs
<http://dpdk.org/doc/guides/tools/pdump.html>`__.

Jumbo Frames
------------

By default, DPDK ports are configured with standard Ethernet MTU (1500B). To
enable Jumbo Frames support for a DPDK port, change the Interface's
``mtu_request`` attribute to a sufficiently large value. For example, to add a
DPDK Phy port with MTU of 9000::

    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
          options:dpdk-devargs=0000:01:00.0 mtu_request=9000

Similarly, to change the MTU of an existing port to 6200::

    $ ovs-vsctl set Interface dpdk-p0 mtu_request=6200

Some additional configuration is needed to take advantage of jumbo frames with
vHost ports:

1. *mergeable buffers* must be enabled for vHost ports, as demonstrated in the
   QEMU command line snippet below::

       -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
       -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=on

2. Where virtio devices are bound to the Linux kernel driver in a guest
   environment (i.e. interfaces are not bound to an in-guest DPDK driver), the
   MTU of those logical network interfaces must also be increased to a
   sufficiently large value. This avoids segmentation of Jumbo Frames received
   in the guest. Note that 'MTU' refers to the length of the IP packet only,
   and not that of the entire frame.

   To calculate the exact MTU of a standard IPv4 frame, subtract the L2 header
   and CRC lengths (i.e. 18B) from the max supported frame size. So, to set
   the MTU for a 9018B Jumbo Frame::

       $ ip link set eth1 mtu 9000

When Jumbo Frames are enabled, the size of a DPDK port's mbuf segments is
increased, such that a full Jumbo Frame of a specific size may be accommodated
within a single mbuf segment.

Jumbo frame support has been validated against 9728B frames, which is the
largest frame size supported by Fortville NIC using the DPDK i40e driver, but
larger frames and other DPDK NIC drivers may be supported. Such configurations
are common in use cases involving East-West traffic only.

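The MTU calculation described above is a simple subtraction. A minimal Python
sketch follows; the function name is hypothetical and the 18B figure is the L2
header plus CRC length quoted in this section:

```python
L2_HEADER_AND_CRC_BYTES = 18  # L2 header and CRC lengths, as stated above


def mtu_for_frame(max_frame_bytes):
    """Return the MTU for a standard IPv4 frame of the given total size."""
    return max_frame_bytes - L2_HEADER_AND_CRC_BYTES


print(mtu_for_frame(9018))  # the 9018B Jumbo Frame from the example
print(mtu_for_frame(9728))  # the largest frame size validated on Fortville
```

The first call gives the 9000 used with ``ip link set``; the second gives the
MTU that corresponds to the 9728B frame size mentioned above, under the same
18B assumption.
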
Rx Checksum Offload
-------------------

By default, DPDK physical ports are enabled with Rx checksum offload.

Rx checksum offload can offer performance improvement only for tunneling
traffic in OVS-DPDK because the checksum validation of tunnel packets is
offloaded to the NIC. Also, enabling Rx checksum offload may slightly reduce
the performance of non-tunnel traffic, specifically for smaller packets.

.. _extended-statistics:

Extended Statistics
-------------------

The DPDK Extended Statistics API allows a PMD to expose a unique set of
statistics. Extended statistics are implemented and supported only for DPDK
physical and vHost ports.

To enable statistics, you have to enable OpenFlow 1.4 support for OVS.
Configure bridge br0 to support OpenFlow version 1.4::

    $ ovs-vsctl set bridge br0 datapath_type=netdev \
      protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14

Check the OVSDB protocols column in the bridge table to confirm that OpenFlow
1.4 support is enabled for OVS::

    $ ovsdb-client dump Bridge protocols

Query the port statistics by explicitly specifying the ``-O OpenFlow14``
option::

    $ ovs-ofctl -O OpenFlow14 dump-ports br0

Note: vHost ports support only partial statistics. Only RX packet size based
counters are supported; TX packet size counters are not included.

.. _port-hotplug:

Port Hotplug
------------

OVS supports port hotplugging, allowing the use of ports that were not bound
to DPDK when vswitchd was started.
In order to attach a port, it has to be bound to DPDK using the
``dpdk-devbind.py`` script::

    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=igb_uio 0000:01:00.0

Then it can be attached to OVS::

    $ ovs-vsctl add-port br0 dpdkx -- set Interface dpdkx type=dpdk \
        options:dpdk-devargs=0000:01:00.0

Detaching will be performed while processing the del-port command::

    $ ovs-vsctl del-port dpdkx

This feature is not supported with VFIO and does not work with some NICs.
For more information please refer to the `DPDK Port Hotplug Framework
<http://dpdk.org/doc/guides/prog_guide/port_hotplug_framework.html#hotplug>`__.

.. _vdev-support:

Vdev Support
------------

DPDK provides drivers for both physical and virtual devices. Physical DPDK
devices are added to OVS by specifying a valid PCI address in 'dpdk-devargs'.
Virtual DPDK devices which do not have PCI addresses can be added using a
different format for 'dpdk-devargs'.

Typically, the format expected is 'eth_<driver_name><x>' where 'x' is a
unique identifier of your choice for the given port.

For example, to add a DPDK port that uses the 'null' DPDK PMD driver::

    $ ovs-vsctl add-port br0 null0 -- set Interface null0 type=dpdk \
        options:dpdk-devargs=eth_null0

Similarly, to add a DPDK port that uses the 'af_packet' DPDK PMD driver::

    $ ovs-vsctl add-port br0 myeth0 -- set Interface myeth0 type=dpdk \
        options:dpdk-devargs=eth_af_packet0,iface=eth0

More information on the different types of virtual DPDK PMDs can be found in
the `DPDK documentation
<http://dpdk.org/doc/guides/nics/overview.html>`__.

Note: Not all DPDK virtual PMD drivers have been tested and verified to work.

EMC Insertion Probability
-------------------------

By default, 1 in every 100 flows is inserted into the Exact Match Cache (EMC).
It is possible to change this insertion probability by setting the
``emc-insert-inv-prob`` option::

    $ ovs-vsctl --no-wait set Open_vSwitch . other_config:emc-insert-inv-prob=N

where:

``N``
  is a positive integer representing the inverse probability of insertion, i.e.
  on average 1 in every N packets with a unique flow will generate an EMC
  insertion.

If ``N`` is set to 1, an insertion will be performed for every flow. If set to
0, no insertions will be performed and the EMC will effectively be disabled.

With the default ``N`` of 100, more megaflow hits will occur initially, as can
be observed with PMD stats::

    $ ovs-appctl dpif-netdev/pmd-stats-show

For certain traffic profiles with many parallel flows, it's recommended to set
``N`` to ``0`` to achieve higher forwarding performance.

For more information on the EMC refer to :doc:`/intro/install/dpdk`.

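To make the meaning of the inverse probability concrete, the following Python
sketch computes the insertion probability and the expected number of
insertions for a given ``N``. It is an illustration of the arithmetic only and
does not model how OVS implements the EMC; both function names are
hypothetical:

```python
def emc_insertion_probability(n):
    """Probability that a packet with a unique flow triggers an EMC insertion.

    n is the emc-insert-inv-prob value; 0 means insertions are disabled.
    """
    if n < 0:
        raise ValueError("emc-insert-inv-prob must not be negative")
    return 0.0 if n == 0 else 1.0 / n


def expected_emc_insertions(unique_flow_packets, n):
    """Average number of insertions for the given count of unique-flow packets."""
    return 0.0 if n == 0 else unique_flow_packets / n


for n in (0, 1, 100):
    print(n, emc_insertion_probability(n), expected_emc_insertions(10_000, n))
```

With the default of 100, 10,000 packets with unique flows would generate about
100 insertions on average, which is why megaflow hits dominate initially.
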
.. _dpdk-ovs-in-guest:

OVS with DPDK Inside VMs
------------------------

Additional configuration is required if you want to run ovs-vswitchd with the
DPDK backend inside a QEMU virtual machine. ovs-vswitchd creates separate DPDK
TX queues for each CPU core available. This operation fails inside a QEMU
virtual machine because, by default, the VirtIO NIC provided to the guest is
configured to support only a single TX queue and a single RX queue. To change
this behavior, you need to turn on the ``mq`` (multiqueue) property of all
``virtio-net-pci`` devices emulated by QEMU and used by DPDK. You may do it
manually (by changing the QEMU command line) or, if you use Libvirt, by adding
the following string to the ``<interface>`` sections of all network devices
used by DPDK::

    <driver name='vhost' queues='N'/>

where:

``N``
  determines how many queues can be used by the guest.

This requires QEMU >= 2.2.

.. _dpdk-phy-phy:

PHY-PHY
-------

Add a userspace bridge and two ``dpdk`` (PHY) ports::

    # Add userspace bridge
    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

    # Add two dpdk ports
    $ ovs-vsctl add-port br0 phy0 -- set Interface phy0 type=dpdk \
          options:dpdk-devargs=0000:01:00.0 ofport_request=1

    $ ovs-vsctl add-port br0 phy1 -- set Interface phy1 type=dpdk \
          options:dpdk-devargs=0000:01:00.1 ofport_request=2

Add test flows to forward packets between DPDK port 0 and port 1::

    # Clear current flows
    $ ovs-ofctl del-flows br0

    # Add flows between port 1 (phy0) and port 2 (phy1)
    $ ovs-ofctl add-flow br0 in_port=1,action=output:2
    $ ovs-ofctl add-flow br0 in_port=2,action=output:1

Transmit traffic into either port. You should see it returned via the other.

.. _dpdk-vhost-loopback:

PHY-VM-PHY (vHost Loopback)
---------------------------

Add a userspace bridge, two ``dpdk`` (PHY) ports, and two ``dpdkvhostuser``
ports::

    # Add userspace bridge
    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

    # Add two dpdk ports
    $ ovs-vsctl add-port br0 phy0 -- set Interface phy0 type=dpdk \
          options:dpdk-devargs=0000:01:00.0 ofport_request=1

    $ ovs-vsctl add-port br0 phy1 -- set Interface phy1 type=dpdk \
          options:dpdk-devargs=0000:01:00.1 ofport_request=2

    # Add two dpdkvhostuser ports
    $ ovs-vsctl add-port br0 dpdkvhostuser0 \
        -- set Interface dpdkvhostuser0 type=dpdkvhostuser ofport_request=3
    $ ovs-vsctl add-port br0 dpdkvhostuser1 \
        -- set Interface dpdkvhostuser1 type=dpdkvhostuser ofport_request=4

Add test flows to forward packets between DPDK devices and VM ports::

    # Clear current flows
    $ ovs-ofctl del-flows br0

    # Add flows
    $ ovs-ofctl add-flow br0 in_port=1,action=output:3
    $ ovs-ofctl add-flow br0 in_port=3,action=output:1
    $ ovs-ofctl add-flow br0 in_port=4,action=output:2
    $ ovs-ofctl add-flow br0 in_port=2,action=output:4

    # Dump flows
    $ ovs-ofctl dump-flows br0

Create a VM using the following configuration:

+----------------------+---------+--------------+
| configuration        | values  | comments     |
+======================+=========+==============+
| qemu version         | 2.2.0   | n/a          |
+----------------------+---------+--------------+
| qemu thread affinity | core 5  | taskset 0x20 |
+----------------------+---------+--------------+
| memory               | 4GB     | n/a          |
+----------------------+---------+--------------+
| cores                | 2       | n/a          |
+----------------------+---------+--------------+
| Qcow2 image          | CentOS7 | n/a          |
+----------------------+---------+--------------+
| mrg_rxbuf            | off     | n/a          |
+----------------------+---------+--------------+

You can do this directly with QEMU via the ``qemu-system-x86_64`` application::

    $ export VM_NAME=vhost-vm
    $ export GUEST_MEM=3072M
    $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
    $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch

    $ taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
      -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
      -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
      -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
      -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
      -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
      -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
      -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
      -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
      -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off

For an explanation of this command, along with alternative approaches such as
booting the VM via libvirt, refer to :doc:`/topics/dpdk/vhost-user`.

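The ``taskset`` value in the table and command above is a CPU bitmask, where
bit *i* selects core *i*. As a hedged illustration, the Python sketch below
decodes such a mask into core ids; the helper name ``cores_in_mask`` is
hypothetical:

```python
def cores_in_mask(mask):
    """Return the core ids whose bits are set in a CPU mask.

    `mask` may be an int or a hex string such as "0x20".
    """
    value = int(mask, 16) if isinstance(mask, str) else mask
    return [core for core in range(value.bit_length()) if (value >> core) & 1]


print(cores_in_mask("0x20"))  # the taskset mask used for the VM above
```

Decoding ``0x20`` yields core 5 only, which agrees with the thread affinity
row of the table. The same bit-per-core encoding applies to ``pmd-cpu-mask``.
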
Once the guest is configured and booted, configure DPDK packet forwarding
within the guest. To accomplish this, build the ``testpmd`` application as
described in :ref:`dpdk-testpmd`. Once compiled, run the application::

    $ cd $DPDK_DIR/app/test-pmd;
    $ ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- \
        --burst=64 -i --txqflags=0xf00 --disable-hw-vlan
    $ set fwd mac retry
    $ start

When you finish testing, bind the vNICs back to the kernel::

    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:03.0
    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:04.0

.. note::

  Valid PCI IDs must be passed in the above example. The PCI IDs can be
  retrieved like so::

      $ $DPDK_DIR/tools/dpdk-devbind.py --status

More information on the dpdkvhostuser ports can be found in
:doc:`/topics/dpdk/vhost-user`.

PHY-VM-PHY (vHost Loopback) (Kernel Forwarding)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:ref:`dpdk-vhost-loopback` details the steps for the PHY-VM-PHY loopback
testcase and packet forwarding using the DPDK testpmd application in the Guest
VM. For users wishing to do packet forwarding using the kernel stack instead,
run the commands below on the guest::

    $ ip addr add 1.1.1.2/24 dev eth1
    $ ip addr add 1.1.2.2/24 dev eth2
    $ ip link set eth1 up
    $ ip link set eth2 up
    $ systemctl stop firewalld.service
    $ systemctl stop iptables.service
    $ sysctl -w net.ipv4.ip_forward=1
    $ sysctl -w net.ipv4.conf.all.rp_filter=0
    $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
    $ sysctl -w net.ipv4.conf.eth2.rp_filter=0
    $ route add -net 1.1.2.0/24 eth2
    $ route add -net 1.1.1.0/24 eth1
    $ arp -s 1.1.2.99 DE:AD:BE:EF:CA:FE
    $ arp -s 1.1.1.99 DE:AD:BE:EF:CA:EE

PHY-VM-PHY (vHost Multiqueue)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

vHost Multiqueue functionality can also be validated using the PHY-VM-PHY
configuration. To begin, follow the steps described in :ref:`dpdk-phy-phy` to
create and initialize the database, start ovs-vswitchd and add ``dpdk``-type
devices to bridge ``br0``. Once complete, follow the below steps:

1. Configure PMD and RXQs.

   For example, set the number of dpdk port rx queues to at least 2. The
   number of rx queues at the vhost-user interface gets automatically
   configured after virtio device connection and doesn't need manual
   configuration::

       $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xc
       $ ovs-vsctl set Interface phy0 options:n_rxq=2
       $ ovs-vsctl set Interface phy1 options:n_rxq=2

2. Instantiate Guest VM using QEMU cmdline

   We must configure with appropriate software versions to ensure this feature
   is supported.

   .. list-table:: Recommended VM Settings
      :header-rows: 1

      * - Setting
        - Value
      * - QEMU version
        - 2.5.0
      * - QEMU thread affinity
        - 2 cores (taskset 0x30)
      * - Memory
        - 4 GB
      * - Cores
        - 2
      * - Distro
        - Fedora 22
      * - Multiqueue
        - Enabled

   To do this, instantiate the guest as follows::

       $ export VM_NAME=vhost-vm
       $ export GUEST_MEM=4096M
       $ export QCOW2_IMAGE=/root/Fedora22_x86_64.qcow2
       $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
       $ taskset 0x30 qemu-system-x86_64 -cpu host -smp 2,cores=2 -m 4096M \
           -drive file=$QCOW2_IMAGE --enable-kvm -name $VM_NAME \
           -nographic -numa node,memdev=mem -mem-prealloc \
           -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
           -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
           -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce,queues=2 \
           -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=6 \
           -chardev socket,id=char2,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
           -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce,queues=2 \
           -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mq=on,vectors=6

   .. note::
     The queue value above should match the queues configured in OVS. The
     vector value should be set to "number of queues x 2 + 2".

3. Configure the guest interface

   Assuming there are 2 interfaces in the guest named eth0 and eth1, check the
   channel configuration and set the number of combined channels to 2 for
   virtio devices::

       $ ethtool -l eth0
       $ ethtool -L eth0 combined 2
       $ ethtool -L eth1 combined 2

   More information can be found in the vHost walkthrough section.

4. Configure kernel packet forwarding

   Configure IP and enable interfaces::

       $ ip addr add 5.5.5.1/24 dev eth0
       $ ip addr add 90.90.90.1/24 dev eth1
       $ ip link set eth0 up
       $ ip link set eth1 up

   Configure IP forwarding and add route entries::

       $ sysctl -w net.ipv4.ip_forward=1
       $ sysctl -w net.ipv4.conf.all.rp_filter=0
       $ sysctl -w net.ipv4.conf.eth0.rp_filter=0
       $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
       $ ip route add 2.1.1.0/24 dev eth1
       $ route add default gw 2.1.1.2 eth1
       $ route add default gw 90.90.90.90 eth1
       $ arp -s 90.90.90.90 DE:AD:BE:EF:CA:FE
       $ arp -s 2.1.1.2 DE:AD:BE:EF:CA:FA

   Check traffic on multiple queues::

       $ cat /proc/interrupts | grep virtio
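
The ``vectors`` value passed to QEMU in step 2 follows the "number of queues x
2 + 2" rule from the note. A minimal Python sketch of that rule, with a
hypothetical function name:

```python
def virtio_vectors(num_queues):
    """Return the virtio-net-pci vectors value for the given queue count.

    Implements the "number of queues x 2 + 2" rule quoted in step 2.
    """
    return num_queues * 2 + 2


print(virtio_vectors(2))  # queues=2, as in the QEMU command line above
```

For ``queues=2`` this evaluates to 6, matching ``vectors=6`` on both
``virtio-net-pci`` devices.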