..
   Licensed under the Apache License, Version 2.0 (the "License"); you may
   not use this file except in compliance with the License. You may obtain
   a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
   WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
   License for the specific language governing permissions and limitations
   under the License.

   Convention for heading levels in Open vSwitch documentation:

   =======  Heading 0 (reserved for the title in a document)
   -------  Heading 1
   ~~~~~~~  Heading 2
   +++++++  Heading 3
   '''''''  Heading 4

   Avoid deeper levels because they do not render well.

============================
Using Open vSwitch with DPDK
============================

This document describes how to use Open vSwitch with the DPDK datapath.

.. important::

   Using the DPDK datapath requires building OVS with DPDK support. Refer to
   :doc:`/intro/install/dpdk` for more information.

Ports and Bridges
-----------------

ovs-vsctl can be used to set up bridges and other Open vSwitch features.
Bridges should be created with a ``datapath_type=netdev``::

    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

ovs-vsctl can also be used to add DPDK devices. OVS expects DPDK device names
to start with ``dpdk`` and end with a port id. ovs-vswitchd should print the
number of dpdk devices found in the log file::

    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

Once the DPDK ports have been added to the switch, a polling thread
continuously polls the DPDK devices and consumes 100% of its core, as can be
checked using the ``top`` and ``ps`` commands::

    $ top -H
    $ ps -eLo pid,psr,comm | grep pmd
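
The set of cores available to these PMD threads can be selected with the
``pmd-cpu-mask`` option. For example, to allow PMD threads to run on cores 1
and 2 (the mask value ``0x6`` is just an illustration; pick cores appropriate
to your system)::

    $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6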

Creating bonds of DPDK interfaces is slightly different from creating bonds of
system interfaces. For DPDK, the interface type must be explicitly set. For
example::

    $ ovs-vsctl add-bond br0 dpdkbond dpdk0 dpdk1 \
        -- set Interface dpdk0 type=dpdk \
        -- set Interface dpdk1 type=dpdk
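
The bonding mode can then be configured as for any other OVS bond; as one
illustrative option::

    $ ovs-vsctl set port dpdkbond bond_mode=balance-tcp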

To stop ovs-vswitchd and delete the bridge, run::

    $ ovs-appctl -t ovs-vswitchd exit
    $ ovs-appctl -t ovsdb-server exit
    $ ovs-vsctl del-br br0

PMD Thread Statistics
---------------------

To show current stats::

    $ ovs-appctl dpif-netdev/pmd-stats-show

To clear previous stats::

    $ ovs-appctl dpif-netdev/pmd-stats-clear

Port/RXQ Assignment to PMD Threads
----------------------------------

To show port/rxq assignment::

    $ ovs-appctl dpif-netdev/pmd-rxq-show

To change the default rxq assignment to pmd threads, rxqs may be manually
pinned to desired cores using::

    $ ovs-vsctl set Interface <iface> \
        other_config:pmd-rxq-affinity=<rxq-affinity-list>

where:

- ``<rxq-affinity-list>`` is a CSV list of ``<queue-id>:<core-id>`` values

For example::

    $ ovs-vsctl set interface dpdk0 options:n_rxq=4 \
        other_config:pmd-rxq-affinity="0:3,1:7,3:8"

This will ensure:

- Queue #0 pinned to core 3
- Queue #1 pinned to core 7
- Queue #2 not pinned
- Queue #3 pinned to core 8

After that, PMD threads on cores where RX queues are pinned will become
``isolated``, meaning those threads will poll only the pinned RX queues.

.. warning::
   If there are no ``non-isolated`` PMD threads, ``non-pinned`` RX queues will
   not be polled. Also, if the provided ``core_id`` is not available (e.g. the
   ``core_id`` is not in ``pmd-cpu-mask``), the RX queue will not be polled by
   any PMD thread.
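
The resulting assignment can be checked by re-running::

    $ ovs-appctl dpif-netdev/pmd-rxq-show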

QoS
---

Assuming you have a vhost-user port transmitting traffic consisting of packets
of size 64 bytes, the following command would limit the egress transmission
rate of the port to ~1,000,000 packets per second::

    $ ovs-vsctl set port vhost-user0 qos=@newqos -- \
        --id=@newqos create qos type=egress-policer other-config:cir=46000000 \
        other-config:cbs=2048
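
The CIR value here can be derived from the target rate: the egress policer
meters the packet excluding the 14B Ethernet header and 4B CRC, so each 64B
frame counts as 46B, and 46 B/packet x 1,000,000 packets/s gives
``cir=46000000`` bytes/s.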

To examine the QoS configuration of the port, run::

    $ ovs-appctl -t ovs-vswitchd qos/show vhost-user0

To clear the QoS configuration from the port and ovsdb, run::

    $ ovs-vsctl destroy QoS vhost-user0 -- clear Port vhost-user0 qos

Refer to vswitch.xml for more details on egress-policer.

Rate Limiting
-------------

Here is an example of ingress policing usage. Assuming you have a vhost-user
port receiving traffic consisting of packets of size 64 bytes, the following
command would limit the reception rate of the port to ~1,000,000 packets per
second::

    $ ovs-vsctl set interface vhost-user0 ingress_policing_rate=368000 \
        ingress_policing_burst=1000
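
``ingress_policing_rate`` is expressed in kbps; using the same 46B-per-64B-
frame accounting as in the QoS example above, 46 B x 8 bits x 1,000,000
packets/s = 368,000,000 bits/s, i.e. ``368000`` kbps.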

To examine the ingress policer configuration of the port::

    $ ovs-vsctl list interface vhost-user0

To clear the ingress policer configuration from the port::

    $ ovs-vsctl set interface vhost-user0 ingress_policing_rate=0

Refer to vswitch.xml for more details on ingress-policer.

Flow Control
------------

Flow control can be enabled only on DPDK physical ports. To enable flow
control support on the tx side while adding a port, run::

    $ ovs-vsctl add-port br0 dpdk0 -- \
        set Interface dpdk0 type=dpdk options:tx-flow-ctrl=true

Similarly, to enable rx flow control, run::

    $ ovs-vsctl add-port br0 dpdk0 -- \
        set Interface dpdk0 type=dpdk options:rx-flow-ctrl=true

To enable flow control auto-negotiation, run::

    $ ovs-vsctl add-port br0 dpdk0 -- \
        set Interface dpdk0 type=dpdk options:flow-ctrl-autoneg=true

To turn on tx flow control at runtime for an existing port, run::

    $ ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=true

The flow control parameters can be turned off by setting the respective
parameter to ``false``. To disable tx-side flow control, run::

    $ ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=false

pdump
-----

pdump allows you to listen on DPDK ports and view the traffic that is passing
on them. To use this utility, one must have libpcap installed on the system.
Furthermore, DPDK must be built with ``CONFIG_RTE_LIBRTE_PDUMP=y`` and
``CONFIG_RTE_LIBRTE_PMD_PCAP=y``.

.. warning::
   A performance decrease is expected when using a monitoring application like
   the DPDK pdump app.

To use pdump, simply launch OVS as usual, then navigate to the ``app/pdump``
directory in DPDK, ``make`` the application and run it like so::

    $ sudo ./build/app/dpdk-pdump -- \
        --pdump port=0,queue=0,rx-dev=/tmp/pkts.pcap \
        --server-socket-path=/usr/local/var/run/openvswitch

The above command captures traffic received on queue 0 of port 0 and stores it
in ``/tmp/pkts.pcap``. Other combinations of port numbers, queue numbers and
pcap locations can of course also be used. For example, to capture all packets
that traverse port 0 in a single pcap file::

    $ sudo ./build/app/dpdk-pdump -- \
        --pdump 'port=0,queue=*,rx-dev=/tmp/pkts.pcap,tx-dev=/tmp/pkts.pcap' \
        --server-socket-path=/usr/local/var/run/openvswitch

``server-socket-path`` must be set to the value of ``ovs_rundir()``, which
typically resolves to ``/usr/local/var/run/openvswitch``.

Many tools are available to view the contents of the pcap file. One example is
tcpdump. Issue the following command to view the contents of ``pkts.pcap``::

    $ tcpdump -r pkts.pcap

More information on the pdump app and its usage can be found in the `DPDK docs
<http://dpdk.org/doc/guides/tools/pdump.html>`__.

Jumbo Frames
------------

By default, DPDK ports are configured with standard Ethernet MTU (1500B). To
enable Jumbo Frames support for a DPDK port, change the Interface's
``mtu_request`` attribute to a sufficiently large value. For example, to add a
DPDK PHY port with an MTU of 9000::

    $ ovs-vsctl add-port br0 dpdk0 \
      -- set Interface dpdk0 type=dpdk \
      -- set Interface dpdk0 mtu_request=9000

Similarly, to change the MTU of an existing port to 6200::

    $ ovs-vsctl set Interface dpdk0 mtu_request=6200

Some additional configuration is needed to take advantage of jumbo frames with
vHost ports:

1. *mergeable buffers* must be enabled for vHost ports, as demonstrated in the
   QEMU command line snippet below::

       -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
       -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=on

2. Where virtio devices are bound to the Linux kernel driver in a guest
   environment (i.e. interfaces are not bound to an in-guest DPDK driver), the
   MTU of those logical network interfaces must also be increased to a
   sufficiently large value. This avoids segmentation of Jumbo Frames received
   in the guest. Note that 'MTU' refers to the length of the IP packet only,
   and not that of the entire frame.

   To calculate the exact MTU of a standard IPv4 frame, subtract the L2 header
   and CRC lengths (i.e. 18B) from the max supported frame size. So, to set
   the MTU for a 9018B Jumbo Frame::

       $ ifconfig eth1 mtu 9000

When Jumbo Frames are enabled, the size of a DPDK port's mbuf segments is
increased, such that a full Jumbo Frame of a specific size may be accommodated
within a single mbuf segment.

Jumbo frame support has been validated against 9728B frames, which is the
largest frame size supported by the Fortville NIC using the DPDK i40e driver,
but larger frames and other DPDK NIC drivers may be supported. These cases are
common for use cases involving East-West traffic only.

Rx Checksum Offload
-------------------

By default, DPDK physical ports are enabled with Rx checksum offload. Rx
checksum offload can be configured on a DPDK physical port either when adding
it or at runtime.

To disable Rx checksum offload when adding a DPDK port dpdk0::

    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \
      options:rx-checksum-offload=false

Similarly, to disable Rx checksum offload on an existing DPDK port dpdk0::

    $ ovs-vsctl set Interface dpdk0 type=dpdk options:rx-checksum-offload=false

Rx checksum offload can offer a performance improvement only for tunneling
traffic in OVS-DPDK, because the checksum validation of tunnel packets is
offloaded to the NIC. Enabling Rx checksum offload may also slightly reduce
the performance of non-tunnel traffic, specifically for smaller packet sizes:
DPDK vectorization is disabled when checksum offloading is configured on DPDK
physical ports, which in turn affects non-tunnel traffic performance. It is
therefore advisable to turn off Rx checksum offload for non-tunnel traffic use
cases to achieve the best performance.

.. _dpdk-ovs-in-guest:

OVS with DPDK Inside VMs
------------------------

Additional configuration is required if you want to run ovs-vswitchd with the
DPDK backend inside a QEMU virtual machine. ovs-vswitchd creates separate DPDK
TX queues for each CPU core available. This operation fails inside a QEMU
virtual machine because, by default, the VirtIO NIC provided to the guest
supports only a single TX queue and a single RX queue. To change this
behavior, you need to turn on the ``mq`` (multiqueue) property of all
``virtio-net-pci`` devices emulated by QEMU and used by DPDK. You can do this
manually (by changing the QEMU command line, as sketched below) or, if you use
Libvirt, by adding the following string to the ``<interface>`` sections of all
network devices used by DPDK::

    <driver name='vhost' queues='N'/>

where:

``N``
  determines how many queues can be used by the guest.

This requires QEMU >= 2.2.
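
If you configure QEMU manually instead, the equivalent is to add ``queues=N``
to the ``-netdev`` option and ``mq=on`` to the matching ``-device`` option. A
sketch (socket path and MAC address are placeholders; replace ``N`` with a
literal queue count and set ``vectors`` to 2N+2, as in the multiqueue example
later in this document)::

    -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
    -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce,queues=N \
    -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=2N+2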

.. _dpdk-phy-phy:

PHY-PHY
-------

Add a userspace bridge and two ``dpdk`` (PHY) ports::

    # Add userspace bridge
    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

    # Add two dpdk ports
    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

Add test flows to forward packets between DPDK port 0 and port 1::

    # Clear current flows
    $ ovs-ofctl del-flows br0

    # Add flows between port 1 (dpdk0) to port 2 (dpdk1)
    $ ovs-ofctl add-flow br0 in_port=1,action=output:2
    $ ovs-ofctl add-flow br0 in_port=2,action=output:1

Transmit traffic into either port. You should see it returned via the other.

.. _dpdk-vhost-loopback:

PHY-VM-PHY (vHost Loopback)
---------------------------

Add a userspace bridge, two ``dpdk`` (PHY) ports, and two ``dpdkvhostuser``
ports::

    # Add userspace bridge
    $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

    # Add two dpdk ports
    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

    # Add two dpdkvhostuser ports
    $ ovs-vsctl add-port br0 dpdkvhostuser0 \
        -- set Interface dpdkvhostuser0 type=dpdkvhostuser
    $ ovs-vsctl add-port br0 dpdkvhostuser1 \
        -- set Interface dpdkvhostuser1 type=dpdkvhostuser

Add test flows to forward packets between DPDK devices and VM ports::

    # Clear current flows
    $ ovs-ofctl del-flows br0

    # Add flows
    $ ovs-ofctl add-flow br0 in_port=1,action=output:3
    $ ovs-ofctl add-flow br0 in_port=3,action=output:1
    $ ovs-ofctl add-flow br0 in_port=4,action=output:2
    $ ovs-ofctl add-flow br0 in_port=2,action=output:4

    # Dump flows
    $ ovs-ofctl dump-flows br0

Create a VM using the following configuration:

+----------------------+--------+-----------------+
| configuration        | values | comments        |
+======================+========+=================+
| qemu version         | 2.2.0  | n/a             |
+----------------------+--------+-----------------+
| qemu thread affinity | core 5 | taskset 0x20    |
+----------------------+--------+-----------------+
| memory               | 4GB    | n/a             |
+----------------------+--------+-----------------+
| cores                | 2      | n/a             |
+----------------------+--------+-----------------+
| Qcow2 image          | CentOS7| n/a             |
+----------------------+--------+-----------------+
| mrg_rxbuf            | off    | n/a             |
+----------------------+--------+-----------------+

You can do this directly with QEMU via the ``qemu-system-x86_64`` application::

    $ export VM_NAME=vhost-vm
    $ export GUEST_MEM=3072M
    $ export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
    $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch

    $ taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm \
      -m $GUEST_MEM -drive file=$QCOW2_IMAGE --nographic -snapshot \
      -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
      -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
      -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
      -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
      -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
      -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
      -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
      -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off

For an explanation of this command, along with alternative approaches such as
booting the VM via libvirt, refer to :doc:`/topics/dpdk/vhost-user`.

Once the guest is configured and booted, configure DPDK packet forwarding
within the guest. To accomplish this, build the ``testpmd`` application as
described in :ref:`dpdk-testpmd`. Once compiled, run the application::

    $ cd $DPDK_DIR/app/test-pmd
    $ ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- \
        --burst=64 -i --txqflags=0xf00 --disable-hw-vlan
    testpmd> set fwd mac retry
    testpmd> start

When you finish testing, bind the vNICs back to the kernel::

    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:03.0
    $ $DPDK_DIR/tools/dpdk-devbind.py --bind=virtio-pci 0000:00:04.0

.. note::

   Valid PCI IDs must be passed in the above example. The PCI IDs can be
   retrieved like so::

       $ $DPDK_DIR/tools/dpdk-devbind.py --status

More information on the dpdkvhostuser ports can be found in
:doc:`/topics/dpdk/vhost-user`.

PHY-VM-PHY (vHost Loopback) (Kernel Forwarding)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:ref:`dpdk-vhost-loopback` details the steps for the PHY-VM-PHY loopback test
case, with packet forwarding done by the DPDK testpmd application in the guest
VM. To do packet forwarding using the kernel network stack instead, run the
following commands in the guest::

    $ ifconfig eth1 1.1.1.2/24
    $ ifconfig eth2 1.1.2.2/24
    $ systemctl stop firewalld.service
    $ systemctl stop iptables.service
    $ sysctl -w net.ipv4.ip_forward=1
    $ sysctl -w net.ipv4.conf.all.rp_filter=0
    $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
    $ sysctl -w net.ipv4.conf.eth2.rp_filter=0
    $ route add -net 1.1.2.0/24 eth2
    $ route add -net 1.1.1.0/24 eth1
    $ arp -s 1.1.2.99 DE:AD:BE:EF:CA:FE
    $ arp -s 1.1.1.99 DE:AD:BE:EF:CA:EE

PHY-VM-PHY (vHost Multiqueue)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

vHost Multiqueue functionality can also be validated using the PHY-VM-PHY
configuration. To begin, follow the steps described in :ref:`dpdk-phy-phy` to
create and initialize the database, start ovs-vswitchd and add ``dpdk``-type
devices to bridge ``br0``. Once complete, follow the below steps:

1. Configure PMD and RXQs.

   For example, set the number of dpdk port rx queues to at least 2. The
   number of rx queues at the vhost-user interface is automatically configured
   after the virtio device connection and does not need manual
   configuration::

       $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xc
       $ ovs-vsctl set Interface dpdk0 options:n_rxq=2
       $ ovs-vsctl set Interface dpdk1 options:n_rxq=2

2. Instantiate the guest VM using the QEMU command line.

   The guest must be configured with appropriate software versions to ensure
   this feature is supported.

   .. list-table:: Recommended VM Configuration
      :header-rows: 1

      * - Setting
        - Value
      * - QEMU version
        - 2.5.0
      * - QEMU thread affinity
        - 2 cores (taskset 0x30)
      * - Memory
        - 4 GB
      * - Cores
        - 2
      * - Distro
        - Fedora 22
      * - Multiqueue
        - Enabled

   To do this, instantiate the guest as follows::

       $ export VM_NAME=vhost-vm
       $ export GUEST_MEM=4096M
       $ export QCOW2_IMAGE=/root/Fedora22_x86_64.qcow2
       $ export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
       $ taskset 0x30 qemu-system-x86_64 -cpu host -smp 2,cores=2 -m 4096M \
         -drive file=$QCOW2_IMAGE --enable-kvm -name $VM_NAME \
         -nographic -numa node,memdev=mem -mem-prealloc \
         -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
         -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
         -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce,queues=2 \
         -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=6 \
         -chardev socket,id=char2,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
         -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce,queues=2 \
         -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mq=on,vectors=6

   .. note::
      The queue value above should match the number of queues configured in
      OVS. The vector value should be set to "number of queues x 2 + 2".

3. Configure the guest interface.

   Assuming there are 2 interfaces in the guest named eth0 and eth1, check the
   channel configuration and set the number of combined channels to 2 for the
   virtio devices::

       $ ethtool -l eth0
       $ ethtool -L eth0 combined 2
       $ ethtool -L eth1 combined 2

   More information can be found in :doc:`/topics/dpdk/vhost-user`.

4. Configure kernel packet forwarding.

   Configure IP addresses and enable the interfaces::

       $ ifconfig eth0 5.5.5.1/24 up
       $ ifconfig eth1 90.90.90.1/24 up

   Configure IP forwarding and add route entries::

       $ sysctl -w net.ipv4.ip_forward=1
       $ sysctl -w net.ipv4.conf.all.rp_filter=0
       $ sysctl -w net.ipv4.conf.eth0.rp_filter=0
       $ sysctl -w net.ipv4.conf.eth1.rp_filter=0
       $ ip route add 2.1.1.0/24 dev eth1
       $ route add default gw 2.1.1.2 eth1
       $ route add default gw 90.90.90.90 eth1
       $ arp -s 90.90.90.90 DE:AD:BE:EF:CA:FE
       $ arp -s 2.1.1.2 DE:AD:BE:EF:CA:FA

   Check traffic on multiple queues::

       $ cat /proc/interrupts | grep virtio