Commit | Line | Data |
---|---|---|
1 | .. BSD LICENSE |
2 | Copyright(c) 2016 Red Hat, Inc. All rights reserved. | |
3 | All rights reserved. | |
4 | ||
5 | Redistribution and use in source and binary forms, with or without | |
6 | modification, are permitted provided that the following conditions | |
7 | are met: | |
8 | ||
9 | * Redistributions of source code must retain the above copyright | |
10 | notice, this list of conditions and the following disclaimer. | |
11 | * Redistributions in binary form must reproduce the above copyright | |
12 | notice, this list of conditions and the following disclaimer in | |
13 | the documentation and/or other materials provided with the | |
14 | distribution. | |
15 | * Neither the name of Intel Corporation nor the names of its | |
16 | contributors may be used to endorse or promote products derived | |
17 | from this software without specific prior written permission. | |
18 | ||
19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS | |
20 | "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT | |
21 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR | |
22 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT | |
23 | OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, | |
24 | SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT | |
25 | LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, | |
26 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY | |
27 | THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | |
28 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | |
29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | |
30 | ||
31 | ||
32 | PVP reference benchmark setup using testpmd | |
33 | =========================================== | |
34 | ||
35 | This guide lists the steps required to set up a PVP benchmark using testpmd as | |
36 | a simple forwarder between NICs and Vhost interfaces. The goal of this setup | |
37 | is to have a reference PVP benchmark without using external vSwitches (OVS, | |
38 | VPP, ...) to make it easier to obtain reproducible results and to facilitate | |
39 | continuous integration testing. | |
40 | ||
41 | The guide covers two ways of launching the VM, either by directly calling the | |
42 | QEMU command line, or by relying on libvirt. It has been tested with DPDK | |
43 | v16.11 using RHEL7 for both host and guest. | |
44 | ||
45 | ||
46 | Setup overview | |
47 | -------------- | |
48 | ||
49 | .. _figure_pvp_2nics: | |
50 | ||
51 | .. figure:: img/pvp_2nics.* | |
52 | ||
53 | PVP setup using 2 NICs | |
54 | ||
55 | In this diagram, each red arrow represents one logical core. This use-case | |
56 | requires 6 dedicated logical cores. A forwarding configuration with a single | |
57 | NIC is also possible, requiring 3 logical cores. | |
58 | ||
59 | ||
60 | Host setup | |
61 | ---------- | |
62 | ||
63 | In this setup, we isolate 6 cores (from CPU2 to CPU7) on the same NUMA | |
64 | node. Two cores are assigned to the VM vCPUs running testpmd and four are | |
65 | assigned to testpmd on the host. | |
66 | ||
67 | ||
68 | Host tuning | |
69 | ~~~~~~~~~~~ | |
70 | ||
71 | #. In the BIOS, disable Turbo Boost and Hyper-Threading. | |
72 | ||
73 | #. Append these options to the kernel command line: | |
74 | ||
75 | .. code-block:: console | |
76 | ||
77 | intel_pstate=disable mce=ignore_ce default_hugepagesz=1G hugepagesz=1G hugepages=6 isolcpus=2-7 rcu_nocbs=2-7 nohz_full=2-7 iommu=pt intel_iommu=on | |
78 | ||
79 | #. Disable hyper-threads at runtime if necessary or if BIOS is not accessible: | |
80 | ||
81 | .. code-block:: console | |
82 | ||
83 | cat /sys/devices/system/cpu/cpu*[0-9]/topology/thread_siblings_list \ | |
84 | | sort | uniq \ | |
85 | | awk -F, '{system("echo 0 > /sys/devices/system/cpu/cpu"$2"/online")}' | |
86 | ||
87 | #. Disable NMIs: | |
88 | ||
89 | .. code-block:: console | |
90 | ||
91 | echo 0 > /proc/sys/kernel/nmi_watchdog | |
92 | ||
93 | #. Exclude isolated CPUs from the writeback cpumask: | |
94 | ||
95 | .. code-block:: console | |
96 | ||
97 | echo ffffff03 > /sys/bus/workqueue/devices/writeback/cpumask | |
98 | ||
99 | #. Isolate CPUs from IRQs: | |
100 | ||
101 | .. code-block:: console | |
102 | ||
103 | clear_mask=0xfc #Isolate CPU2 to CPU7 from IRQs | |
104 | for i in /proc/irq/*/smp_affinity | |
105 | do | |
106 | echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i | |
107 | done | |
108 | ||
109 | ||
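The two masks used in the tuning steps above both follow from the isolated CPU range 2-7: the IRQ ``clear_mask`` has one bit set per isolated CPU, and the writeback cpumask is its 32-bit complement. The following is a minimal sketch of that arithmetic only, assuming a host with at most 32 CPUs. The helper name ``cpus_to_mask`` is hypothetical and not part of DPDK or any system tool.

```shell
# Hypothetical helper: turn an inclusive CPU range "first-last"
# (or a single CPU number) into a hexadecimal bitmask.
cpus_to_mask() {
    cpu=${1%-*}      # text before the dash
    last=${1#*-}     # text after the dash
    mask=0
    while [ "$cpu" -le "$last" ]; do
        mask=$(( mask | (1 << cpu) ))
        cpu=$(( cpu + 1 ))
    done
    printf '%x\n' "$mask"
}

# Bits 2..7 set: the value used as clear_mask in the IRQ affinity loop.
irq_clear_mask=$(cpus_to_mask 2-7)

# 32-bit complement: the CPUs still allowed to run writeback work.
writeback_mask=$(printf '%x' $(( ~0x$irq_clear_mask & 0xffffffff )))

echo "clear_mask=0x$irq_clear_mask writeback=$writeback_mask"
```

Changing the range passed to ``cpus_to_mask`` gives the matching masks for a different set of isolated CPUs; the values must stay consistent with ``isolcpus``, ``rcu_nocbs`` and ``nohz_full`` on the kernel command line.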
110 | Qemu build | |
111 | ~~~~~~~~~~ | |
112 | ||
113 | Build Qemu: | |
114 | ||
115 | .. code-block:: console | |
116 | ||
117 | git clone git://git.qemu.org/qemu.git | |
118 | cd qemu | |
119 | mkdir bin | |
120 | cd bin | |
121 | ../configure --target-list=x86_64-softmmu | |
122 | make | |
123 | ||
124 | ||
125 | DPDK build | |
126 | ~~~~~~~~~~ | |
127 | ||
128 | Build DPDK: | |
129 | ||
130 | .. code-block:: console | |
131 | ||
132 | git clone git://dpdk.org/dpdk | |
133 | cd dpdk | |
134 | export RTE_SDK=$PWD | |
135 | make install T=x86_64-native-linux-gcc DESTDIR=install | |
136 | ||
137 | ||
138 | Testpmd launch | |
139 | ~~~~~~~~~~~~~~ | |
140 | ||
141 | #. Assign NICs to DPDK: | |
142 | ||
143 | .. code-block:: console | |
144 | ||
145 | modprobe vfio-pci | |
146 | $RTE_SDK/install/sbin/dpdk-devbind -b vfio-pci 0000:11:00.0 0000:11:00.1 | |
147 | ||
148 | .. Note:: | |
149 | ||
150 | The Sandy Bridge family seems to have some IOMMU limitations giving poor | |
151 | performance results. To achieve good performance on these machines | |
152 | consider using UIO instead. | |
153 | ||
154 | #. Launch the testpmd application: | |
155 | ||
156 | .. code-block:: console | |
157 | ||
158 | $RTE_SDK/install/bin/testpmd -l 0,2,3,4,5 --socket-mem=1024 -n 4 \ | |
159 | --vdev 'net_vhost0,iface=/tmp/vhost-user1' \ | |
160 | --vdev 'net_vhost1,iface=/tmp/vhost-user2' -- \ | |
161 | --portmask=f -i --rxq=1 --txq=1 \ | |
162 | --nb-cores=4 --forward-mode=io | |
163 | ||
164 | With this command, isolated CPUs 2 to 5 will be used as lcores for PMD threads. | |
165 | ||
166 | #. In testpmd interactive mode, set the portlist to obtain the correct port | |
167 | chaining: | |
168 | ||
169 | .. code-block:: console | |
170 | ||
171 | set portlist 0,2,1,3 | |
172 | start | |
173 | ||
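The core and port accounting behind the host testpmd command can be made explicit. The first lcore given to ``-l`` (CPU0) runs the testpmd main thread, so ``--nb-cores`` is the number of remaining lcores, and ``--portmask=f`` enables four ports. The sketch below only redoes that arithmetic and prints consecutive portlist entries as pairs. It assumes testpmd's default paired port topology, and it assumes the two NICs are probed as ports 0 and 1 and the two vhost devices as ports 2 and 3; check the actual numbering with ``show port info all``. All helper names are hypothetical.

```shell
lcores="0,2,3,4,5"   # value passed to -l
portmask=0xf         # value passed to --portmask
portlist="0,2,1,3"   # value passed to "set portlist"

# Hypothetical helper: count comma-separated entries.
count_items() {
    old_ifs=$IFS; IFS=,
    set -- $1
    IFS=$old_ifs
    echo $#
}

# Hypothetical helper: count the bits set in a mask.
count_bits() {
    v=$(( $1 )); n=0
    while [ "$v" -ne 0 ]; do
        n=$(( n + (v & 1) ))
        v=$(( v >> 1 ))
    done
    echo "$n"
}

# Hypothetical helper: print consecutive portlist entries as pairs.
port_pairs() {
    old_ifs=$IFS; IFS=,
    set -- $1
    IFS=$old_ifs
    while [ $# -ge 2 ]; do
        echo "$1 <-> $2"
        shift 2
    done
}

# One lcore is reserved for the main thread; the rest can forward.
fwd_cores=$(( $(count_items "$lcores") - 1 ))
echo "forwarding lcores: $fwd_cores"
echo "enabled ports: $(count_bits "$portmask")"
port_pairs "$portlist"
```

Under those assumptions each NIC port is paired with one vhost port, which is the chaining the ``set portlist 0,2,1,3`` step is meant to produce.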
174 | ||
175 | VM launch | |
176 | ~~~~~~~~~ | |
177 | ||
178 | The VM may be launched either by calling QEMU directly, or by using libvirt. | |
179 | ||
180 | Qemu way | |
181 | ^^^^^^^^ | |
182 | ||
183 | Launch QEMU with two Virtio-net devices paired to the vhost-user sockets | |
184 | created by testpmd. The example below uses the default Virtio-net options, but | |
185 | other options may be specified, for example to disable mergeable buffers or | |
186 | indirect descriptors. | |
187 | ||
188 | .. code-block:: console | |
189 | ||
190 | <QEMU path>/bin/x86_64-softmmu/qemu-system-x86_64 \ | |
191 | -enable-kvm -cpu host -m 3072 -smp 3 \ | |
192 | -chardev socket,id=char0,path=/tmp/vhost-user1 \ | |
193 | -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \ | |
194 | -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:01,addr=0x10 \ | |
195 | -chardev socket,id=char1,path=/tmp/vhost-user2 \ | |
196 | -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \ | |
197 | -device virtio-net-pci,netdev=mynet2,mac=52:54:00:02:d9:02,addr=0x11 \ | |
198 | -object memory-backend-file,id=mem,size=3072M,mem-path=/dev/hugepages,share=on \ | |
199 | -numa node,memdev=mem -mem-prealloc \ | |
200 | -net user,hostfwd=tcp::1002$1-:22 -net nic \ | |
201 | -qmp unix:/tmp/qmp.socket,server,nowait \ | |
202 | -monitor stdio <vm_image>.qcow2 | |
203 | ||
204 | You can use this `qmp-vcpu-pin <https://patchwork.kernel.org/patch/9361617/>`_ | |
205 | script to pin vCPUs. | |
206 | ||
207 | It can be used as follows, for example to pin 3 vCPUs to CPUs 1, 6 and 7, | |
208 | where isolated CPUs 6 and 7 will be used as lcores for Virtio PMDs: | |
209 | ||
210 | .. code-block:: console | |
211 | ||
212 | export PYTHONPATH=$PYTHONPATH:<QEMU path>/scripts/qmp | |
213 | ./qmp-vcpu-pin -s /tmp/qmp.socket 1 6 7 | |
214 | ||
215 | Libvirt way | |
216 | ^^^^^^^^^^^ | |
217 | ||
218 | Some initial steps are required for libvirt to be able to connect to testpmd's | |
219 | sockets. | |
220 | ||
221 | First, the SELinux policy needs to be set to permissive, since testpmd is | |
222 | generally run as root (note that a reboot is required): | |
223 | ||
224 | .. code-block:: console | |
225 | ||
226 | cat /etc/selinux/config | |
227 | ||
228 | # This file controls the state of SELinux on the system. | |
229 | # SELINUX= can take one of these three values: | |
230 | # enforcing - SELinux security policy is enforced. | |
231 | # permissive - SELinux prints warnings instead of enforcing. | |
232 | # disabled - No SELinux policy is loaded. | |
233 | SELINUX=permissive | |
234 | ||
235 | # SELINUXTYPE= can take one of three two values: | |
236 | # targeted - Targeted processes are protected, | |
237 | # minimum - Modification of targeted policy. | |
238 | # Only selected processes are protected. | |
239 | # mls - Multi Level Security protection. | |
240 | SELINUXTYPE=targeted | |
241 | ||
242 | ||
243 | Also, Qemu needs to be run as root, which has to be specified in | |
244 | ``/etc/libvirt/qemu.conf``: | |
245 | ||
246 | .. code-block:: console | |
247 | ||
248 | user = "root" | |
249 | ||
250 | Once the domain is created, the following snippet is an extract of the most | |
251 | important information (hugepages, vCPU pinning, Virtio PCI devices): | |
252 | ||
253 | .. code-block:: xml | |
254 | ||
255 | <domain type='kvm'> | |
256 | <memory unit='KiB'>3145728</memory> | |
257 | <currentMemory unit='KiB'>3145728</currentMemory> | |
258 | <memoryBacking> | |
259 | <hugepages> | |
260 | <page size='1048576' unit='KiB' nodeset='0'/> | |
261 | </hugepages> | |
262 | <locked/> | |
263 | </memoryBacking> | |
264 | <vcpu placement='static'>3</vcpu> | |
265 | <cputune> | |
266 | <vcpupin vcpu='0' cpuset='1'/> | |
267 | <vcpupin vcpu='1' cpuset='6'/> | |
268 | <vcpupin vcpu='2' cpuset='7'/> | |
269 | <emulatorpin cpuset='0'/> | |
270 | </cputune> | |
271 | <numatune> | |
272 | <memory mode='strict' nodeset='0'/> | |
273 | </numatune> | |
274 | <os> | |
275 | <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type> | |
276 | <boot dev='hd'/> | |
277 | </os> | |
278 | <cpu mode='host-passthrough'> | |
279 | <topology sockets='1' cores='3' threads='1'/> | |
280 | <numa> | |
281 | <cell id='0' cpus='0-2' memory='3145728' unit='KiB' memAccess='shared'/> | |
282 | </numa> | |
283 | </cpu> | |
284 | <devices> | |
285 | <interface type='vhostuser'> | |
286 | <mac address='56:48:4f:53:54:01'/> | |
287 | <source type='unix' path='/tmp/vhost-user1' mode='client'/> | |
288 | <model type='virtio'/> | |
289 | <driver name='vhost' rx_queue_size='256' /> | |
290 | <address type='pci' domain='0x0000' bus='0x00' slot='0x10' function='0x0'/> | |
291 | </interface> | |
292 | <interface type='vhostuser'> | |
293 | <mac address='56:48:4f:53:54:02'/> | |
294 | <source type='unix' path='/tmp/vhost-user2' mode='client'/> | |
295 | <model type='virtio'/> | |
296 | <driver name='vhost' rx_queue_size='256' /> | |
297 | <address type='pci' domain='0x0000' bus='0x00' slot='0x11' function='0x0'/> | |
298 | </interface> | |
299 | </devices> | |
300 | </domain> | |
301 | ||
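The memory figures in the QEMU command line and in the libvirt domain describe the same guest: 3072 MiB is 3145728 KiB, which is three 1 GiB hugepages of 1048576 KiB each. The sketch below redoes that arithmetic and adds the host testpmd ``--socket-mem=1024`` to show how many of the ``hugepages=6`` reserved on the host kernel command line are consumed. It is only a sanity check of the values quoted in this guide. The variable names are illustrative, and it assumes all six pages are available on the NUMA node in use.

```shell
guest_mem_mib=3072          # QEMU "-m 3072" and memory-backend-file size=3072M
hugepage_kib=1048576        # libvirt <page size='1048576' unit='KiB'>
host_testpmd_mem_mib=1024   # host testpmd --socket-mem=1024
reserved_pages=6            # host kernel command line hugepages=6

# Guest memory in KiB, as written in the libvirt <memory> element.
guest_mem_kib=$(( guest_mem_mib * 1024 ))

# Number of 1 GiB hugepages backing the guest.
guest_pages=$(( guest_mem_kib / hugepage_kib ))

# Host testpmd memory, rounded up to whole hugepages.
testpmd_pages=$(( (host_testpmd_mem_mib * 1024 + hugepage_kib - 1) / hugepage_kib ))

# Pages left over after the guest and host testpmd take theirs.
free_pages=$(( reserved_pages - guest_pages - testpmd_pages ))

echo "guest: $guest_mem_kib KiB = $guest_pages hugepages"
echo "host testpmd: $testpmd_pages hugepage(s), spare: $free_pages"
```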
302 | ||
303 | Guest setup | |
304 | ----------- | |
305 | ||
306 | ||
307 | Guest tuning | |
308 | ~~~~~~~~~~~~ | |
309 | ||
310 | #. Append these options to the Kernel command line: | |
311 | ||
312 | .. code-block:: console | |
313 | ||
314 | default_hugepagesz=1G hugepagesz=1G hugepages=1 intel_iommu=on iommu=pt isolcpus=1,2 rcu_nocbs=1,2 nohz_full=1,2 | |
315 | ||
316 | #. Disable NMIs: | |
317 | ||
318 | .. code-block:: console | |
319 | ||
320 | echo 0 > /proc/sys/kernel/nmi_watchdog | |
321 | ||
322 | #. Exclude isolated CPU1 and CPU2 from the writeback cpumask: | |
323 | ||
324 | .. code-block:: console | |
325 | ||
326 | echo 1 > /sys/bus/workqueue/devices/writeback/cpumask | |
327 | ||
328 | #. Isolate CPUs from IRQs: | |
329 | ||
330 | .. code-block:: console | |
331 | ||
332 | clear_mask=0x6 #Isolate CPU1 and CPU2 from IRQs | |
333 | for i in /proc/irq/*/smp_affinity | |
334 | do | |
335 | echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i | |
336 | done | |
337 | ||
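To see what the IRQ loop writes, the same expression can be evaluated on sample affinity values without touching ``/proc``. This is a dry-run sketch: ``masked_affinity`` is a hypothetical helper, the sample values are made up for illustration, and ``printf`` is used here in place of ``bc`` for the hexadecimal conversion. It assumes a single hexadecimal word per ``smp_affinity`` file; systems with more than 32 CPUs report comma-separated words, which this expression does not handle.

```shell
clear_mask=0x6   # CPU1 and CPU2, as in the guest tuning step

# Hypothetical helper: apply the loop's expression to one affinity value.
masked_affinity() {
    printf '%x\n' $(( 0x$1 & ~$clear_mask ))
}

# Made-up sample affinities, for illustration only.
for sample in f 7 6 1; do
    echo "$sample -> $(masked_affinity "$sample")"
done
```

A result of 0 means the IRQ was only allowed on isolated CPUs. Such a write would likely be rejected by the kernel, so these IRQs may need to be moved by hand.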
338 | ||
339 | DPDK build | |
340 | ~~~~~~~~~~ | |
341 | ||
342 | Build DPDK: | |
343 | ||
344 | .. code-block:: console | |
345 | ||
346 | git clone git://dpdk.org/dpdk | |
347 | cd dpdk | |
348 | export RTE_SDK=$PWD | |
349 | make install T=x86_64-native-linux-gcc DESTDIR=install | |
350 | ||
351 | ||
352 | Testpmd launch | |
353 | ~~~~~~~~~~~~~~ | |
354 | ||
355 | Load the vfio module in no-IOMMU mode: | |
356 | ||
357 | .. code-block:: console | |
358 | ||
359 | modprobe -r vfio_iommu_type1 | |
360 | modprobe -r vfio | |
361 | modprobe vfio enable_unsafe_noiommu_mode=1 | |
362 | cat /sys/module/vfio/parameters/enable_unsafe_noiommu_mode | |
363 | modprobe vfio-pci | |
364 | ||
365 | Bind the virtio-net devices to DPDK: | |
366 | ||
367 | .. code-block:: console | |
368 | ||
369 | $RTE_SDK/usertools/dpdk-devbind.py -b vfio-pci 0000:00:10.0 0000:00:11.0 | |
370 | ||
371 | Start testpmd: | |
372 | ||
373 | .. code-block:: console | |
374 | ||
375 | $RTE_SDK/install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 \ | |
376 | --proc-type auto --file-prefix pg -- \ | |
377 | --portmask=3 --forward-mode=macswap --port-topology=chained \ | |
378 | --disable-rss -i --rxq=1 --txq=1 \ | |
379 | --rxd=256 --txd=256 --nb-cores=2 --auto-start | |
380 | ||
381 | Results template | |
382 | ---------------- | |
383 | ||
384 | The template below should be used when sharing results: | |
385 | ||
386 | .. code-block:: none | |
387 | ||
388 | Traffic Generator: <Test equipment (e.g. IXIA, Moongen, ...)> | |
389 | Acceptable Loss: <n>% | |
390 | Validation run time: <n>min | |
391 | Host DPDK version/commit: <version, SHA-1> | |
392 | Guest DPDK version/commit: <version, SHA-1> | |
393 | Patches applied: <link to patchwork> | |
394 | QEMU version/commit: <version> | |
395 | Virtio features: <features (e.g. mrg_rxbuf='off', leave empty if default)> | |
396 | CPU: <CPU model>, <CPU frequency> | |
397 | NIC: <NIC model> | |
398 | Result: <n> Mpps |
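
Some fields of the template can be collected from the machine rather than typed by hand. The sketch below fills in two of them, assuming ``RTE_SDK`` points at the DPDK git checkout as in the build steps and that ``/proc/cpuinfo`` exposes a ``model name`` line. The function names are hypothetical, and any field that cannot be determined is printed as ``<unknown>`` rather than guessed.

```shell
# Hypothetical helper: commit of the DPDK tree in $RTE_SDK (or the current dir).
dpdk_commit() {
    git -C "${RTE_SDK:-.}" rev-parse HEAD 2>/dev/null || echo "<unknown>"
}

# Hypothetical helper: first CPU model string reported by the kernel.
cpu_model() {
    model=$(sed -n 's/^model name[[:space:]]*: //p' /proc/cpuinfo 2>/dev/null | head -n 1)
    echo "${model:-<unknown>}"
}

# Print the two template lines that can be filled in automatically.
results_header() {
    echo "Host DPDK version/commit: $(dpdk_commit)"
    echo "CPU: $(cpu_model)"
}

results_header
```

The remaining fields (traffic generator, acceptable loss, result) depend on the test run and still have to be filled in manually.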