..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

Intel Virtual Function Driver
=============================

Supported Intel® Ethernet Controllers (see the *DPDK Release Notes* for details)
support the following modes of operation in a virtualized environment:

*   **SR-IOV mode**: Involves direct assignment of part of the port resources to different guest operating systems
    using the PCI-SIG Single Root I/O Virtualization (SR-IOV) standard,
    also known as "native mode" or "pass-through" mode.
    In this chapter, this mode is referred to as IOV mode.

*   **VMDq mode**: Involves central management of the networking resources by an IO Virtual Machine (IOVM) or
    a Virtual Machine Monitor (VMM), also known as software switch acceleration mode.
    In this chapter, this mode is referred to as the Next Generation VMDq mode.

SR-IOV Mode Utilization in a DPDK Environment
---------------------------------------------

The DPDK uses the SR-IOV feature for hardware-based I/O sharing in IOV mode.
Therefore, it is possible to logically partition the NIC resources of an SR-IOV capable Ethernet controller and
expose them to a virtual machine as a separate PCI function called a "Virtual Function".
Refer to :numref:`figure_single_port_nic`.

In this way, a NIC is logically distributed among multiple virtual machines (as shown in :numref:`figure_single_port_nic`),
while still having global data in common to share with the Physical Function and other Virtual Functions.
The DPDK fm10kvf, i40evf, igbvf and ixgbevf Poll Mode Drivers (PMDs) serve the virtual PCI functions of the
Intel® 82576 Gigabit Ethernet Controller, the Intel® Ethernet Controller I350 family,
the Intel® 82599 10 Gigabit Ethernet Controller and the Intel® Fortville 10/40 Gigabit Ethernet Controller NICs,
as well as the PCIe host-interface of the Intel Ethernet Switch FM10000 Series.
The DPDK Poll Mode Driver (PMD) also supports the "Physical Function" of such NICs on the host.

The DPDK PF/VF Poll Mode Driver (PMD) supports the Layer 2 switch on the Intel® 82576 Gigabit Ethernet Controller,
Intel® Ethernet Controller I350 family, Intel® 82599 10 Gigabit Ethernet Controller,
and Intel® Fortville 10/40 Gigabit Ethernet Controller NICs, so that a guest can choose it for inter virtual machine traffic in SR-IOV mode.

For more detail on SR-IOV, please refer to the following documents:

*   `SR-IOV provides hardware based I/O sharing <http://www.intel.com/network/connectivity/solutions/vmdc.htm>`_

*   `PCI-SIG-Single Root I/O Virtualization Support on IA
    <http://www.intel.com/content/www/us/en/pci-express/pci-sig-single-root-io-virtualization-support-in-virtualization-technology-for-connectivity-paper.html>`_

*   `Scalable I/O Virtualized Servers <http://www.intel.com/content/www/us/en/virtualization/server-virtualization/scalable-i-o-virtualized-servers-paper.html>`_

.. _figure_single_port_nic:

.. figure:: img/single_port_nic.*

   Virtualization for a Single Port NIC in SR-IOV Mode


Physical and Virtual Function Infrastructure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following describes the Physical Function and Virtual Functions infrastructure for the supported Ethernet Controller NICs.

Virtual Functions operate under the respective Physical Function on the same NIC port and therefore have no access
to the global NIC resources that are shared between other functions for the same NIC port.

A Virtual Function has basic access to the queue resources and control structures of the queues assigned to it.
For global resource access, a Virtual Function has to send a request to the Physical Function for that port,
and the Physical Function operates on the global resources on behalf of the Virtual Function.
For this out-of-band communication, an SR-IOV enabled NIC provides a memory buffer for each Virtual Function,
which is called a "Mailbox".

Intel® Ethernet Adaptive Virtual Function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Adaptive Virtual Function (IAVF) is an SR-IOV Virtual Function with the same device id (8086:1889) across different Intel Ethernet Controllers.
The IAVF driver is a VF driver that supports all future Intel devices without requiring a VM update.
Since it is an adaptive VF driver, every new release of the VF driver can add more advanced features that can be turned on in the VM,
in a device-agnostic way, provided the underlying HW device supports those advanced features, without ever compromising the base functionality.
IAVF provides a generic hardware interface, and the interface between the IAVF driver and a compliant PF driver is specified.

Intel products starting from the Ethernet Controller 700 Series support the Adaptive Virtual Function.

Virtual Functions are generated in the usual way, and the resources assigned to a VF depend on the NIC infrastructure.
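
For example, when the Physical Function is bound to a kernel driver, VFs can typically be created through the
standard Linux sysfs interface, as sketched below (the PCI address ``0000:3b:00.0`` is a placeholder for your
own device):

.. code-block:: console

    echo 2 > /sys/bus/pci/devices/0000\:3b\:00.0/sriov_numvfs (To create two VFs on this device)
    lspci | grep "Virtual Function" (To verify that the new VFs are visible)
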

For more detail on SR-IOV, please refer to the following documents:

*   `Intel® IAVF HAS <https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ethernet-adaptive-virtual-function-hardware-spec.pdf>`_

.. note::

    To use the DPDK IAVF PMD on the Intel® 700 Series Ethernet Controller, the device id (0x1889) needs to be specified during device
    assignment in the hypervisor. Taking qemu as an example, the device assignment should carry the IAVF device id (0x1889), such as
    ``-device vfio-pci,x-pci-device-id=0x1889,host=03:0a.0``.

The PCIE host-interface of Intel Ethernet Switch FM10000 Series VF infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, the programmer can enable a maximum of *64 Virtual Functions (VF)*
globally per PCIE host-interface of the Intel Ethernet Switch FM10000 Series device.
Each VF can have a maximum of 16 queue pairs.
The Physical Function in the host can only be configured by the Linux* fm10k driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]); the DPDK PMD PF driver doesn't support it yet.

For example,

*   Using Linux* fm10k driver:

    .. code-block:: console

        rmmod fm10k (To remove the fm10k module)
        insmod fm10k.ko max_vfs=2,2 (To enable two Virtual Functions per port)

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Intel® X710/XL710 Gigabit Ethernet Controller VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, the programmer can enable a maximum of *128 Virtual Functions (VF)*
globally per Intel® X710/XL710 Gigabit Ethernet Controller NIC device.
The number of queue pairs per VF can be configured by ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF`` in the ``config`` file.
The Physical Function in the host can be configured either by the Linux* i40e driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]) or by the DPDK PMD PF driver.
When using both DPDK PMD PF/VF drivers, the whole NIC will be taken over by the DPDK based application.

For example,

*   Using Linux* i40e driver:

    .. code-block:: console

        rmmod i40e (To remove the i40e module)
        insmod i40e.ko max_vfs=2,2 (To enable two Virtual Functions per port)

*   Using the DPDK PMD PF i40e driver:

    Kernel Params: iommu=pt, intel_iommu=on

    .. code-block:: console

        modprobe uio
        insmod igb_uio
        ./dpdk-devbind.py -b igb_uio bb:ss.f
        echo 2 > /sys/bus/pci/devices/0000\:bb\:ss.f/max_vfs (To enable two VFs on a specific PCI device)

    Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library, as sketched below.

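
    For instance, a minimal interactive testpmd invocation on the host might look like the following sketch
    (the core list and memory channel count are illustrative):

    .. code-block:: console

        ./x86_64-native-linux-gcc/app/testpmd -l 0-3 -n 4 -- -i
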

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

For the Intel® X710/XL710 Gigabit Ethernet Controller, queues are in pairs. One queue pair means one receive queue and
one transmit queue. The default number of queue pairs per VF is 4, and can be 16 at maximum.

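
For example, a sketch of raising the per-VF queue pair count to its maximum in the build-time ``config`` file
(the default value is 4; the exact file location depends on your build setup):

.. code-block:: console

    CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=16
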

Intel® 82599 10 Gigabit Ethernet Controller VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The programmer can enable a maximum of *63 Virtual Functions* and there must be *one Physical Function* per Intel® 82599
10 Gigabit Ethernet Controller NIC port.
The reason for this is that the device allows for a maximum of 128 queues per port and a virtual/physical function has to
have at least one queue pair (RX/TX).
The current implementation of the DPDK ixgbevf driver supports a single queue pair (RX/TX) per Virtual Function.
The Physical Function in the host can be configured either by the Linux* ixgbe driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]) or by the DPDK PMD PF driver.
When using both DPDK PMD PF/VF drivers, the whole NIC will be taken over by the DPDK based application.

For example,

*   Using Linux* ixgbe driver:

    .. code-block:: console

        rmmod ixgbe (To remove the ixgbe module)
        insmod ixgbe max_vfs=2,2 (To enable two Virtual Functions per port)

*   Using the DPDK PMD PF ixgbe driver:

    Kernel Params: iommu=pt, intel_iommu=on

    .. code-block:: console

        modprobe uio
        insmod igb_uio
        ./dpdk-devbind.py -b igb_uio bb:ss.f
        echo 2 > /sys/bus/pci/devices/0000\:bb\:ss.f/max_vfs (To enable two VFs on a specific PCI device)

    Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.

*   Using the DPDK PMD PF ixgbe driver to enable VF RSS:

    Same steps as above to install the modules uio and igb_uio, specify max_vfs for the PCI device, and
    launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.


    The available queue number (at most 4) per VF depends on the total number of pools, which is
    determined by the maximum number of VFs at the PF initialization stage and the number of queues specified
    in the configuration (a launch sketch follows this list):

    *   If the maximum number of VFs (max_vfs) is set in the range of 1 to 32:

        If the number of Rx queues is specified as 4 (``--rxq=4`` in testpmd), then there are a total of 32
        pools (ETH_32_POOLS), and each VF can have 4 Rx queues;

        If the number of Rx queues is specified as 2 (``--rxq=2`` in testpmd), then there are a total of 32
        pools (ETH_32_POOLS), and each VF can have 2 Rx queues;

    *   If the maximum number of VFs (max_vfs) is in the range of 33 to 64:

        If the number of Rx queues is specified as 4 (``--rxq=4`` in testpmd), then an error message is expected,
        as ``rxq`` is not valid in this case;

        If the number of Rx queues is 2 (``--rxq=2`` in testpmd), then there are a total of 64 pools (ETH_64_POOLS),
        and each VF can have 2 Rx queues;

    On the host, to enable VF RSS functionality, the Rx mq mode should be set to ETH_MQ_RX_VMDQ_RSS
    or ETH_MQ_RX_RSS mode, and SR-IOV mode should be activated (max_vfs >= 1).
    The VF RSS information, such as the hash function, RSS key and RSS key length, also needs to be configured.

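
    For example, a sketch of a host testpmd launch with four Rx/Tx queues per port, assuming max_vfs was set
    between 1 and 32 as described above (the core list and memory channel count are illustrative):

    .. code-block:: console

        ./x86_64-native-linux-gcc/app/testpmd -l 0-3 -n 4 -- -i --rxq=4 --txq=4
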

.. note::

    The limitation for VF RSS on the Intel® 82599 10 Gigabit Ethernet Controller is:
    the hash and key are shared between the PF and all VFs, and the 128-entry RETA table is also shared
    between the PF and all VFs. It is therefore not possible to query the hash and RETA content per
    VF on the guest; if needed, query them on the host for the shared RETA information.

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Intel® 82576 Gigabit Ethernet Controller and Intel® Ethernet Controller I350 Family VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, an Intel® 82576 Gigabit Ethernet Controller serves up to eight virtual machines (VMs).
The controller has 16 TX and 16 RX queues.
They are generally referred to (or thought of) as queue pairs (one TX and one RX queue).
This gives the controller 16 queue pairs.

A pool is a group of queue pairs for assignment to the same VF, used for transmit and receive operations.
The controller has eight pools, with each pool containing two queue pairs, that is, two TX and two RX queues assigned to each VF.

In a virtualized environment, an Intel® Ethernet Controller I350 family device serves up to eight virtual machines (VMs) per port.
The eight queues can be accessed by eight different VMs if configured correctly (the I350 has 4x1GbE ports, each with 8 TX and 8 RX queues);
that means one transmit and one receive queue assigned to each VF.

For example,

*   Using Linux* igb driver:

    .. code-block:: console

        rmmod igb (To remove the igb module)
        insmod igb max_vfs=2,2 (To enable two Virtual Functions per port)

*   Using DPDK PMD PF igb driver:

    Kernel Params: iommu=pt, intel_iommu=on

    .. code-block:: console

        modprobe uio
        insmod igb_uio
        ./dpdk-devbind.py -b igb_uio bb:ss.f
        echo 2 > /sys/bus/pci/devices/0000\:bb\:ss.f/max_vfs (To enable two VFs on a specific pci device)

    Launch DPDK testpmd/example or your own host daemon application using the DPDK PMD library.

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a four-port NIC.
When you enable the eight Virtual Functions with the above command, the eight enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence, starting from 0 to 7.
However:

*   Virtual Functions 0 and 4 belong to Physical Function 0

*   Virtual Functions 1 and 5 belong to Physical Function 1

*   Virtual Functions 2 and 6 belong to Physical Function 2

*   Virtual Functions 3 and 7 belong to Physical Function 3

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Validated Hypervisors
~~~~~~~~~~~~~~~~~~~~~

The validated hypervisor is:

*   KVM (Kernel Virtual Machine) with Qemu, version 0.14.0

However, since the hypervisor is bypassed when configuring the Virtual Function devices through the Mailbox interface,
the solution is hypervisor-agnostic.
Xen* and VMware* (when SR-IOV is supported) will also be able to support the DPDK with Virtual Function driver support.

Expected Guest Operating System in Virtual Machine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The expected guest operating systems in a virtualized environment are:

*   Fedora* 14 (64-bit)

*   Ubuntu* 10.04 (64-bit)

For supported kernel versions, refer to the *DPDK Release Notes*.

Setting Up a KVM Virtual Machine Monitor
----------------------------------------

The following describes a target environment:

*   Host Operating System: Fedora 14

*   Hypervisor: KVM (Kernel Virtual Machine) with Qemu version 0.14.0

*   Guest Operating System: Fedora 14

*   Linux Kernel Version: Refer to the *DPDK Getting Started Guide*

*   Target Applications: l2fwd, l3fwd-vf

The setup procedure is as follows:

#.  Before booting the Host OS, open **BIOS setup** and enable **Intel® VT features**.

#.  While booting the Host OS kernel, pass the intel_iommu=on kernel command line argument using GRUB.
    When using the DPDK PF driver on the host, also pass the iommu=pt kernel command line argument in GRUB.
    A sketch of one common way to do this follows.

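
    On distributions using GRUB 2, for example, these arguments can typically be appended to
    ``GRUB_CMDLINE_LINUX`` in ``/etc/default/grub`` and the configuration regenerated; the exact paths and
    tools vary by distribution, so treat this as an illustrative sketch:

    .. code-block:: console

        GRUB_CMDLINE_LINUX="... intel_iommu=on iommu=pt"
        grub2-mkconfig -o /boot/grub2/grub.cfg
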

#.  Download qemu-kvm-0.14.0 from
    `http://sourceforge.net/projects/kvm/files/qemu-kvm/ <http://sourceforge.net/projects/kvm/files/qemu-kvm/>`_
    and install it in the Host OS using the following steps:

    When using a recent kernel (2.6.25+) with kvm modules included:

    .. code-block:: console

        tar xzf qemu-kvm-release.tar.gz
        cd qemu-kvm-release
        ./configure --prefix=/usr/local/kvm
        make
        sudo make install
        sudo /sbin/modprobe kvm-intel

    When using an older kernel, or a kernel from a distribution without the kvm modules,
    you must download (from the same link), compile and install the modules yourself:

    .. code-block:: console

        tar xjf kvm-kmod-release.tar.bz2
        cd kvm-kmod-release
        ./configure
        make
        sudo make install
        sudo /sbin/modprobe kvm-intel

    qemu-kvm installs in the /usr/local/bin directory.

    For more details about KVM configuration and usage, please refer to:

    `http://www.linux-kvm.org/page/HOWTO1 <http://www.linux-kvm.org/page/HOWTO1>`_.

#.  Create a Virtual Machine and install Fedora 14 on the Virtual Machine.
    This is referred to as the Guest Operating System (Guest OS).

#.  Download and install the latest ixgbe driver from:

    `http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=14687 <http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=14687>`_


#.  In the Host OS:

    When using the Linux kernel ixgbe driver, unload the Linux ixgbe driver and reload it with the max_vfs=2,2 argument:

    .. code-block:: console

        rmmod ixgbe
        modprobe ixgbe max_vfs=2,2

    When using the DPDK PMD PF driver, insert the DPDK kernel module igb_uio and set the number of VFs via the sysfs max_vfs entry:

    .. code-block:: console

        modprobe uio
        insmod igb_uio
        ./dpdk-devbind.py -b igb_uio 02:00.0 02:00.1 0e:00.0 0e:00.1
        echo 2 > /sys/bus/pci/devices/0000\:02\:00.0/max_vfs
        echo 2 > /sys/bus/pci/devices/0000\:02\:00.1/max_vfs
        echo 2 > /sys/bus/pci/devices/0000\:0e\:00.0/max_vfs
        echo 2 > /sys/bus/pci/devices/0000\:0e\:00.1/max_vfs

    .. note::

        You need to explicitly specify the number of VFs for each port; the commands above create two VFs
        for each of the four ixgbe ports.

        Let's say we have a machine with four physical ixgbe ports:

        0000:02:00.0

        0000:02:00.1

        0000:0e:00.0

        0000:0e:00.1

        The command above creates two VFs for device 0000:02:00.0:

        .. code-block:: console

            ls -alrt /sys/bus/pci/devices/0000\:02\:00.0/virt*
            lrwxrwxrwx. 1 root root 0 Apr 13 05:40 /sys/bus/pci/devices/0000:02:00.0/virtfn1 -> ../0000:02:10.2
            lrwxrwxrwx. 1 root root 0 Apr 13 05:40 /sys/bus/pci/devices/0000:02:00.0/virtfn0 -> ../0000:02:10.0

        It also creates two VFs for device 0000:02:00.1:

        .. code-block:: console

            ls -alrt /sys/bus/pci/devices/0000\:02\:00.1/virt*
            lrwxrwxrwx. 1 root root 0 Apr 13 05:51 /sys/bus/pci/devices/0000:02:00.1/virtfn1 -> ../0000:02:10.3
            lrwxrwxrwx. 1 root root 0 Apr 13 05:51 /sys/bus/pci/devices/0000:02:00.1/virtfn0 -> ../0000:02:10.1

#.  List the PCI devices connected and notice that the Host OS shows two Physical Functions (traditional ports)
    and four Virtual Functions (two for each port).
    This is the result of the previous step; an example check is shown below.

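
    For example, the Virtual Functions can typically be listed with the command below; on the 82599 they usually
    appear as "82599 Ethernet Controller Virtual Function" entries (the exact names depend on the adapter):

    .. code-block:: console

        lspci | grep -i "Virtual Function"
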

#.  Insert the pci_stub module to hold the PCI devices that are freed from the default driver using the following command
    (see http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM Section 4 for more information):

    .. code-block:: console

        sudo /sbin/modprobe pci-stub

    Unbind the default driver from the PCI devices representing the Virtual Functions.
    A script to perform this action is as follows:

    .. code-block:: console

        echo "8086 10ed" > /sys/bus/pci/drivers/pci-stub/new_id
        echo 0000:08:10.0 > /sys/bus/pci/devices/0000:08:10.0/driver/unbind
        echo 0000:08:10.0 > /sys/bus/pci/drivers/pci-stub/bind

    where 0000:08:10.0 belongs to the Virtual Function visible in the Host OS.

#.  Now, start the Virtual Machine by running the following command:

    .. code-block:: console

        /usr/local/kvm/bin/qemu-system-x86_64 -m 4096 -smp 4 -boot c -hda lucid.qcow2 -device pci-assign,host=08:10.0

    where:

    — -m = memory to assign

    — -smp = number of smp cores

    — -boot = boot option

    — -hda = virtual disk image

    — -device = device to attach

    .. note::

        — The pci-assign,host=08:10.0 value indicates that you want to attach a PCI device
        to a Virtual Machine and the respective (Bus:Device.Function)
        numbers should be passed for the Virtual Function to be attached.

        — qemu-kvm-0.14.0 allows a maximum of four PCI devices assigned to a VM,
        but this is qemu-kvm version dependent since qemu-kvm-0.14.1 allows a maximum of five PCI devices.

        — qemu-system-x86_64 also has a -cpu command line option that is used to select the cpu_model
        to emulate in a Virtual Machine. Therefore, it can be used as:

        .. code-block:: console

            /usr/local/kvm/bin/qemu-system-x86_64 -cpu ?

            (to list all available cpu_models)

            /usr/local/kvm/bin/qemu-system-x86_64 -m 4096 -cpu host -smp 4 -boot c -hda lucid.qcow2 -device pci-assign,host=08:10.0

            (to use the same cpu_model equivalent to the host cpu)

        For more information, please refer to: `http://wiki.qemu.org/Features/CPUModels <http://wiki.qemu.org/Features/CPUModels>`_.


#.  If you use vfio-pci to pass the device through instead of pci-assign, steps 8 and 9 need to be updated to bind
    the device to vfio-pci, and pci-assign must be replaced with vfio-pci when starting the Virtual Machine.

    .. code-block:: console

        sudo /sbin/modprobe vfio-pci

        echo "8086 10ed" > /sys/bus/pci/drivers/vfio-pci/new_id
        echo 0000:08:10.0 > /sys/bus/pci/devices/0000:08:10.0/driver/unbind
        echo 0000:08:10.0 > /sys/bus/pci/drivers/vfio-pci/bind

        /usr/local/kvm/bin/qemu-system-x86_64 -m 4096 -smp 4 -boot c -hda lucid.qcow2 -device vfio-pci,host=08:10.0

#.  Install and run the DPDK host app to take over the Physical Function, e.g.:

    .. code-block:: console

        make install T=x86_64-native-linux-gcc
        ./x86_64-native-linux-gcc/app/testpmd -l 0-3 -n 4 -- -i

#.  Finally, access the Guest OS using vncviewer on the localhost:5900 port and check the lspci command output in the Guest OS.
    The virtual functions will be listed as available for use.

#.  Configure and install the DPDK with an x86_64-native-linux-gcc configuration on the Guest OS as normal,
    that is, there is no change to the normal installation procedure.

    .. code-block:: console

        make config T=x86_64-native-linux-gcc O=x86_64-native-linux-gcc
        cd x86_64-native-linux-gcc
        make

.. note::

    If you are unable to compile the DPDK and you are getting "error: CPU you selected does not support x86-64 instruction set",
    power off the Guest OS and start the virtual machine with the correct -cpu option in the qemu-system-x86_64 command as shown in step 9.
    You must select the best x86_64 cpu_model to emulate, or you can select the host option if available.

.. note::

    Run the DPDK l2fwd sample application in the Guest OS with Hugepages enabled.
    For the expected benchmark performance, you must pin the cores from the Guest OS to the Host OS (taskset can be used to do this) and
    you must also look at the PCI Bus layout on the board to ensure you are not running the traffic over the QPI Interface.

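
As an illustrative sketch only (the core list, port mask and thread ID below are placeholders), the guest
application could be launched and its vCPU threads pinned on the host roughly as follows:

.. code-block:: console

    ./build/l2fwd -l 0-1 -n 4 -- -p 0x3 (in the Guest OS)
    taskset -pc 4 <qemu_vcpu_thread_pid> (on the Host OS, once per vCPU thread)
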
.. note::

    *   The Virtual Machine Manager (the Fedora package name is virt-manager) is a utility for virtual machine management
        that can also be used to create, start, stop and delete virtual machines.
        If this option is used, steps 2 and 6 in the instructions provided will be different.

    *   virsh, a command line utility for virtual machine management,
        can also be used to bind and unbind devices to a virtual machine in Ubuntu.
        If this option is used, step 6 in the instructions provided will be different.

    *   The Virtual Machine Monitor (see :numref:`figure_perf_benchmark`) is equivalent to a Host OS with KVM installed as described in the instructions.

.. _figure_perf_benchmark:

.. figure:: img/perf_benchmark.*

   Performance Benchmark Setup


DPDK SR-IOV PMD PF/VF Driver Usage Model
----------------------------------------

Fast Host-based Packet Processing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Software Defined Network (SDN) trends are demanding fast host-based packet handling.
In a virtualized environment,
the DPDK VF PMD driver achieves the same throughput as in a non-VT native environment.

With such fast packet processing in the host instance, many services such as filtering, QoS and
DPI can be offloaded onto the host fast path.

:numref:`figure_fast_pkt_proc` shows a scenario where some VMs communicate externally directly via VFs,
while others connect to a virtual switch and share the same uplink bandwidth.

.. _figure_fast_pkt_proc:

.. figure:: img/fast_pkt_proc.*

   Fast Host-based Packet Processing


SR-IOV (PF/VF) Approach for Inter-VM Communication
--------------------------------------------------

Inter-VM data communication is one of the traffic bottlenecks in virtualization platforms.
SR-IOV device assignment helps a VM to attach a real device, taking advantage of the bridge in the NIC.
So VF-to-VF traffic within the same physical port (VM0<->VM1) has hardware acceleration.
However, when a VF crosses physical ports (VM0<->VM2), there is no such hardware bridge.
In this case, the DPDK PMD PF driver provides host forwarding between such VMs.

:numref:`figure_inter_vm_comms` shows an example.
In this case an update of the MAC address lookup tables in both the NIC and the host DPDK application is required.

In the NIC, the destination MAC address of a packet destined for a VM behind another physical port is written to the PF specific pool.
So when a packet comes in, its destination MAC address will match, and the packet will be forwarded to the host DPDK PMD application.

In the host DPDK application, the behavior is similar to L2 forwarding,
that is, the packet is forwarded to the correct PF pool.
The SR-IOV NIC switch forwards the packet to a specific VM according to the destination MAC address,
which belongs to the destination VF on the VM.

.. _figure_inter_vm_comms:

.. figure:: img/inter_vm_comms.*

   Inter-VM Communication