.. SPDX-License-Identifier: BSD-3-Clause
   Copyright(c) 2016 Intel Corporation.

I40E Poll Mode Driver
=====================

The i40e PMD (librte_pmd_i40e) provides poll mode driver support for
10/25/40 Gbps Intel® Ethernet 700 Series Network Adapters based on
the Intel Ethernet Controller X710/XL710/XXV710 and Intel Ethernet
Connection X722 (which supports only a subset of the features).


Features
--------

Features of the i40e PMD are:

- Multiple queues for TX and RX
- Receiver Side Scaling (RSS)
- MAC/VLAN filtering
- Packet type information
- Flow director
- Cloud filter
- Checksum offload
- VLAN/QinQ stripping and inserting
- TSO offload
- Promiscuous mode
- Multicast mode
- Port hardware statistics
- Jumbo frames
- Link state information
- Link flow control
- Mirror on port, VLAN and VSI
- Interrupt mode for RX
- Scatter and gather for TX and RX
- Vector Poll mode driver
- DCB
- VMDQ
- SR-IOV VF
- Hot plug
- IEEE1588/802.1AS timestamping
- VF Daemon (VFD) - EXPERIMENTAL
- Dynamic Device Personalization (DDP)
- Queue region configuration
- Virtual Function Port Representors

Prerequisites
-------------

- Identify your adapter using `Intel Support
  <http://www.intel.com/support>`_ and get the latest NVM/FW images.

- Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to set up the basic DPDK environment.

- To get better performance on Intel platforms, please follow the "How to get best performance with NICs on Intel platforms"
  section of the :ref:`Getting Started Guide for Linux <linux_gsg>`.

- Upgrade the NVM/FW version following the `Intel® Ethernet NVM Update Tool Quick Usage Guide for Linux
  <https://www-ssl.intel.com/content/www/us/en/embedded/products/networking/nvm-update-tool-quick-linux-usage-guide.html>`_ and `Intel® Ethernet NVM Update Tool: Quick Usage Guide for EFI <https://www.intel.com/content/www/us/en/embedded/products/networking/nvm-update-tool-quick-efi-usage-guide.html>`_ if needed.

Recommended Matching List
-------------------------

It is highly recommended to upgrade the i40e kernel driver and firmware to
avoid compatibility issues with the i40e PMD. The following matching list has
been tested and verified. For detailed information, refer to the Tested
Platforms/Tested NICs chapter in the release notes.

+--------------+-----------------------+------------------+
| DPDK version | Kernel driver version | Firmware version |
+==============+=======================+==================+
| 19.05        | 2.7.29                | 6.80             |
+--------------+-----------------------+------------------+
| 19.02        | 2.7.26                | 6.80             |
+--------------+-----------------------+------------------+
| 18.11        | 2.4.6                 | 6.01             |
+--------------+-----------------------+------------------+
| 18.08        | 2.4.6                 | 6.01             |
+--------------+-----------------------+------------------+
| 18.05        | 2.4.6                 | 6.01             |
+--------------+-----------------------+------------------+
| 18.02        | 2.4.3                 | 6.01             |
+--------------+-----------------------+------------------+
| 17.11        | 2.1.26                | 6.01             |
+--------------+-----------------------+------------------+
| 17.08        | 2.0.19                | 6.01             |
+--------------+-----------------------+------------------+
| 17.05        | 1.5.23                | 5.05             |
+--------------+-----------------------+------------------+
| 17.02        | 1.5.23                | 5.05             |
+--------------+-----------------------+------------------+
| 16.11        | 1.5.23                | 5.05             |
+--------------+-----------------------+------------------+
| 16.07        | 1.4.25                | 5.04             |
+--------------+-----------------------+------------------+
| 16.04        | 1.4.25                | 5.02             |
+--------------+-----------------------+------------------+

Pre-Installation Configuration
------------------------------

Config File Options
~~~~~~~~~~~~~~~~~~~

The following options can be modified in the ``config`` file.
Please note that enabling debugging options may affect system performance.

- ``CONFIG_RTE_LIBRTE_I40E_PMD`` (default ``y``)

  Toggle compilation of the ``librte_pmd_i40e`` driver.

- ``CONFIG_RTE_LIBRTE_I40E_DEBUG_*`` (default ``n``)

  Toggle display of generic debugging messages.

- ``CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC`` (default ``y``)

  Toggle bulk allocation for RX.

- ``CONFIG_RTE_LIBRTE_I40E_INC_VECTOR`` (default ``n``)

  Toggle the use of the Vector PMD instead of the normal RX/TX path.
  To enable vPMD for RX, bulk allocation for RX must be allowed.

- ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC`` (default ``n``)

  Toggle the use of a 16-byte RX descriptor; by default the RX descriptor is 32 bytes.

- ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF`` (default ``64``)

  Number of queues reserved for the PF.

- ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM`` (default ``4``)

  Number of queues reserved for each VMDQ pool.

Runtime Config Options
~~~~~~~~~~~~~~~~~~~~~~

- ``Reserved number of Queues per VF`` (default ``4``)

  The number of queues reserved per VF is determined by its host PF. If the
  PCI address of an i40e PF is aaaa:bb.cc, the number of reserved queues per
  VF can be configured with an EAL parameter like ``-w aaaa:bb.cc,queue-num-per-vf=n``.
  The value n can be 1, 2, 4, 8 or 16. If no such parameter is configured, the
  number of reserved queues per VF is 4 by default. If a VF requests more than
  the reserved number of queues, the PF can allocate up to 16 queues to it
  after a VF reset.

- ``Support multiple driver`` (default ``disable``)

  There was a multiple driver support issue when using 700 Series Ethernet
  Adapters with both the Linux kernel driver and the DPDK PMD. To fix this
  issue, the ``devargs`` parameter ``support-multi-driver`` is introduced,
  for example::

      -w 84:00.0,support-multi-driver=1

  With the above configuration, the DPDK PMD will not change global registers,
  and will switch the PF interrupt from IntN to Int0 to avoid interrupt
  conflicts between DPDK and the Linux kernel.

- ``Support VF Port Representor`` (default ``not enabled``)

  The i40e PF PMD supports the creation of VF port representors for the control
  and monitoring of i40e virtual function devices. Each port representor
  corresponds to a single virtual function of that device. Using the ``devargs``
  option ``representor`` the user can specify which virtual functions to create
  port representors for on initialization of the PF PMD by passing the VF IDs of
  the VFs which are required::

      -w DBDF,representor=[0,1,4]

  Currently hot-plugging of representor ports is not supported, so all required
  representors must be specified on the creation of the PF.

- ``Use latest supported vector`` (default ``disable``)

  The latest supported vector path may not always give the best performance,
  so it is recommended only on newer platforms. However, users may still want
  the latest vector path, since it can give better performance in some real
  workloads. For this, the ``devargs`` parameter ``use-latest-supported-vec``
  is introduced, for example::

      -w 84:00.0,use-latest-supported-vec=1

Vector RX Pre-conditions
~~~~~~~~~~~~~~~~~~~~~~~~

For Vector RX it is assumed that the number of descriptors per ring is a power
of 2. With this pre-condition, the ring pointer can easily wrap back to the
head after hitting the tail without a conditional check. In addition, Vector RX
can use this assumption to compute the wrapped index with a bit mask using
``ring_size - 1``.
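
As an illustrative sketch only (shell arithmetic standing in for the PMD's C
code, with ``ring_size=16`` as an assumed example value), the power-of-2
assumption lets the next index wrap without a branch:

.. code-block:: shell

   # Branch-free wrap-around enabled by a power-of-2 ring size.
   ring_size=16                                 # must be a power of 2
   tail=15                                      # last descriptor index
   next=$(( (tail + 1) & (ring_size - 1) ))     # mask instead of "if next == ring_size"
   echo "next index: $next"                     # prints: next index: 0

If ``ring_size`` were not a power of 2, the mask ``ring_size - 1`` would not
cover all index values and the wrap would be wrong; this is why the
pre-condition exists.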

Driver compilation and testing
------------------------------

Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
for details.


SR-IOV: Prerequisites and Sample Application Notes
--------------------------------------------------

#. Load the kernel module:

   .. code-block:: console

      modprobe i40e

   Check the output in dmesg:

   .. code-block:: console

      i40e 0000:83:00.1 ens802f0: renamed from eth0

#. Bring up the PF ports:

   .. code-block:: console

      ifconfig ens802f0 up

#. Create VF device(s):

   Echo the number of VFs to be created into the ``sriov_numvfs`` sysfs entry
   of the parent PF.

   Example:

   .. code-block:: console

      echo 2 > /sys/devices/pci0000:00/0000:00:03.0/0000:81:00.0/sriov_numvfs

#. Assign a VF MAC address:

   Assign a MAC address to the VF using the iproute2 utility. The syntax is:

   .. code-block:: console

      ip link set <PF netdev id> vf <VF id> mac <macaddr>

   Example:

   .. code-block:: console

      ip link set ens802f0 vf 0 mac a0:b0:c0:d0:e0:f0

#. Assign the VF to a VM, and bring up the VM.
   Please see the documentation for the *I40E/IXGBE/IGB Virtual Function Driver*.

#. Running testpmd:

   Follow the instructions available in the document
   :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
   to run testpmd.

   Example output:

   .. code-block:: console

      ...
      EAL: PCI device 0000:83:00.0 on NUMA socket 1
      EAL: probe driver: 8086:1572 rte_i40e_pmd
      EAL: PCI memory mapped at 0x7f7f80000000
      EAL: PCI memory mapped at 0x7f7f80800000
      PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000208a
      Interactive-mode selected
      Configuring Port 0 (socket 0)
      ...

      PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are
      satisfied. Rx Burst Bulk Alloc function will be used on port=0, queue=0.

      ...
      Port 0: 68:05:CA:26:85:84
      Checking link statuses...
      Port 0 Link Up - speed 10000 Mbps - full-duplex
      Done

      testpmd>


Sample Application Notes
------------------------

Vlan filter
~~~~~~~~~~~

The Vlan filter only works when Promiscuous mode is off.

To start ``testpmd``, and add vlan 10 to port 0:

.. code-block:: console

   ./app/testpmd -l 0-15 -n 4 -- -i --forward-mode=mac
   ...

   testpmd> set promisc 0 off
   testpmd> rx_vlan add 10 0


Flow Director
~~~~~~~~~~~~~

The Flow Director works in receive mode to identify specific flows or sets of flows and route them to specific queues.
The Flow Director filters can match different fields for different types of packet: flow type, specific input set per flow type and flexible payload.

The default input set of each flow type is::

   ipv4-other : src_ip_address, dst_ip_address
   ipv4-frag  : src_ip_address, dst_ip_address
   ipv4-tcp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv4-udp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv4-sctp  : src_ip_address, dst_ip_address, src_port, dst_port,
                verification_tag
   ipv6-other : src_ip_address, dst_ip_address
   ipv6-frag  : src_ip_address, dst_ip_address
   ipv6-tcp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv6-udp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv6-sctp  : src_ip_address, dst_ip_address, src_port, dst_port,
                verification_tag
   l2_payload : ether_type

The flex payload is selected from offset 0 to 15 of the packet's payload by default, while it is masked out from matching.

Start ``testpmd`` with ``--disable-rss`` and ``--pkt-filter-mode=perfect``:

.. code-block:: console

   ./app/testpmd -l 0-15 -n 4 -- -i --disable-rss --pkt-filter-mode=perfect \
                 --rxq=8 --txq=8 --nb-cores=8 --nb-ports=1

Add a rule to direct an ``ipv4-udp`` packet whose ``dst_ip=2.2.2.5, src_ip=2.2.2.3, src_port=32, dst_port=32`` to queue 1:

.. code-block:: console

   testpmd> flow_director_filter 0 mode IP add flow ipv4-udp \
            src 2.2.2.3 32 dst 2.2.2.5 32 vlan 0 flexbytes () \
            fwd pf queue 1 fd_id 1

Check the flow director status:

.. code-block:: console

   testpmd> show port fdir 0

   ######################## FDIR infos for port 0 ####################
     MODE: PERFECT
     SUPPORTED FLOW TYPE: ipv4-frag ipv4-tcp ipv4-udp ipv4-sctp ipv4-other
                          ipv6-frag ipv6-tcp ipv6-udp ipv6-sctp ipv6-other
                          l2_payload
     FLEX PAYLOAD INFO:
       max_len: 16          payload_limit: 480
       payload_unit: 2      payload_seg: 3
       bitmask_unit: 2      bitmask_num: 2
     MASK:
       vlan_tci: 0x0000,
       src_ipv4: 0x00000000,
       dst_ipv4: 0x00000000,
       src_port: 0x0000,
       dst_port: 0x0000
       src_ipv6: 0x00000000,0x00000000,0x00000000,0x00000000,
       dst_ipv6: 0x00000000,0x00000000,0x00000000,0x00000000
     FLEX PAYLOAD SRC OFFSET:
       L2_PAYLOAD: 0 1 2 3 4 5 6 ...
       L3_PAYLOAD: 0 1 2 3 4 5 6 ...
       L4_PAYLOAD: 0 1 2 3 4 5 6 ...
     FLEX MASK CFG:
       ipv4-udp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-tcp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-sctp:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-other: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-frag:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-udp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-tcp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-sctp:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-other: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-frag:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       l2_payload: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     guarant_count: 1           best_count:    0
     guarant_space: 512         best_space:    7168
     collision: 0               free:          0
     maxhash:   0               maxlen:        0
     add:       0               remove:        0
     f_add:     0               f_remove:      0

Delete all flow director rules on a port:

.. code-block:: console

   testpmd> flush_flow_director 0

Floating VEB
~~~~~~~~~~~~

The Intel® Ethernet 700 Series supports a feature called
"Floating VEB".

A Virtual Ethernet Bridge (VEB) is an IEEE Edge Virtual Bridging (EVB) term
for functionality that allows local switching between virtual endpoints within
a physical endpoint and also with an external bridge/network.

A "Floating" VEB doesn't have an uplink connection to the outside world, so all
switching is done internally and remains within the host. As such, this
feature provides security benefits.

In addition, a Floating VEB overcomes a limitation of normal VEBs where they
cannot forward packets when the physical link is down. Floating VEBs don't need
to connect to the NIC port, so they can still forward traffic from VF to VF
even when the physical link is down.

Therefore, with this feature enabled VFs can be limited to communicating with
each other but not an outside network, and they can do so even when there is
no physical uplink on the associated NIC port.

To enable this feature, the user should pass a ``devargs`` parameter to the
EAL, for example::

    -w 84:00.0,enable_floating_veb=1

In this configuration the PMD will use the floating VEB feature for all the
VFs created by this PF device.

Alternatively, the user can specify which VFs need to connect to this floating
VEB using the ``floating_veb_list`` argument::

    -w 84:00.0,enable_floating_veb=1,floating_veb_list=1;3-4

In this example ``VF1``, ``VF3`` and ``VF4`` connect to the floating VEB,
while other VFs connect to the normal VEB.

The current implementation only supports one floating VEB and one regular
VEB. VFs can connect to a floating VEB or a regular VEB according to the
configuration passed on the EAL command line.

The floating VEB functionality requires a NIC firmware version of 5.0
or greater.
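
The ``floating_veb_list`` value uses ``;`` to separate entries and ``-`` for
ranges. As a hedged illustration only (this is not DPDK's actual parser), a
small shell sketch shows how a spec such as ``1;3-4`` expands to individual
VF IDs:

.. code-block:: shell

   # Hypothetical expansion of a floating_veb_list spec into VF IDs.
   spec="1;3-4"
   vfs=""
   for entry in $(printf '%s' "$spec" | tr ';' ' '); do
     case "$entry" in
       *-*)  # a range such as 3-4
         for i in $(seq "${entry%-*}" "${entry#*-}"); do vfs="$vfs $i"; done ;;
       *)    # a single VF ID
         vfs="$vfs $entry" ;;
     esac
   done
   echo "floating VFs:$vfs"    # prints: floating VFs: 1 3 4

Note that on a real command line the ``;`` usually needs quoting or escaping
so the shell does not treat it as a command separator.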

Dynamic Device Personalization (DDP)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Intel® Ethernet 700 Series, except for the Intel Ethernet Connection
X722, supports a feature called "Dynamic Device Personalization (DDP)",
which is used to configure hardware by downloading a profile to support
protocols/filters which are not supported by default. The DDP
functionality requires a NIC firmware version of 6.0 or greater.

The current implementation supports GTP-C/GTP-U/PPPoE/PPPoL2TP;
steering can be used with the rte_flow API.

The GTPv1 package has been released, and it can be downloaded from
https://downloadcenter.intel.com/download/27587.

The PPPoE package has been released, and it can be downloaded from
https://downloadcenter.intel.com/download/28040.

Load a profile which supports GTP and store a backup profile:

.. code-block:: console

   testpmd> ddp add 0 ./gtp.pkgo,./backup.pkgo

Delete a GTP profile and restore the backup profile:

.. code-block:: console

   testpmd> ddp del 0 ./backup.pkgo

Get the loaded DDP package info list:

.. code-block:: console

   testpmd> ddp get list 0

Display information about a GTP profile:

.. code-block:: console

   testpmd> ddp get info ./gtp.pkgo

Input set configuration
~~~~~~~~~~~~~~~~~~~~~~~

The input set for any PCTYPE can be configured with a user-defined configuration.
For example, to use only the 48-bit prefix of the IPv6 source address for IPv6 TCP RSS:

.. code-block:: console

   testpmd> port config 0 pctype 43 hash_inset clear all
   testpmd> port config 0 pctype 43 hash_inset set field 13
   testpmd> port config 0 pctype 43 hash_inset set field 14
   testpmd> port config 0 pctype 43 hash_inset set field 15

Queue region configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

The Intel® Ethernet 700 Series supports queue region configuration for RSS in
the PF, so that different traffic classes or different packet classification
types can be separated into different queues in different queue regions.
There is an API for configuring queue regions in RSS from the command line.
It can parse the parameters of the region index, queue number, queue start
index, user priority, traffic classes and so on. Depending on commands from
the command line, it will call i40e private APIs and start the process of
setting or flushing the queue region configuration. As this feature is
specific to i40e, only private APIs are used. These new ``testpmd`` commands
are shown below. For details please refer to :doc:`../testpmd_app_ug/index`.

.. code-block:: console

   testpmd> set port (port_id) queue-region region_id (value) \
            queue_start_index (value) queue_num (value)
   testpmd> set port (port_id) queue-region region_id (value) flowtype (value)
   testpmd> set port (port_id) queue-region UP (value) region_id (value)
   testpmd> set port (port_id) queue-region flush (on|off)
   testpmd> show port (port_id) queue-region

Limitations or Known issues
---------------------------

MPLS packet classification
~~~~~~~~~~~~~~~~~~~~~~~~~~

For firmware versions prior to 5.0, MPLS packets are not recognized by the NIC.
The L2 Payload flow type in flow director can be used to classify MPLS packets
by using a command in testpmd like::

   testpmd> flow_director_filter 0 mode IP add flow l2_payload ether \
            0x8847 flexbytes () fwd pf queue <N> fd_id <M>

With NIC firmware version 5.0 or greater, some limited MPLS support
is added: native MPLS (MPLS in Ethernet) skip is implemented, while no
new packet type, classification or offload is possible. With this change,
the L2 Payload flow type in flow director cannot be used to classify MPLS
packets as with previous firmware versions. Meanwhile, the Ethertype filter
can be used to classify MPLS packets by using a command in testpmd like::

   testpmd> ethertype_filter 0 add mac_ignr 00:00:00:00:00:00 ethertype \
            0x8847 fwd queue <M>

16 Byte RX Descriptor setting on DPDK VF
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently the VF's RX descriptor mode is decided by the PF. There is no PF-VF
interface for the VF to request an RX descriptor mode, and no interface for
the PF to notify the VF of its own RX descriptor mode.
All available versions of the Linux i40e kernel driver do not support the
16-byte RX descriptor. If the Linux i40e kernel driver is used as the host
driver while the DPDK i40e PMD is used as the VF driver, DPDK cannot choose
the 16-byte receive descriptor, because the RX descriptor is already set to
32 bytes by the i40e kernel driver. That is to say, the user should keep
``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n`` in the config file.
In the future, if the Linux i40e driver supports the 16-byte RX descriptor,
the user should make sure the DPDK VF uses the same RX descriptor mode,
16-byte or 32-byte, as the PF driver.

The same rule applies to DPDK PF + DPDK VF: the PF and the VF should use the
same RX descriptor mode, or the VF RX will not work.

Receive packets with Ethertype 0x88A8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Due to an FW limitation, the PF can receive packets with Ethertype 0x88A8
only when the floating VEB is disabled.

Incorrect Rx statistics when packet is oversize
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a packet is larger than the maximum frame size, the packet is dropped.
However, the Rx statistics, when calling ``rte_eth_stats_get``, incorrectly
show it as received.

VF & TC max bandwidth setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The per-VF max bandwidth and per-TC max bandwidth cannot be enabled in parallel.
The behavior differs when handling per-VF and per-TC max bandwidth settings.
When enabling the per-VF max bandwidth, the software checks whether the per-TC
max bandwidth is enabled. If so, it returns failure.
When enabling the per-TC max bandwidth, the software checks whether the per-VF
max bandwidth is enabled. If so, it disables the per-VF max bandwidth and
continues with the per-TC max bandwidth setting.

TC TX scheduling mode setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are 2 TX scheduling modes for TCs: round robin mode and strict priority mode.
If a TC is set to strict priority mode, it can consume unlimited bandwidth.
This means that if the application has set a max bandwidth for that TC, it has
no effect.
It is suggested to set strict priority mode for a TC that is latency
sensitive but does not consume much bandwidth.

VF performance is impacted by PCI extended tag setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To reach maximum NIC performance in the VF the PCI extended tag must be
enabled. The DPDK i40e PF driver will set this feature during initialization,
but the kernel PF driver does not. So when running traffic on a VF which is
managed by the kernel PF driver, a significant NIC performance downgrade has
been observed (for 64 byte packets, there is about a 25% line-rate downgrade
for a 25GbE device and about 35% for a 40GbE device).

For kernel versions >= 4.11, the kernel's PCI driver will enable the extended
tag if it detects that the device supports it. So by default, this is not an
issue. For kernels <= 4.11, or when the PCI extended tag is disabled, it can
be enabled using the steps below.

#. Get the current value of the PCI configuration register::

      setpci -s <XX:XX.X> a8.w

#. Set bit 8::

      value = value | 0x100

#. Set the PCI configuration register with the new value::

      setpci -s <XX:XX.X> a8.w=<value>
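
The three steps can be combined into one small script. This is a sketch only:
the ``a8.w`` offset and placeholder device address are taken from the steps
above, and the actual ``setpci`` write needs root and real hardware, so only
the bit arithmetic is shown executing:

.. code-block:: shell

   # Compute the new config word with bit 8 (extended tag enable) set.
   # "0012" stands in for whatever `setpci -s <XX:XX.X> a8.w` returned in step 1.
   value=0012
   new=$(printf '%04x' $(( 0x$value | 0x100 )))
   echo "$new"    # prints: 0112
   # Step 3 would then be (root required, placeholder address):
   #   setpci -s <XX:XX.X> a8.w=$new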

Vlan strip of VF
~~~~~~~~~~~~~~~~

The VF vlan strip function is only supported by i40e kernel driver versions >= 2.1.26.

DCB function
~~~~~~~~~~~~

DCB works only when RSS is enabled.

Global configuration warning
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The I40E PMD will set some global registers to enable certain functions or
configurations. When different ports of the same NIC are used by the Linux
kernel and DPDK, the port driven by the Linux kernel will be impacted by the
port driven by DPDK. For example, the register I40E_GL_SWT_L2TAGCTRL is used
to control the L2 tag, and the i40e PMD uses I40E_GL_SWT_L2TAGCTRL to set the
vlan TPID. If the TPID is set on port A with DPDK, the configuration will also
impact port B on the same NIC driven by the kernel driver, which may not want
that TPID.
So the PMD reports a warning to clarify what is changed by writing a global
register.

High Performance of Small Packets on 40GbE NIC
----------------------------------------------

As there might be firmware fixes for performance enhancement in the latest
firmware image, a firmware update might be needed to achieve high performance.
Check the Intel support website for the latest firmware updates.
Users should consult the release notes specific to a DPDK release to identify
the validated firmware version for a NIC using the i40e driver.

Use 16 Bytes RX Descriptor Size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The i40e PMD supports both 16 and 32 byte RX descriptor sizes, and the 16 byte
size can help the performance of small packets.
``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC`` in the config files can be changed
to use 16 byte RX descriptors.

Example of getting best performance with l3fwd example
------------------------------------------------------

The following is an example of running the DPDK ``l3fwd`` sample application to get high performance with a
server with Intel Xeon processors and Intel Ethernet CNA XL710.

The example scenario is to get best performance with two Intel Ethernet CNA XL710 40GbE ports.
See :numref:`figure_intel_perf_test_setup` for the performance test setup.

.. _figure_intel_perf_test_setup:

.. figure:: img/intel_perf_test_setup.*

   Performance Test Setup


1. Add two Intel Ethernet CNA XL710s to the platform, and use one port per card to get best performance.
   The reason for using two NICs is to overcome a PCIe v3.0 limitation since it cannot provide 80GbE bandwidth
   for two 40GbE ports, but two different PCIe v3.0 x8 slots can.
   Referring to the sample NICs output below, we can select ``82:00.0`` and ``85:00.0`` as test ports::

      82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]

2. Connect the ports to the traffic generator. For high speed testing, it's best to use a hardware traffic generator.

3. Check the PCI devices' numa node (socket id) and get the cores on that socket id.
   In this case, ``82:00.0`` and ``85:00.0`` are both in socket 1, and the cores on socket 1 in the referenced platform
   are 18-35 and 54-71.
   Note: Don't use 2 logical cores on the same physical core (e.g. core18 has 2 logical cores, core18 and core54);
   instead, use 2 logical cores from different physical cores (e.g. core18 and core19).

4. Bind these two ports to igb_uio.

5. For an Intel Ethernet CNA XL710 40GbE port, we need at least two queue pairs to achieve best performance, so two
   queues per port will be required, and each queue pair will need a dedicated CPU core for receiving/transmitting packets.

6. The DPDK sample application ``l3fwd`` will be used for performance testing, using two ports for bi-directional forwarding.
   Compile the ``l3fwd`` sample with the default lpm mode.

7. The command line for running l3fwd would be something like the following::

      ./l3fwd -l 18-21 -n 4 -w 82:00.0 -w 85:00.0 \
              -- -p 0x3 --config '(0,0,18),(0,1,19),(1,0,20),(1,1,21)'

   This means that the application uses core 18 for port 0, queue pair 0 forwarding, core 19 for port 0, queue pair 1 forwarding,
   core 20 for port 1, queue pair 0 forwarding, and core 21 for port 1, queue pair 1 forwarding.

8. Configure the traffic at a traffic generator.

   * Start creating a stream on the packet generator.

   * Set the Ethernet II type to 0x0800.

Tx bytes affected by the link status change
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For firmware versions prior to 6.01 for the X710 series and 3.33 for the X722 series, the tx_bytes statistic is affected by
link down events: each time the link status changes to down, tx_bytes decreases by 110 bytes.