..  BSD LICENSE
    Copyright(c) 2016 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

I40E Poll Mode Driver
======================

The I40E PMD (librte_pmd_i40e) provides poll mode driver support
for the Intel X710/XL710/X722 10/40 Gbps family of adapters.


Features
--------

Features of the I40E PMD are:

- Multiple queues for TX and RX
- Receive Side Scaling (RSS)
- MAC/VLAN filtering
- Packet type information
- Flow director
- Cloud filter
- Checksum offload
- VLAN/QinQ stripping and inserting
- TSO offload
- Promiscuous mode
- Multicast mode
- Port hardware statistics
- Jumbo frames
- Link state information
- Link flow control
- Mirror on port, VLAN and VSI
- Interrupt mode for RX
- Scatter and gather for TX and RX
- Vector Poll mode driver
- DCB
- VMDQ
- SR-IOV VF
- Hot plug
- IEEE1588/802.1AS timestamping
- VF Daemon (VFD) - EXPERIMENTAL


Prerequisites
-------------

- Identify your adapter using `Intel Support
  <http://www.intel.com/support>`_ and get the latest NVM/FW images.

- Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to set up the basic DPDK environment.

- To get better performance on Intel platforms, please follow the "How to get best performance with NICs on Intel platforms"
  section of the :ref:`Getting Started Guide for Linux <linux_gsg>`.


Pre-Installation Configuration
------------------------------

Config File Options
~~~~~~~~~~~~~~~~~~~

The following options can be modified in the ``config`` file.
Please note that enabling debugging options may affect system performance.

- ``CONFIG_RTE_LIBRTE_I40E_PMD`` (default ``y``)

  Toggle compilation of the ``librte_pmd_i40e`` driver.

- ``CONFIG_RTE_LIBRTE_I40E_DEBUG_*`` (default ``n``)

  Toggle display of generic debugging messages.

- ``CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC`` (default ``y``)

  Toggle bulk allocation for RX.

- ``CONFIG_RTE_LIBRTE_I40E_INC_VECTOR`` (default ``n``)

  Toggle the use of the Vector PMD instead of the normal RX/TX path.
  To enable vPMD for RX, bulk allocation for RX must be allowed.

- ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC`` (default ``n``)

  Toggle the use of a 16-byte RX descriptor. By default, the RX descriptor is 32 bytes.

- ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF`` (default ``64``)

  Number of queues reserved for PF.

- ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF`` (default ``4``)

  Number of queues reserved for each SR-IOV VF.

- ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM`` (default ``4``)

  Number of queues reserved for each VMDQ Pool.

- ``CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL`` (default ``-1``)

  Interrupt Throttling interval.
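
The dependency between the vector PMD and RX bulk allocation noted in the list
above can be checked mechanically before building. The following is a minimal
sketch, assuming the options are kept in a plain ``KEY=value`` text file.
``parse_config`` and ``check_i40e_config`` are hypothetical helper names used
for illustration only; they are not part of DPDK.

.. code-block:: python

    def parse_config(text):
        """Parse KEY=value lines, ignoring blank lines and '#' comments."""
        options = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, sep, value = line.partition("=")
            if sep:
                options[key.strip()] = value.strip()
        return options


    def check_i40e_config(options):
        """Return warnings for the option dependency described in this guide."""
        warnings = []
        # Defaults below are the ones documented in this section.
        vector = options.get("CONFIG_RTE_LIBRTE_I40E_INC_VECTOR", "n")
        bulk = options.get("CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC", "y")
        if vector == "y" and bulk != "y":
            warnings.append("vector RX requires RX_ALLOW_BULK_ALLOC=y")
        return warnings

An empty list from ``check_i40e_config`` means the one documented dependency is
satisfied; the sketch does not validate any other option.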


Driver compilation and testing
------------------------------

Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
for details.


SR-IOV: Prerequisites and sample Application Notes
--------------------------------------------------

#. Load the kernel module:

   .. code-block:: console

      modprobe i40e

   Check the output in dmesg:

   .. code-block:: console

      i40e 0000:83:00.1 ens802f0: renamed from eth0

#. Bring up the PF ports:

   .. code-block:: console

      ifconfig ens802f0 up

#. Create VF device(s):

   Echo the number of VFs to be created into the ``sriov_numvfs`` sysfs entry
   of the parent PF.

   Example:

   .. code-block:: console

      echo 2 > /sys/devices/pci0000:00/0000:00:03.0/0000:81:00.0/sriov_numvfs


#. Assign VF MAC address:

   Assign MAC address to the VF using iproute2 utility. The syntax is:

   .. code-block:: console

      ip link set <PF netdev id> vf <VF id> mac <macaddr>

   Example:

   .. code-block:: console

      ip link set ens802f0 vf 0 mac a0:b0:c0:d0:e0:f0

#. Assign VF to VM, and bring up the VM.
   Please see the documentation for the *I40E/IXGBE/IGB Virtual Function Driver*.

#. Running testpmd:

   Follow instructions available in the document
   :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
   to run testpmd.

   Example output:

   .. code-block:: console

      ...
      EAL: PCI device 0000:83:00.0 on NUMA socket 1
      EAL: probe driver: 8086:1572 rte_i40e_pmd
      EAL: PCI memory mapped at 0x7f7f80000000
      EAL: PCI memory mapped at 0x7f7f80800000
      PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000208a
      Interactive-mode selected
      Configuring Port 0 (socket 0)
      ...

      PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are
      satisfied.Rx Burst Bulk Alloc function will be used on port=0, queue=0.

      ...
      Port 0: 68:05:CA:26:85:84
      Checking link statuses...
      Port 0 Link Up - speed 10000 Mbps - full-duplex
      Done

      testpmd>
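
The host-side steps above (load the module, bring up the PF, create the VFs and
assign their MAC addresses) can be collected into a single ordered command list.
The following is a minimal sketch that only builds the command strings and does
not execute anything. ``sriov_setup_commands`` is a hypothetical helper, and the
second MAC address in the usage example is made up for illustration.

.. code-block:: python

    def sriov_setup_commands(pf_netdev, pf_sysfs_dir, vf_macs):
        """Return the shell commands for the PF/VF steps, in order.

        One VF is created per MAC address in ``vf_macs``; VF ids start at 0.
        """
        commands = [
            "modprobe i40e",
            "ifconfig {} up".format(pf_netdev),
            "echo {} > {}/sriov_numvfs".format(len(vf_macs), pf_sysfs_dir),
        ]
        for vf_id, mac in enumerate(vf_macs):
            commands.append(
                "ip link set {} vf {} mac {}".format(pf_netdev, vf_id, mac)
            )
        return commands


    if __name__ == "__main__":
        for cmd in sriov_setup_commands(
            "ens802f0",
            "/sys/devices/pci0000:00/0000:00:03.0/0000:81:00.0",
            ["a0:b0:c0:d0:e0:f0", "a0:b0:c0:d0:e0:f1"],  # second MAC is hypothetical
        ):
            print(cmd)

Generating the commands rather than running them keeps the sketch free of side
effects; the output can be reviewed and then executed with root privileges.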


Sample Application Notes
------------------------

VLAN filter
~~~~~~~~~~~

The VLAN filter only works when promiscuous mode is off.

To start ``testpmd``, and add VLAN 10 to port 0:

.. code-block:: console

    ./app/testpmd -l 0-15 -n 4 -- -i --forward-mode=mac
    ...

    testpmd> set promisc 0 off
    testpmd> rx_vlan add 10 0


Flow Director
~~~~~~~~~~~~~

The Flow Director works in receive mode to identify specific flows or sets of flows and route them to specific queues.
The Flow Director filters can match different fields for different types of packets: flow type, specific input set per flow type and the flexible payload.

The default input set of each flow type is::

   ipv4-other : src_ip_address, dst_ip_address
   ipv4-frag  : src_ip_address, dst_ip_address
   ipv4-tcp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv4-udp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv4-sctp  : src_ip_address, dst_ip_address, src_port, dst_port,
                verification_tag
   ipv6-other : src_ip_address, dst_ip_address
   ipv6-frag  : src_ip_address, dst_ip_address
   ipv6-tcp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv6-udp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv6-sctp  : src_ip_address, dst_ip_address, src_port, dst_port,
                verification_tag
   l2_payload : ether_type

The flex payload is selected from offset 0 to 15 of the packet's payload by default, while it is masked out from matching.

Start ``testpmd`` with ``--disable-rss`` and ``--pkt-filter-mode=perfect``:

.. code-block:: console

   ./app/testpmd -l 0-15 -n 4 -- -i --disable-rss --pkt-filter-mode=perfect \
                 --rxq=8 --txq=8 --nb-cores=8 --nb-ports=1

Add a rule to direct ``ipv4-udp`` packets with ``dst_ip=2.2.2.5, src_ip=2.2.2.3, src_port=32, dst_port=32`` to queue 1:

.. code-block:: console

   testpmd> flow_director_filter 0 mode IP add flow ipv4-udp  \
            src 2.2.2.3 32 dst 2.2.2.5 32 vlan 0 flexbytes () \
            fwd pf queue 1 fd_id 1
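
When many rules of this form are needed, the command line can be generated
rather than typed. The following is a minimal sketch that reproduces the exact
syntax of the ``ipv4-udp`` rule shown above, with basic validation of the
addresses and ports. ``fdir_add_ipv4_udp`` is a hypothetical helper and covers
only this one flow type; the fixed ``vlan 0`` and empty ``flexbytes ()`` are
copied from the example, not derived from the testpmd grammar.

.. code-block:: python

    import ipaddress


    def fdir_add_ipv4_udp(port, src_ip, src_port, dst_ip, dst_port, queue, fd_id):
        """Build the testpmd command that adds an ipv4-udp flow director rule."""
        # IPv4Address raises ValueError for malformed addresses.
        ipaddress.IPv4Address(src_ip)
        ipaddress.IPv4Address(dst_ip)
        for l4_port in (src_port, dst_port):
            if not 0 <= l4_port <= 65535:
                raise ValueError("L4 port out of range: {}".format(l4_port))
        return (
            "flow_director_filter {} mode IP add flow ipv4-udp "
            "src {} {} dst {} {} vlan 0 flexbytes () "
            "fwd pf queue {} fd_id {}"
        ).format(port, src_ip, src_port, dst_ip, dst_port, queue, fd_id)


    if __name__ == "__main__":
        print(fdir_add_ipv4_udp(0, "2.2.2.3", 32, "2.2.2.5", 32, queue=1, fd_id=1))

The usage line rebuilds the rule from the example as a single line, without the
backslash continuations.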

Check the flow director status:

.. code-block:: console

   testpmd> show port fdir 0

   ######################## FDIR infos for port 0 ####################
     MODE: PERFECT
     SUPPORTED FLOW TYPE: ipv4-frag ipv4-tcp ipv4-udp ipv4-sctp ipv4-other
                          ipv6-frag ipv6-tcp ipv6-udp ipv6-sctp ipv6-other
                          l2_payload
     FLEX PAYLOAD INFO:
     max_len:       16          payload_limit: 480
     payload_unit:  2           payload_seg:   3
     bitmask_unit:  2           bitmask_num:   2
     MASK:
       vlan_tci: 0x0000,
       src_ipv4: 0x00000000,
       dst_ipv4: 0x00000000,
       src_port: 0x0000,
       dst_port: 0x0000
       src_ipv6: 0x00000000,0x00000000,0x00000000,0x00000000,
       dst_ipv6: 0x00000000,0x00000000,0x00000000,0x00000000
     FLEX PAYLOAD SRC OFFSET:
       L2_PAYLOAD: 0 1 2 3 4 5 6 ...
       L3_PAYLOAD: 0 1 2 3 4 5 6 ...
       L4_PAYLOAD: 0 1 2 3 4 5 6 ...
     FLEX MASK CFG:
       ipv4-udp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-tcp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-sctp:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-other: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-frag:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-udp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-tcp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-sctp:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-other: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-frag:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       l2_payload: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     guarant_count: 1           best_count:    0
     guarant_space: 512         best_space:    7168
     collision:     0           free:          0
     maxhash:       0           maxlen:        0
     add:           0           remove:        0
     f_add:         0           f_remove:      0


Delete all flow director rules on a port:

.. code-block:: console

   testpmd> flush_flow_director 0

Floating VEB
~~~~~~~~~~~~

The Intel® Ethernet Controller X710 and XL710 Family supports a feature called
"Floating VEB".

A Virtual Ethernet Bridge (VEB) is an IEEE Edge Virtual Bridging (EVB) term
for functionality that allows local switching between virtual endpoints within
a physical endpoint and also with an external bridge/network.

A "Floating" VEB doesn't have an uplink connection to the outside world so all
switching is done internally and remains within the host. As such, this
feature provides security benefits.

In addition, a Floating VEB overcomes a limitation of normal VEBs where they
cannot forward packets when the physical link is down. Floating VEBs don't need
to connect to the NIC port so they can still forward traffic from VF to VF
even when the physical link is down.

Therefore, with this feature enabled VFs can be limited to communicating with
each other but not an outside network, and they can do so even when there is
no physical uplink on the associated NIC port.

To enable this feature, the user should pass a ``devargs`` parameter to the
EAL, for example::

    -w 84:00.0,enable_floating_veb=1

In this configuration the PMD will use the floating VEB feature for all the
VFs created by this PF device.

Alternatively, the user can specify which VFs need to connect to this floating
VEB using the ``floating_veb_list`` argument::

    -w 84:00.0,enable_floating_veb=1,floating_veb_list=1;3-4

In this example ``VF1``, ``VF3`` and ``VF4`` connect to the floating VEB,
while other VFs connect to the normal VEB.
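
To make the ``floating_veb_list`` syntax concrete, the following minimal sketch
expands a list value into explicit VF ids. It assumes, based only on the example
above, that entries are separated by ``;`` and that ``a-b`` denotes an inclusive
range. ``parse_floating_veb_list`` is a hypothetical helper for illustration and
is not the parser used by the PMD.

.. code-block:: python

    def parse_floating_veb_list(value):
        """Expand a value such as "1;3-4" into a sorted list of VF ids."""
        vf_ids = set()
        for item in value.split(";"):
            item = item.strip()
            if not item:
                continue
            first, sep, last = item.partition("-")
            low = int(first)
            high = int(last) if sep else low
            if low > high:
                raise ValueError("invalid VF range: {}".format(item))
            # Ranges are treated as inclusive on both ends.
            vf_ids.update(range(low, high + 1))
        return sorted(vf_ids)


    if __name__ == "__main__":
        print(parse_floating_veb_list("1;3-4"))

For the value ``1;3-4`` this yields VF ids 1, 3 and 4, which matches the VFs
named in the example.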

The current implementation only supports one floating VEB and one regular
VEB. VFs can connect to a floating VEB or a regular VEB according to the
configuration passed on the EAL command line.

The floating VEB functionality requires a NIC firmware version of 5.0
or greater.


Limitations or Known issues
---------------------------

MPLS packet classification on X710/XL710
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For firmware versions prior to 5.0, MPLS packets are not recognized by the NIC.
The L2 Payload flow type in flow director can be used to classify MPLS packets
by using a command in testpmd like::

   testpmd> flow_director_filter 0 mode IP add flow l2_payload ether \
            0x8847 flexbytes () fwd pf queue <N> fd_id <M>

With NIC firmware version 5.0 or greater, some limited MPLS support
is added: native MPLS (MPLS in Ethernet) skip is implemented, but no
new packet type, classification or offload is available. With this change, the
L2 Payload flow type in flow director can no longer be used to classify MPLS
packets as with previous firmware versions. Instead, the Ethertype filter can be
used to classify MPLS packets by using a command in testpmd like::

   testpmd> ethertype_filter 0 add mac_ignr 00:00:00:00:00:00 ethertype \
            0x8847 fwd queue <M>
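
The firmware-dependent choice between the two commands can be summarised in
code. The following is a minimal sketch that returns the appropriate testpmd
command string for a given firmware major version, using only the two command
forms quoted above. ``mpls_classify_command`` is a hypothetical helper, and how
the firmware version is obtained is left to the caller.

.. code-block:: python

    MPLS_ETHERTYPE = "0x8847"


    def mpls_classify_command(fw_major, port, queue, fd_id=None):
        """Return the testpmd command to steer MPLS packets to ``queue``."""
        if fw_major < 5:
            # Older firmware: classify through the flow director L2 payload type.
            if fd_id is None:
                raise ValueError("fd_id is required for flow director rules")
            return (
                "flow_director_filter {} mode IP add flow l2_payload ether "
                "{} flexbytes () fwd pf queue {} fd_id {}"
            ).format(port, MPLS_ETHERTYPE, queue, fd_id)
        # Firmware 5.0 or greater: use the Ethertype filter instead.
        return (
            "ethertype_filter {} add mac_ignr 00:00:00:00:00:00 ethertype "
            "{} fwd queue {}"
        ).format(port, MPLS_ETHERTYPE, queue)

For example, ``mpls_classify_command(4, 0, 2, fd_id=1)`` produces a flow
director rule, while ``mpls_classify_command(5, 0, 2)`` produces an Ethertype
filter rule.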

16 Byte Descriptor cannot be used on DPDK VF
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the Linux i40e kernel driver is used as the host driver while the DPDK i40e
PMD is used as the VF driver, DPDK cannot use the 16-byte receive descriptor.
That is, the user should keep ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n`` in
the config file.

Link down with i40e kernel driver after DPDK application exit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After the DPDK application quits and the device is bound back to the Linux i40e
kernel driver, the link cannot be brought up with ``ifconfig <dev> up``.
To work around this issue, first run ``ethtool -s <dev> autoneg on``;
the link can then be brought up with ``ifconfig <dev> up``.

NOTE: requires Linux kernel i40e driver version >= 1.4.X

Receive packets with Ethertype 0x88A8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Due to a firmware limitation, the PF can receive packets with Ethertype 0x88A8
only when floating VEB is disabled.

Incorrect Rx statistics when packet is oversize
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a packet exceeds the maximum frame size, the packet is dropped.
However, the Rx statistics returned by ``rte_eth_stats_get`` incorrectly
show it as received.

VF & TC max bandwidth setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The per VF max bandwidth and per TC max bandwidth cannot be enabled in parallel.
The behavior differs depending on which of the two settings is being enabled.
When enabling per VF max bandwidth, SW checks whether per TC max bandwidth is
enabled. If so, it returns failure.
When enabling per TC max bandwidth, SW checks whether per VF max bandwidth
is enabled. If so, it disables per VF max bandwidth and continues with the
per TC max bandwidth setting.
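
The asymmetry between the two settings is easy to misread, so the following
minimal sketch models it as a small state object. It is an illustration of the
rules stated above only: ``BandwidthState`` and its methods are hypothetical
names, not the PMD's data structures or API.

.. code-block:: python

    class BandwidthState:
        """Track which of the two mutually exclusive limits is enabled."""

        def __init__(self):
            self.vf_max_bw_enabled = False
            self.tc_max_bw_enabled = False

        def enable_vf_max_bw(self):
            """Fail (return False) if per TC max bandwidth is already enabled."""
            if self.tc_max_bw_enabled:
                return False
            self.vf_max_bw_enabled = True
            return True

        def enable_tc_max_bw(self):
            """Always succeed; per VF max bandwidth is disabled first if set."""
            if self.vf_max_bw_enabled:
                self.vf_max_bw_enabled = False
            self.tc_max_bw_enabled = True
            return True

In this model, enabling the per TC limit silently overrides an existing per VF
limit, whereas enabling the per VF limit after the per TC limit is rejected and
leaves the state unchanged.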

TC TX scheduling mode setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are two TX scheduling modes for TCs: round robin and strict priority.
If a TC is set to strict priority mode, it can consume unlimited bandwidth.
This means that any max bandwidth the application has set for that TC has no
effect.
It is suggested to set strict priority mode for a TC that is latency
sensitive but does not consume much bandwidth.