..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2016 Intel Corporation.

IXGBE Driver
============

Vector PMD for IXGBE
--------------------

Vector PMD uses Intel® SIMD instructions to optimize packet I/O.
It improves load/store bandwidth efficiency of the L1 data cache by using a wider SSE/AVX register (1).
The wider register gives space to hold multiple packet buffers, reducing the instruction count when processing packets in bulk.

There is no change to the PMD API. The RX/TX handlers are the only two entry points for vPMD packet I/O.
They are transparently registered at runtime for RX/TX execution if all condition checks pass.

1. To date, only an SSE version of the IXGBE vPMD is available.

Some constraints apply as pre-conditions for specific optimizations on bulk packet transfers.
The following sections explain RX and TX constraints in the vPMD.

RX Constraints
~~~~~~~~~~~~~~

Prerequisites and Pre-conditions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following prerequisites apply:

* To enable vPMD to work for RX, bulk allocation for RX must be allowed.

Ensure that the following pre-conditions are satisfied:

* rxq->rx_free_thresh >= RTE_PMD_IXGBE_RX_MAX_BURST

* rxq->rx_free_thresh < rxq->nb_rx_desc

* (rxq->nb_rx_desc % rxq->rx_free_thresh) == 0

* rxq->nb_rx_desc < (IXGBE_MAX_RING_DESC - RTE_PMD_IXGBE_RX_MAX_BURST)

These conditions are checked in the code.
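
A minimal sketch of an equivalent test, assuming a queue structure with the
fields named above, might look as follows:

.. code-block:: c

    /* Return 0 if the RX queue qualifies for bulk allocation (and thus
     * for the vector RX path), negative otherwise. Field and macro names
     * follow the constraints listed above. */
    static int
    rx_bulk_alloc_preconditions(const struct ixgbe_rx_queue *rxq)
    {
        if (rxq->rx_free_thresh < RTE_PMD_IXGBE_RX_MAX_BURST)
            return -1;
        if (rxq->rx_free_thresh >= rxq->nb_rx_desc)
            return -1;
        if (rxq->nb_rx_desc % rxq->rx_free_thresh != 0)
            return -1;
        if (rxq->nb_rx_desc >= IXGBE_MAX_RING_DESC - RTE_PMD_IXGBE_RX_MAX_BURST)
            return -1;
        return 0;
    }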

Scattered packets are not supported in this mode.
If an incoming packet is larger than the data size of a single mbuf (2 KB by default),
vPMD for RX is disabled.

By default, IXGBE_MAX_RING_DESC is set to 4096 and RTE_PMD_IXGBE_RX_MAX_BURST is set to 32.


Features not Supported by RX Vector PMD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some features are not supported in vPMD because it is optimized for throughput.
They are:

* IEEE1588

* FDIR

* Header split

* RX checksum offload

Other features are supported using optional MACRO configuration. They include:

* HW VLAN strip

* HW extend dual VLAN

To guarantee the constraint, capabilities in dev_conf.rxmode.offloads will be checked:

* DEV_RX_OFFLOAD_VLAN_STRIP

* DEV_RX_OFFLOAD_VLAN_EXTEND

* DEV_RX_OFFLOAD_CHECKSUM

* DEV_RX_OFFLOAD_HEADER_SPLIT

``dev_conf.fdir_conf.mode`` will also be checked.


VF Runtime Options
^^^^^^^^^^^^^^^^^^

The following ``devargs`` options can be enabled at runtime. They must
be passed as part of EAL arguments. For example,

.. code-block:: console

   testpmd -w af:10.0,pflink_fullchk=1 -- -i
93 | ||
94 | - ``pflink_fullchk`` (default **0**) | |
95 | ||
96 | When calling ``rte_eth_link_get_nowait()`` to get VF link status, | |
97 | this option is used to control how VF synchronizes its status with | |
98 | PF's. If set, VF will not only check the PF's physical link status | |
99 | by reading related register, but also check the mailbox status. We | |
100 | call this behavior as fully checking. And checking mailbox will | |
101 | trigger PF's mailbox interrupt generation. If unset, the application | |
102 | can get the VF's link status quickly by just reading the PF's link | |
103 | status register, this will avoid the whole system's mailbox interrupt | |
104 | generation. | |
105 | ||
106 | ``rte_eth_link_get()`` will still use the mailbox method regardless | |
107 | of the pflink_fullchk setting. | |
108 | ||
7c673cae FG |
109 | RX Burst Size |
110 | ^^^^^^^^^^^^^ | |
111 | ||
112 | As vPMD is focused on high throughput, it assumes that the RX burst size is equal to or greater than 32 per burst. | |
113 | It returns zero if using nb_pkt < 32 as the expected packet number in the receive handler. | |
114 | ||

TX Constraint
~~~~~~~~~~~~~

Prerequisite
^^^^^^^^^^^^

The only prerequisite is related to tx_rs_thresh.
The tx_rs_thresh value must be greater than or equal to RTE_PMD_IXGBE_TX_MAX_BURST,
but less than or equal to RTE_IXGBE_TX_MAX_FREE_BUF_SZ.
Consequently, by default the tx_rs_thresh value is in the range 32 to 64.


Features not Supported by TX Vector PMD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TX vPMD only works when ``offloads`` is set to 0.

This means that it does not support any TX offload.

Application Programming Interface
---------------------------------

In DPDK release v16.11 an API for ixgbe-specific functions was added to the ixgbe PMD.
The declarations for the API functions are in the header ``rte_pmd_ixgbe.h``.


Sample Application Notes
------------------------

l3fwd
~~~~~

When running l3fwd with vPMD, there is one thing to note.
In the configuration, ensure that DEV_RX_OFFLOAD_CHECKSUM in port_conf.rxmode.offloads is NOT set.
Otherwise, by default, RX vPMD is disabled.

load_balancer
~~~~~~~~~~~~~

As in the case of l3fwd, to enable vPMD, do NOT set DEV_RX_OFFLOAD_CHECKSUM in port_conf.rxmode.offloads.
In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in load_balancer
to avoid using the default burst size of 144, as illustrated below.


Limitations or Known Issues
---------------------------

Malicious Driver Detection not Supported
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Intel x550 series NICs support a feature called MDD (Malicious
Driver Detection) which checks the behavior of the VF driver.
If this feature is enabled, the VF must use the advanced context descriptor
correctly and set the CC (Check Context) bit.
The DPDK PF does not support MDD, but the kernel PF does, so problems can
arise in the kernel PF + DPDK VF scenario. If the user enables MDD in the
kernel PF, the DPDK VF will not work, because the kernel PF considers the VF
malicious even though it is not; the VF simply does not behave as MDD
requires.
Supporting MDD would have a significant performance impact: DPDK would need
to check whether the advanced context descriptor should be set and set it,
and it would have to obtain the header length from the upper layer, because
parsing the packet itself is not acceptable. Supporting MDD is therefore too
expensive.

When using kernel PF + DPDK VF on x550, please make sure to use a kernel
PF driver that disables MDD or can disable MDD.

Some kernel drivers already disable MDD by default while some kernels can use
the command ``insmod ixgbe.ko MDD=0,0`` to disable MDD. Each "0" in the
command refers to a port. For example, if there are 6 ixgbe ports, the command
should be changed to ``insmod ixgbe.ko MDD=0,0,0,0,0,0``.


Statistics
~~~~~~~~~~

The statistics of the ixgbe hardware must be polled regularly in order for
them to remain consistent. Running a DPDK application without polling the
statistics will cause registers on the hardware to count up to their maximum
value, and "stick" at that value.

In order to avoid the statistic registers ever reaching their maximum value,
read the statistics from the hardware using ``rte_eth_stats_get()`` or
``rte_eth_xstats_get()``.

The maximum time between statistics polls that ensures consistent results can
be calculated as follows:

.. code-block:: c

   max_read_interval = UINT_MAX / max_packets_per_second
   max_read_interval = 4294967295 / 14880952
   max_read_interval = 288.6218096127183 (seconds)
   max_read_interval = ~4 mins 48 sec.

In order to ensure valid results, it is recommended to poll every 4 minutes.


MTU setting
~~~~~~~~~~~

Although the user can set the MTU separately on PF and VF ports, the ixgbe NIC
only supports one global MTU per physical port.
So when the user sets different MTUs on PF and VF ports on one physical port,
the real MTU for all these PF and VF ports is the largest value set.
This behavior is based on the kernel driver behavior.
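
For example, with a PF and a VF on the same physical port (the port IDs below
are placeholders), the largest requested value wins:

.. code-block:: c

    #include <rte_ethdev.h>

    /* The NIC keeps one global MTU per physical port, so after these two
     * calls the effective MTU for both ports is 9000. */
    rte_eth_dev_set_mtu(pf_port_id, 9000);
    rte_eth_dev_set_mtu(vf_port_id, 1500);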
215 | ||
216 | VF MAC address setting | |
217 | ~~~~~~~~~~~~~~~~~~~~~~ | |
218 | ||
219 | On ixgbe, the concept of "pool" can be used for different things depending on | |
220 | the mode. In VMDq mode, "pool" means a VMDq pool. In IOV mode, "pool" means a | |
221 | VF. | |
222 | ||
223 | There is no RTE API to add a VF's MAC address from the PF. On ixgbe, the | |
224 | ``rte_eth_dev_mac_addr_add()`` function can be used to add a VF's MAC address, | |
225 | as a workaround. | |
226 | ||

X550 does not support legacy interrupt mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Description
^^^^^^^^^^^
X550 cannot get interrupts when using the ``uio_pci_generic`` module or the
legacy interrupt mode of ``igb_uio`` or ``vfio``, because an erratum of the
X550 states that the Interrupt Status bit is not implemented. The erratum is
item #22 in the `X550 spec update <https://www.intel.com/content/dam/www/public/us/en/
documents/specification-updates/ethernet-x550-spec-update.pdf>`_.

Implication
^^^^^^^^^^^
When using the ``uio_pci_generic`` module or the legacy interrupt mode of
``igb_uio`` or ``vfio``, the Interrupt Status bit is checked to see whether an
interrupt has arrived. Since the bit is not implemented in X550, the IRQ
cannot be handled correctly and the event fd cannot be reported to DPDK
applications. The applications therefore receive no interrupts, and ``dmesg``
will show messages like ``irq #No.: nobody cared``.

Workaround
^^^^^^^^^^
Do not bind the ``uio_pci_generic`` module to X550 NICs.
Do not bind ``igb_uio`` in legacy mode to X550 NICs.
Before binding ``vfio`` in legacy mode to X550 NICs, load the ``vfio`` module
with ``modprobe vfio nointxmask=1`` if INTx is not shared with other devices.

Inline crypto processing support
--------------------------------

Inline IPsec processing is supported for ``RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO``
mode for ESP packets only:

- ESP authentication only: AES-128-GMAC (128-bit key)
- ESP encryption and authentication: AES-128-GCM (128-bit key)

The IPsec Security Gateway Sample Application supports inline IPsec processing
for the ixgbe PMD.

For more details see the IPsec Security Gateway Sample Application and Security
library documentation.
269 | ||
270 | ||
271 | Virtual Function Port Representors | |
272 | ---------------------------------- | |
273 | The IXGBE PF PMD supports the creation of VF port representors for the control | |
274 | and monitoring of IXGBE virtual function devices. Each port representor | |
275 | corresponds to a single virtual function of that device. Using the ``devargs`` | |
276 | option ``representor`` the user can specify which virtual functions to create | |
277 | port representors for on initialization of the PF PMD by passing the VF IDs of | |
278 | the VFs which are required.:: | |
279 | ||
280 | -w DBDF,representor=[0,1,4] | |
281 | ||
282 | Currently hot-plugging of representor ports is not supported so all required | |
283 | representors must be specified on the creation of the PF. | |

Supported Chipsets and NICs
---------------------------

- Intel 82599EB 10 Gigabit Ethernet Controller
- Intel 82598EB 10 Gigabit Ethernet Controller
- Intel 82599ES 10 Gigabit Ethernet Controller
- Intel 82599EN 10 Gigabit Ethernet Controller
- Intel Ethernet Controller X540-AT2
- Intel Ethernet Controller X550-BT2
- Intel Ethernet Controller X550-AT2
- Intel Ethernet Controller X550-AT
- Intel Ethernet Converged Network Adapter X520-SR1
- Intel Ethernet Converged Network Adapter X520-SR2
- Intel Ethernet Converged Network Adapter X520-LR1
- Intel Ethernet Converged Network Adapter X520-DA1
- Intel Ethernet Converged Network Adapter X520-DA2
- Intel Ethernet Converged Network Adapter X520-DA4
- Intel Ethernet Converged Network Adapter X520-QDA1
- Intel Ethernet Converged Network Adapter X520-T2
- Intel 10 Gigabit AF DA Dual Port Server Adapter
- Intel 10 Gigabit AT Server Adapter
- Intel 10 Gigabit AT2 Server Adapter
- Intel 10 Gigabit CX4 Dual Port Server Adapter
- Intel 10 Gigabit XF LR Server Adapter
- Intel 10 Gigabit XF SR Dual Port Server Adapter
- Intel 10 Gigabit XF SR Server Adapter
- Intel Ethernet Converged Network Adapter X540-T1
- Intel Ethernet Converged Network Adapter X540-T2
- Intel Ethernet Converged Network Adapter X550-T1
- Intel Ethernet Converged Network Adapter X550-T2