.. SPDX-License-Identifier: BSD-3-Clause
   Copyright(c) 2010-2016 Intel Corporation.

IXGBE Driver
============

Vector PMD for IXGBE
--------------------

Vector PMD uses Intel® SIMD instructions to optimize packet I/O.
It improves the load/store bandwidth efficiency of the L1 data cache by using a wider SSE/AVX register (1).
The wider register gives space to hold multiple packet buffers, reducing the instruction count when processing packets in bulk.

There is no change to the PMD API. The RX/TX handlers are the only two entries for vPMD packet I/O.
They are transparently registered for RX/TX execution at runtime if all the condition checks pass.

1. To date, only an SSE version of the IXGBE vPMD is available.
   To ensure that vPMD is in the binary code, set the option CONFIG_RTE_IXGBE_INC_VECTOR=y in the configuration file.

Some constraints apply as pre-conditions for specific optimizations on bulk packet transfers.
The following sections explain RX and TX constraints in the vPMD.

RX Constraints
~~~~~~~~~~~~~~

Prerequisites and Pre-conditions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following prerequisites apply:

* To enable vPMD to work for RX, bulk allocation for Rx must be allowed.

Ensure that the following pre-conditions are satisfied:

* rxq->rx_free_thresh >= RTE_PMD_IXGBE_RX_MAX_BURST

* rxq->rx_free_thresh < rxq->nb_rx_desc

* (rxq->nb_rx_desc % rxq->rx_free_thresh) == 0

* rxq->nb_rx_desc < (IXGBE_MAX_RING_DESC - RTE_PMD_IXGBE_RX_MAX_BURST)

These conditions are checked in the code.
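
As an illustration, the following minimal sketch configures an RX queue so that it satisfies the pre-conditions above; the port ID, queue ID, descriptor count and mempool are hypothetical values chosen for the example:

.. code-block:: c

    #include <rte_ethdev.h>

    /* Hypothetical values chosen to satisfy the vPMD RX pre-conditions:
     * rx_free_thresh = 32 >= RTE_PMD_IXGBE_RX_MAX_BURST (32),
     * 32 < 1024, 1024 % 32 == 0 and 1024 < (4096 - 32). */
    static int
    setup_vpmd_rx_queue(uint16_t port_id, struct rte_mempool *mb_pool)
    {
        struct rte_eth_dev_info dev_info;
        struct rte_eth_rxconf rx_conf;

        rte_eth_dev_info_get(port_id, &dev_info);
        rx_conf = dev_info.default_rxconf;   /* start from the driver defaults */
        rx_conf.rx_free_thresh = 32;         /* divides nb_rx_desc below evenly */

        return rte_eth_rx_queue_setup(port_id, 0 /* queue */,
                                      1024 /* nb_rx_desc */,
                                      rte_eth_dev_socket_id(port_id),
                                      &rx_conf, mb_pool);
    }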

Scattered packets are not supported in this mode.
If an incoming packet is greater than the maximum acceptable length of one "mbuf" data size (by default, the size is 2 KB),
vPMD for RX would be disabled.

By default, IXGBE_MAX_RING_DESC is set to 4096 and RTE_PMD_IXGBE_RX_MAX_BURST is set to 32.

Feature not Supported by RX Vector PMD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some features are not supported when trying to increase the throughput in vPMD.
They are:

* IEEE1588

* FDIR

* Header split

* RX checksum offload

Other features are supported using optional MACRO configuration. They include:

* HW VLAN strip

* HW extend dual VLAN

To guarantee the constraint, capabilities in dev_conf.rxmode.offloads will be checked:

* DEV_RX_OFFLOAD_VLAN_STRIP

* DEV_RX_OFFLOAD_VLAN_EXTEND

* DEV_RX_OFFLOAD_CHECKSUM

* DEV_RX_OFFLOAD_HEADER_SPLIT

In addition, ``dev_conf.fdir_conf->mode`` will also be checked.

RX Burst Size
^^^^^^^^^^^^^

As vPMD is focused on high throughput, it assumes that the RX burst size is equal to or greater than 32 packets per burst.
The receive handler returns zero if it is called with nb_pkts < 32 as the expected number of packets.
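
For example, a minimal receive loop that respects this assumption might look as follows (the port and queue IDs and the mbuf handling are illustrative):

.. code-block:: c

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 32   /* must be >= 32 for the RX vPMD path */

    static void
    poll_rx(uint16_t port_id, uint16_t queue_id)
    {
        struct rte_mbuf *pkts[BURST_SIZE];
        uint16_t nb_rx, i;

        /* Asking for fewer than 32 packets makes the vPMD handler return
         * zero, so always request at least a full burst. */
        nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, BURST_SIZE);
        for (i = 0; i < nb_rx; i++)
            rte_pktmbuf_free(pkts[i]);   /* placeholder processing */
    }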

TX Constraint
~~~~~~~~~~~~~

Prerequisite
^^^^^^^^^^^^

The only prerequisite is related to tx_rs_thresh.
The tx_rs_thresh value must be greater than or equal to RTE_PMD_IXGBE_TX_MAX_BURST,
but less than or equal to RTE_IXGBE_TX_MAX_FREE_BUF_SZ.
Consequently, by default the tx_rs_thresh value is in the range 32 to 64.

Feature not Supported by TX Vector PMD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TX vPMD only works when ``offloads`` is set to 0.
This means that it does not support any TX offload.
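
Putting the two TX conditions together, a sketch of a TX queue setup that keeps the vector path enabled might look like this (the port ID, queue ID and ring size are hypothetical):

.. code-block:: c

    #include <rte_ethdev.h>

    static int
    setup_vpmd_tx_queue(uint16_t port_id)
    {
        struct rte_eth_dev_info dev_info;
        struct rte_eth_txconf tx_conf;

        rte_eth_dev_info_get(port_id, &dev_info);
        tx_conf = dev_info.default_txconf;
        tx_conf.tx_rs_thresh = 32;   /* within [RTE_PMD_IXGBE_TX_MAX_BURST,
                                      * RTE_IXGBE_TX_MAX_FREE_BUF_SZ] */
        tx_conf.offloads = 0;        /* any TX offload disables the vector path */

        return rte_eth_tx_queue_setup(port_id, 0 /* queue */,
                                      512 /* nb_tx_desc */,
                                      rte_eth_dev_socket_id(port_id),
                                      &tx_conf);
    }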

Application Programming Interface
---------------------------------

In DPDK release v16.11 an API for ixgbe-specific functions was added to the ixgbe PMD.
The declarations for the API functions are in the header ``rte_pmd_ixgbe.h``.
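
For example, one of these functions can be used to enable MAC anti-spoofing on a VF from the PF. This is a hedged sketch; check ``rte_pmd_ixgbe.h`` in your DPDK version for the exact set of functions and signatures:

.. code-block:: c

    #include <stdio.h>
    #include <rte_pmd_ixgbe.h>

    static void
    enable_vf_mac_anti_spoof(uint16_t port_id, uint16_t vf_id)
    {
        /* Enable MAC anti-spoofing for the given VF from the PF. */
        int ret = rte_pmd_ixgbe_set_vf_mac_anti_spoof(port_id, vf_id, 1);
        if (ret != 0)
            printf("failed to enable MAC anti-spoof: %d\n", ret);
    }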

Sample Application Notes
------------------------

l3fwd
~~~~~

When running l3fwd with vPMD, there is one thing to note.
In the configuration, ensure that DEV_RX_OFFLOAD_CHECKSUM in port_conf.rxmode.offloads is NOT set.
Otherwise, by default, RX vPMD is disabled.
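
A minimal sketch of such a port configuration (field names as in the ethdev API; everything else is illustrative):

.. code-block:: c

    #include <rte_ethdev.h>

    static struct rte_eth_conf port_conf = {
        .rxmode = {
            /* Leave DEV_RX_OFFLOAD_CHECKSUM out of .offloads so the
             * RX vector PMD stays enabled. */
            .offloads = 0,
        },
    };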

load_balancer
~~~~~~~~~~~~~

As in the case of l3fwd, to enable vPMD, do NOT set DEV_RX_OFFLOAD_CHECKSUM in port_conf.rxmode.offloads.
In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in load_balancer to avoid using the default burst size of 144.

Limitations or Known issues
---------------------------

Malicious Driver Detection not Supported
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Intel x550 series NICs support a feature called MDD (Malicious
Driver Detection) which checks the behavior of the VF driver.
If this feature is enabled, the VF must use the advanced context descriptor
correctly and set the CC (Check Context) bit.
The DPDK PF does not support MDD, but the kernel PF does, so problems can arise
when a kernel PF is combined with a DPDK VF. If MDD is enabled in the kernel PF,
the DPDK VF will not work, because the kernel PF considers the VF malicious.
In reality the VF is not malicious; it simply does not behave as MDD requires.
Supporting MDD would have a significant performance impact: DPDK would have to
determine whether the advanced context descriptor should be set and set it, and
it would have to obtain the header length from the upper layer, because parsing
the packet itself is not acceptable. Supporting MDD is therefore too expensive.

When using kernel PF + DPDK VF on x550, please make sure to use a kernel
PF driver that disables MDD or can disable MDD.

Some kernel drivers already disable MDD by default while other drivers can use
the command ``insmod ixgbe.ko MDD=0,0`` to disable MDD. Each "0" in the
command refers to a port. For example, if there are 6 ixgbe ports, the command
should be changed to ``insmod ixgbe.ko MDD=0,0,0,0,0,0``.

Statistics
~~~~~~~~~~

The statistics of ixgbe hardware must be polled regularly in order for them to
remain consistent. Running a DPDK application without polling the statistics will
cause registers on hardware to count to the maximum value, and "stick" at
that value.

In order to avoid statistic registers ever reaching the maximum value,
read the statistics from the hardware using ``rte_eth_stats_get()`` or
``rte_eth_xstats_get()``.

The maximum time between statistics polls that ensures consistent results can
be calculated as follows:

.. code-block:: c

    max_read_interval = UINT_MAX / max_packets_per_second
    max_read_interval = 4294967295 / 14880952
    max_read_interval = 288.6218096127183 (seconds)
    max_read_interval = ~4 mins 48 sec.

In order to ensure valid results, it is recommended to poll every 4 minutes.
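
A minimal sketch of such a polling loop (the 4-minute interval follows from the calculation above; the port ID is hypothetical):

.. code-block:: c

    #include <stdio.h>
    #include <inttypes.h>
    #include <unistd.h>
    #include <rte_ethdev.h>

    static void
    poll_stats_forever(uint16_t port_id)
    {
        struct rte_eth_stats stats;

        for (;;) {
            /* Reading the counters before they saturate keeps them valid. */
            if (rte_eth_stats_get(port_id, &stats) == 0)
                printf("ipackets=%" PRIu64 " opackets=%" PRIu64 "\n",
                       stats.ipackets, stats.opackets);
            sleep(240);   /* well within the ~4 min 48 s maximum interval */
        }
    }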

MTU setting
~~~~~~~~~~~

Although the user can set the MTU separately on PF and VF ports, the ixgbe NIC
only supports one global MTU per physical port.
So when the user sets different MTUs on PF and VF ports in one physical port,
the real MTU for all these PF and VF ports is the largest value set.
This behavior is based on the kernel driver behavior.
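
For instance, in the hypothetical case below the effective MTU for both ports ends up being 9000:

.. code-block:: c

    #include <rte_ethdev.h>

    /* Hypothetical port IDs: a PF and one of its VFs on the same physical port. */
    static void
    set_mtus(uint16_t pf_port, uint16_t vf_port)
    {
        rte_eth_dev_set_mtu(pf_port, 1500);
        rte_eth_dev_set_mtu(vf_port, 9000);
        /* The hardware applies the largest configured value (9000),
         * so PF traffic is also accepted up to that size. */
    }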

VF MAC address setting
~~~~~~~~~~~~~~~~~~~~~~

On ixgbe, the concept of "pool" can be used for different things depending on
the mode. In VMDq mode, "pool" means a VMDq pool. In IOV mode, "pool" means a
VF.

There is no RTE API to add a VF's MAC address from the PF. On ixgbe, the
``rte_eth_dev_mac_addr_add()`` function can be used to add a VF's MAC address,
as a workaround.
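
A hedged sketch of this workaround, assuming IOV mode where the pool index corresponds to the VF number; the MAC address and VF index are illustrative, and newer DPDK releases spell the address type ``rte_ether_addr``:

.. code-block:: c

    #include <rte_ethdev.h>
    #include <rte_ether.h>

    static int
    add_vf_mac(uint16_t pf_port, uint32_t vf_pool)
    {
        /* Illustrative locally-administered MAC address. */
        struct ether_addr mac = {
            .addr_bytes = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 },
        };

        /* In IOV mode the pool index selects the VF. */
        return rte_eth_dev_mac_addr_add(pf_port, &mac, vf_pool);
    }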

X550 does not support legacy interrupt mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Description
^^^^^^^^^^^
X550 cannot get interrupts if using the ``uio_pci_generic`` module or the legacy
interrupt mode of ``igb_uio`` or ``vfio``, because the X550 errata states that
the Interrupt Status bit is not implemented. The errata is item #22 from the
`X550 spec update <https://www.intel.com/content/dam/www/public/us/en/
documents/specification-updates/ethernet-x550-spec-update.pdf>`_.

Implication
^^^^^^^^^^^
When using the ``uio_pci_generic`` module or the legacy interrupt mode of
``igb_uio`` or ``vfio``, the Interrupt Status bit is checked to determine whether
an interrupt has arrived. Since the bit is not implemented in X550, the irq cannot
be handled correctly and the event fd cannot be reported to DPDK apps. The apps
therefore cannot receive interrupts, and ``dmesg`` will show messages like
``irq #No.: nobody cared``.

Workaround
^^^^^^^^^^
Do not bind the ``uio_pci_generic`` module to X550 NICs.
Do not bind ``igb_uio`` in legacy mode to X550 NICs.
Before binding ``vfio`` in legacy mode to X550 NICs, use
``modprobe vfio nointxmask=1`` to load the ``vfio`` module, provided the intx
is not shared with other devices.

Inline crypto processing support
--------------------------------

Inline IPsec processing is supported for ``RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO``
mode for ESP packets only:

- ESP authentication only: AES-128-GMAC (128-bit key)
- ESP encryption and authentication: AES-128-GCM (128-bit key)

IPsec Security Gateway Sample Application supports inline IPsec processing for
ixgbe PMD.

For more details see the IPsec Security Gateway Sample Application and Security
library documentation.

Virtual Function Port Representors
----------------------------------
The IXGBE PF PMD supports the creation of VF port representors for the control
and monitoring of IXGBE virtual function devices. Each port representor
corresponds to a single virtual function of that device. Using the ``devargs``
option ``representor``, the user can specify which virtual functions to create
port representors for when the PF PMD is initialized, by passing the VF IDs of
the VFs that are required::

    -w DBDF,representor=[0,1,4]

Currently hot-plugging of representor ports is not supported so all required
representors must be specified on the creation of the PF.
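
For example, a hypothetical ``testpmd`` invocation that creates representors for VFs 0, 1 and 4 of a PF at the illustrative PCI address 0000:82:00.0 might be::

    testpmd -w 0000:82:00.0,representor=[0,1,4] -- -i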

Supported Chipsets and NICs
---------------------------

- Intel 82599EB 10 Gigabit Ethernet Controller
- Intel 82598EB 10 Gigabit Ethernet Controller
- Intel 82599ES 10 Gigabit Ethernet Controller
- Intel 82599EN 10 Gigabit Ethernet Controller
- Intel Ethernet Controller X540-AT2
- Intel Ethernet Controller X550-BT2
- Intel Ethernet Controller X550-AT2
- Intel Ethernet Controller X550-AT
- Intel Ethernet Converged Network Adapter X520-SR1
- Intel Ethernet Converged Network Adapter X520-SR2
- Intel Ethernet Converged Network Adapter X520-LR1
- Intel Ethernet Converged Network Adapter X520-DA1
- Intel Ethernet Converged Network Adapter X520-DA2
- Intel Ethernet Converged Network Adapter X520-DA4
- Intel Ethernet Converged Network Adapter X520-QDA1
- Intel Ethernet Converged Network Adapter X520-T2
- Intel 10 Gigabit AF DA Dual Port Server Adapter
- Intel 10 Gigabit AT Server Adapter
- Intel 10 Gigabit AT2 Server Adapter
- Intel 10 Gigabit CX4 Dual Port Server Adapter
- Intel 10 Gigabit XF LR Server Adapter
- Intel 10 Gigabit XF SR Dual Port Server Adapter
- Intel 10 Gigabit XF SR Server Adapter
- Intel Ethernet Converged Network Adapter X540-T1
- Intel Ethernet Converged Network Adapter X540-T2
- Intel Ethernet Converged Network Adapter X550-T1
- Intel Ethernet Converged Network Adapter X550-T2