..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2016 Intel Corporation.

Flow Bifurcation How-to Guide
=============================

Flow Bifurcation is a mechanism which uses hardware-capable Ethernet devices
to split traffic between Linux user space and kernel space. Since it is a
hardware-assisted feature, this approach can provide line rate processing
capability. Unlike :ref:`KNI <kni>`, the software only needs to configure the
device; it does not need to handle packet movement during the traffic split.
This can yield better performance with less CPU overhead.

Flow Bifurcation splits the incoming data traffic between user space
applications (such as DPDK applications) and kernel space programs (such as
the Linux kernel stack). It can direct some traffic, for example data plane
traffic, to DPDK, while directing other traffic, for example control
plane traffic, to the traditional Linux networking stack.

There are a number of technical options to achieve this. A typical example is
to combine SR-IOV with packet classification filtering.

SR-IOV is a PCI standard that allows the same physical adapter to be split into
multiple virtual functions. Each virtual function (VF) has queues that are
separate from those of the physical function (PF). The network adapter directs
traffic to the virtual function with a matching destination MAC address. In a
sense, SR-IOV provides a capability for queue division.

Packet classification filtering is a hardware capability available on most
network adapters. Filters can be configured so that the hardware directs
specific flows to a given receive queue. Different NICs may have different
filter types to direct flows to a Virtual Function or to a queue that belongs
to it.

In this way the Linux networking stack can receive specific traffic through
the kernel driver while a DPDK application can receive specific traffic
bypassing the Linux kernel by using drivers like VFIO or the DPDK ``igb_uio``
module.

.. _figure_flow_bifurcation_overview:

.. figure:: img/flow_bifurcation_overview.*

   Flow Bifurcation Overview


Using Flow Bifurcation on Mellanox ConnectX
-------------------------------------------

The Mellanox devices are :ref:`natively bifurcated <bifurcated_driver>`,
so there is no need to split into SR-IOV PF/VF
in order to get the flow bifurcation mechanism.
The full device is already shared with the kernel driver.

The DPDK application can set up some flow steering rules,
and let the rest go to the kernel stack.
In order to define the filters strictly with flow rules,
the :ref:`flow_isolated_mode` can be configured.

There are no specific instructions to follow.
The recommended reading is the :doc:`../prog_guide/rte_flow` guide.
Below is an example of testpmd commands
for receiving VXLAN traffic with VNI 42 in 4 queues of the DPDK port 0,
while all other packets go to the kernel:

.. code-block:: console

   testpmd> flow isolate 0 true
   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / vxlan vni is 42 / end \
            actions rss queues 0 1 2 3 end / end


Using Flow Bifurcation on IXGBE in Linux
----------------------------------------

On Intel 82599 10 Gigabit Ethernet Controller series NICs, Flow Bifurcation can
be achieved by SR-IOV and Intel Flow Director technologies. Traffic can be
directed to queues by the Flow Director capability, typically by matching
the 5-tuple of UDP/TCP packets.

The typical procedure to achieve this is as follows:

#. Boot the system without iommu, or with ``iommu=pt``.

#. Create Virtual Functions:

   .. code-block:: console

      echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

#. Enable and set flow filters:

   .. code-block:: console

      ethtool -K eth1 ntuple on
      ethtool -N eth1 flow-type udp4 src-ip 192.0.2.2 dst-ip 198.51.100.2 \
              action $queue_index_in_VF0
      ethtool -N eth1 flow-type udp4 src-ip 198.51.100.2 dst-ip 192.0.2.2 \
              action $queue_index_in_VF1

   Where:

   * ``$queue_index_in_VFn``: Bits 39:32 of the variable define the VF id + 1;
     the lower 32 bits indicate the queue index within the VF. Thus:

     * ``$queue_index_in_VF0`` = ``((0x1 & 0xFF) << 32) + [queue index]``.

     * ``$queue_index_in_VF1`` = ``((0x2 & 0xFF) << 32) + [queue index]``.

   .. _figure_ixgbe_bifu_queue_idx:

   .. figure:: img/ixgbe_bifu_queue_idx.*

#. Compile the DPDK application and insert ``igb_uio`` or probe the ``vfio-pci`` kernel modules as normal.

#. Bind the virtual functions:

   .. code-block:: console

      modprobe vfio-pci
      dpdk-devbind.py -b vfio-pci 01:10.0
      dpdk-devbind.py -b vfio-pci 01:10.1

#. Run a DPDK application on the VFs:

   .. code-block:: console

      testpmd -l 0-7 -n 4 -w 01:10.0 -w 01:10.1 -- -i --forward-mode=mac

In this example, traffic matching the filter rules goes to the VFs. All other
traffic goes through the default queue, or is spread across the queues, of the
PF. That is to say, UDP packets with the specified source and destination IP
addresses go to the DPDK application. All other traffic, with different hosts
or different protocols, goes through the Linux networking stack.

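The ``action`` encoding described in the filter step above can be sketched in
a few lines of Python. This is only an illustration of the bit layout stated
in this guide: the function name ``ixgbe_action`` is hypothetical and is not
part of ethtool or DPDK, and the choice of queue index 0 is an assumption.

.. code-block:: python

   def ixgbe_action(vf_id, queue):
       """Return the ethtool 'action' value for `queue` of VF `vf_id`.

       Bits 39:32 carry VF id + 1 and the lower 32 bits carry the queue
       index within the VF, as described in this guide.
       """
       if not 0 <= vf_id <= 0xFE:
           raise ValueError("VF id + 1 must fit in 8 bits")
       if not 0 <= queue <= 0xFFFFFFFF:
           raise ValueError("queue index must fit in 32 bits")
       return ((vf_id + 1) << 32) | queue


   # Assumed queue index 0 in VF 0 and in VF 1.
   queue_index_in_vf0 = ixgbe_action(0, 0)
   queue_index_in_vf1 = ixgbe_action(1, 0)
   print(hex(queue_index_in_vf0), hex(queue_index_in_vf1))
   # prints: 0x100000000 0x200000000

The resulting integers are the values that would be substituted for
``$queue_index_in_VF0`` and ``$queue_index_in_VF1``.
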
.. note::

   * The above steps work on the Linux kernel v4.2.

   * The Flow Bifurcation is implemented in Linux kernel and ixgbe kernel driver using the following patches:

     * `ethtool: Add helper routines to pass vf to rx_flow_spec <https://patchwork.ozlabs.org/patch/476511/>`_

     * `ixgbe: Allow flow director to use entire queue space <https://patchwork.ozlabs.org/patch/476516/>`_

   * The Ethtool version used in this example is 3.18.


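The VF creation step above writes directly to ``sriov_numvfs``. A slightly
more defensive variant, sketched below, first reads the standard
``sriov_totalvfs`` sysfs attribute to check the requested count. The helper
name ``set_num_vfs`` is hypothetical, and the sketch assumes it runs with
sufficient privileges to write to sysfs.

.. code-block:: python

   from pathlib import Path


   def set_num_vfs(device_dir, num_vfs):
       """Write `num_vfs` to sriov_numvfs after checking sriov_totalvfs."""
       dev = Path(device_dir)
       total = int((dev / "sriov_totalvfs").read_text())
       if not 0 <= num_vfs <= total:
           raise ValueError(f"requested {num_vfs} VFs, device supports {total}")
       numvfs = dev / "sriov_numvfs"
       # A non-zero VF count generally has to be reset to 0 before a
       # different non-zero count can be written.
       if int(numvfs.read_text()) not in (0, num_vfs):
           numvfs.write_text("0")
       numvfs.write_text(str(num_vfs))
       return num_vfs


   # Example call (device address taken from the steps above):
   # set_num_vfs("/sys/bus/pci/devices/0000:01:00.0", 2)

The example call is left commented out because it changes the device
configuration.
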
Using Flow Bifurcation on I40E in Linux
---------------------------------------

On Intel X710/XL710 series Ethernet Controllers, Flow Bifurcation can be
achieved by SR-IOV, Cloud Filter and L3 VEB switch. The traffic can be
directed to queues by the matching rules of the Cloud Filter and the L3 VEB
switch.

* L3 VEB filters work for non-tunneled packets. They can direct a packet to a
  queue in a VF based on the destination IP address alone.

* Cloud filters work for the following types of tunneled packets.

  * Inner mac.

  * Inner mac + VNI.

  * Outer mac + Inner mac + VNI.

  * Inner mac + Inner vlan + VNI.

  * Inner mac + Inner vlan.

The typical procedure to achieve this is as follows:

#. Boot the system without iommu, or with ``iommu=pt``.

#. Build and insert the ``i40e.ko`` module.

#. Create Virtual Functions:

   .. code-block:: console

      echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

#. Add UDP port offload to the NIC if using cloud filters:

   .. code-block:: console

      ip li add vxlan0 type vxlan id 42 group 239.1.1.1 local 10.16.43.214 dev <name>
      ifconfig vxlan0 up
      ip -d li show vxlan0

   .. note::

      Output such as ``add vxlan port 8472, index 0 success`` should be
      found in the system log.

#. Examples of enabling and setting flow filters:

   * L3 VEB filter, directing packets whose destination IP is 192.168.50.108
     to VF 0's queue 2.

     .. code-block:: console

        ethtool -N <dev_name> flow-type ip4 dst-ip 192.168.50.108 \
                user-def 0xffffffff00000000 action 2 loc 8

   * Inner mac, directing packets whose inner destination mac is
     00:00:00:00:09:00 to PF's queue 6.

     .. code-block:: console

        ethtool -N <dev_name> flow-type ether dst 00:00:00:00:00:00 \
                m ff:ff:ff:ff:ff:ff src 00:00:00:00:09:00 m 00:00:00:00:00:00 \
                user-def 0xffffffff00000003 action 6 loc 1

   * Inner mac + VNI, directing packets whose inner destination mac is
     00:00:00:00:09:00 and whose VNI is 8 to PF's queue 4.

     .. code-block:: console

        ethtool -N <dev_name> flow-type ether dst 00:00:00:00:00:00 \
                m ff:ff:ff:ff:ff:ff src 00:00:00:00:09:00 m 00:00:00:00:00:00 \
                user-def 0x800000003 action 4 loc 4

   * Outer mac + Inner mac + VNI, directing packets whose outer mac is
     68:05:ca:24:03:8b, inner destination mac is c2:1a:e1:53:bc:57, and VNI
     is 8 to PF's queue 2.

     .. code-block:: console

        ethtool -N <dev_name> flow-type ether dst 68:05:ca:24:03:8b \
                m 00:00:00:00:00:00 src c2:1a:e1:53:bc:57 m 00:00:00:00:00:00 \
                user-def 0x800000003 action 2 loc 2

   * Inner mac + Inner vlan + VNI, directing packets whose inner destination
     mac is 00:00:00:00:20:00, inner vlan is 10, and VNI is 8 to VF 0's
     queue 1.

     .. code-block:: console

        ethtool -N <dev_name> flow-type ether dst 00:00:00:00:01:00 \
                m ff:ff:ff:ff:ff:ff src 00:00:00:00:20:00 m 00:00:00:00:00:00 \
                vlan 10 user-def 0x800000000 action 1 loc 5

   * Inner mac + Inner vlan, directing packets whose inner destination mac is
     00:00:00:00:20:00 and inner vlan is 10 to VF 0's queue 1.

     .. code-block:: console

        ethtool -N <dev_name> flow-type ether dst 00:00:00:00:01:00 \
                m ff:ff:ff:ff:ff:ff src 00:00:00:00:20:00 m 00:00:00:00:00:00 \
                vlan 10 user-def 0xffffffff00000000 action 1 loc 5

   .. note::

      * If the upper 32 bits of 'user-def' are ``0xffffffff``, then the
        filter can be used for programming an L3 VEB filter; otherwise the
        upper 32 bits of 'user-def' can carry the tenant ID/VNI if
        specified/required.

      * Cloud filters can be defined with inner mac, outer mac, inner ip,
        inner vlan and VNI as part of the cloud tuple. It is always the
        destination (not source) mac/ip that these filters use. In all
        these examples the dst and src mac address fields are overloaded:
        dst == outer, src == inner.

      * The filter directs a packet matching the rule to the VF id
        specified in the lower 32 bits of user-def, and to the queue
        specified by 'action'.

      * If the VF id specified by the lower 32 bits of user-def is greater
        than or equal to ``max_vfs``, then the filter is for the PF queues.

#. Compile the DPDK application and insert ``igb_uio`` or probe the ``vfio-pci``
   kernel modules as normal.

#. Bind the virtual functions:

   .. code-block:: console

      modprobe vfio-pci
      dpdk-devbind.py -b vfio-pci 01:10.0
      dpdk-devbind.py -b vfio-pci 01:10.1

#. Run a DPDK application on the VFs:

   .. code-block:: console

      testpmd -l 0-7 -n 4 -w 01:10.0 -w 01:10.1 -- -i --forward-mode=mac

.. note::

   * The above steps work on the i40e Linux kernel driver v1.5.16.

   * The Ethtool version used in this example is 3.18. The mask ``ff`` means
     'not involved', while ``00`` or no mask means 'involved'.

   * For more details of the configuration, refer to the
     `cloud filter test plan <http://git.dpdk.org/tools/dts/tree/test_plans/cloud_filter_test_plan.rst>`_

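The ``user-def`` layout described in the notes of the filter step can be
illustrated with a small Python sketch. It only restates the bit layout given
in this guide (upper 32 bits: ``0xffffffff`` or the tenant ID/VNI; lower 32
bits: VF id). The names ``i40e_user_def`` and ``decode_user_def`` are
hypothetical and are not part of ethtool, the i40e driver or DPDK.

.. code-block:: python

   NO_VNI = 0xFFFFFFFF  # upper 32 bits used in the L3 VEB filter example


   def i40e_user_def(vf_id, vni=None):
       """Build a 64-bit 'user-def' value from a VF id and an optional VNI."""
       if not 0 <= vf_id <= 0xFFFFFFFF:
           raise ValueError("VF id must fit in 32 bits")
       if vni is not None and not 0 <= vni < (1 << 24):
           raise ValueError("a VXLAN VNI is a 24-bit value")
       upper = NO_VNI if vni is None else vni
       return (upper << 32) | vf_id


   def decode_user_def(user_def, max_vfs):
       """Split 'user-def' into its VNI (or None) and its target (VF or PF)."""
       upper = user_def >> 32
       vf_id = user_def & 0xFFFFFFFF
       target = "PF" if vf_id >= max_vfs else f"VF {vf_id}"
       return (None if upper == NO_VNI else upper), target


   print(hex(i40e_user_def(0)))             # 0xffffffff00000000
   print(hex(i40e_user_def(3, vni=8)))      # 0x800000003
   print(decode_user_def(0x800000003, 2))   # (8, 'PF')

With two VFs created as in the procedure above, a VF id of 3 is greater than
or equal to ``max_vfs``, which is why the examples using ``0x800000003``
target the PF queues.
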